Open edmorley opened 3 years ago
Related support ticket 958280
I looked into this heavily for https://github.com/heroku/heroku-buildpack-ruby/issues/1118
The cache is only used at `bundle install` time, and it appears to be used only when dependencies are not satisfied. For example, if you deploy to Heroku, then run `heroku run bash` followed by `bundle install`, it won't download the cache for "reasons". I'm assuming the reason is that its dependencies are already satisfied. However, I'm not totally sure of the behavior.
On first `bundle install` the cache is downloaded and written here https://github.com/rubygems/rubygems/blob/be08d8307eda3b61f0ec0460fe7fbcf647b526e6/bundler/lib/bundler/compact_index_client/updater.rb#L64 where `local_path` is something in `~/.bundle/cache/compact_index`. The path name includes an etag of the compact index. Before downloading a new index, Bundler will check to see if a prior index's etag is satisfactory.
Based on this it seems that making these files available at runtime adds nothing (because people don't `bundle install` at runtime), so they could be stripped out before launch.
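As a rough illustration of stripping the cache before launch, here is a minimal Ruby sketch. The helper name `strip_compact_index_cache` is hypothetical and is not part of the Ruby buildpack; it assumes only the `~/.bundle/cache/compact_index` location mentioned above.

```ruby
require "fileutils"
require "pathname"

# Hypothetical helper (not part of the Ruby buildpack): delete Bundler's
# compact index cache under `home_dir` and return the number of bytes freed.
def strip_compact_index_cache(home_dir)
  cache_dir = Pathname.new(home_dir).join(".bundle", "cache", "compact_index")
  return 0 unless cache_dir.directory?

  # Sum the sizes of regular files before removing the whole tree.
  freed = cache_dir.find.select(&:file?).sum(&:size)
  FileUtils.rm_rf(cache_dir.to_s)
  freed
end
```

Returning the freed byte count makes it easy to log how much slug size the step saves for a given app.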
The other question is: Is it helpful to preserve these between deploys? It depends on how frequently the etag is invalidated. @hone knows more about the whole compact index so he might have some insight. My very unscientific attempt to answer this question was to deploy an app to heroku today and see if it has the same etag or not:
/app/.bundle/cache/compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/versions
/app/.bundle/cache/compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/versions
So it looks like there may be some benefit to keeping them around. I think it's worth benchmarking the download of the index. If it's already on us-east and coming from S3 then there's no speed benefit from putting it in the cache. For CNB where there's the local install case to think about, it's likely a good idea to cache it (even if it's fast).
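The comparison between deploys can be scripted rather than eyeballed. The sketch below only compares the per-source directory name (the segment the comment above treats as carrying the etag); both helper names are hypothetical.

```ruby
require "pathname"

# Hypothetical helper: return the per-source directory name that contains a
# cached `versions` file, e.g. "rubygems.org.443.<hash>".
def compact_index_dir_name(versions_path)
  Pathname.new(versions_path).dirname.basename.to_s
end

# True when both deploys wrote the index under the same directory name.
def same_compact_index_dir?(previous_path, current_path)
  compact_index_dir_name(previous_path) == compact_index_dir_name(current_path)
end

previous = "/app/.bundle/cache/compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/versions"
current  = "/app/.bundle/cache/compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/versions"
puts same_compact_index_dir?(previous, current)
```

This says nothing about whether the cached contents are still fresh; it only checks that a later deploy would look in the same directory.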
This does make me vaguely wish there was some kind of cross-app cache or mechanism since it seems wasteful to duplicate this across N caches (where N is number of apps on the platform).
Etag still valid:
remote: `/app/.bundle/cache/compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/versions`.
Etag still valid
remote: `/app/.bundle/cache/compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/versions`.
I'm unsure whether this also happens in the CNB. Need to investigate whether this is still an issue.
I'm currently auditing official/popular buildpacks for compatibility with potentially changing the build directory to `/app` in the future.
One of the potential sources of problems for such a move is that files written to `/app` (or to `$HOME`, which is `/app` during the build) will now be included in the slug, when previously they were not. As such, I'm checking what files are left behind by buildpacks in `/app`, using this buildpack which lists the contents of `/app` at build time: https://github.com/edmorley/heroku-buildpack-list-app-dir
Testing the Ruby getting started guide with the Ruby buildpack + the above buildpack, I see that the bundler cache (`~/.bundle/cache/compact_index`) is being written to `/app`. Once the build directory is `/app`, this would cause the slug size to increase, potentially pushing apps closer to the limit. For the getting started guide this cache is only 18MB, but it doesn't have as many dependencies as some typical Rails apps.
It seems there are a few options:
`$CACHE_DIR` instead
`/tmp` instead of `$HOME`, or else (b) delete it from `$HOME` at the end of the build
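To make the `$CACHE_DIR` option concrete, here is a minimal sketch of moving the compact index out of `$HOME` at the end of a build and restoring it at the start of the next one. All helper names and the `bundler_compact_index` directory name are hypothetical; this is not how the Ruby buildpack currently behaves.

```ruby
require "fileutils"
require "pathname"

# Location of Bundler's compact index cache relative to a home directory.
def compact_index_path(home_dir)
  Pathname.new(home_dir).join(".bundle", "cache", "compact_index")
end

# Hypothetical storage location inside the build cache directory.
def stashed_index_path(cache_dir)
  Pathname.new(cache_dir).join("bundler_compact_index")
end

# End of build: move the index out of the home directory so it is not
# included in the slug. Returns false when there is nothing to move.
def stash_compact_index(home_dir, cache_dir)
  src = compact_index_path(home_dir)
  return false unless src.directory?

  dest = stashed_index_path(cache_dir)
  FileUtils.rm_rf(dest.to_s)            # drop any stale copy first
  FileUtils.mkdir_p(dest.dirname.to_s)
  FileUtils.mv(src.to_s, dest.to_s)
  true
end

# Start of the next build: copy the stashed index back where Bundler looks.
def restore_compact_index(home_dir, cache_dir)
  src = stashed_index_path(cache_dir)
  return false unless src.directory?

  dest = compact_index_path(home_dir)
  FileUtils.rm_rf(dest.to_s)
  FileUtils.mkdir_p(dest.dirname.to_s)
  FileUtils.cp_r(src.to_s, dest.to_s)
  true
end
```

Option (b), deleting the directory at the end of the build, is the same flow without the restore step, at the cost of re-downloading the index on every deploy.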