repo:purge_cache doesn't purge all files

edmorley commented 7 years ago

The repo:purge_cache command doesn't actually remove all files from the cache, but tries to preserve anything under vendor/heroku/: https://github.com/heroku/heroku-repo/blob/3e7167d79ca06f8be9cc4b084f7ca03d5adaee46/commands/purge_cache.js#L21-L36

This was added in: https://github.com/heroku/heroku-repo/commit/16b17f3b7e5952d8a2393b7601d108db8ba9ac63

I believe the intention was to preserve the implicit previous Ruby version, for apps that don't define a specific required version.

However this seems flawed/fragile for a few reasons: 1) There isn't a command to unconditionally delete all files for users who really want to. 2) It's likely confusing to the user if there's still hidden state preserved after clearing the cache (eg think of the use case of "why doesn't my new stage app or review apps work, my prod app still works after clearing the cache") 3) It only helps Ruby apps, and even then only so long as the path for the Ruby metadata doesn't change from vendor/heroku/. (Each buildpack does it's own thing, for example Python uses the .heroku/python directory instead) 4) Even if the language runtime version is preserved, it doesn't help the user if they haven't pinned eg package deps properly etc.

As such I would propose either: 1) Scrapping the partial cache clearing version of the command entirely, and just clearing the whole cache. 2) Offering two commands - one that clears the whole cache, and one that tries to preserve certain files. The latter command should be improved to also save the Python/other buildpacks' metadata too.

tt commented 7 years ago

I strongly agree with this.

The partial purge is odd. It's there so customers who don't specify a Ruby version will get the same after purging their cache but we should really warn them that this is a consequence of purging the cache and not specifying a Ruby version instead of trying to magically save them.

Using the cache as a store also breaks apart when used with review apps and pipelines where each app has a different version of the cache.

dmathieu commented 7 years ago

Definitely 💯

edmorley commented 7 years ago

we should really warn them that this is a consequence of purging the cache and not specifying a Ruby version instead of trying to magically save them

I've filed heroku/heroku-buildpack-ruby#611 for this.

Not that it need block this issue as filed, but I guess all buildpacks should really be doing similar too (I've filed heroku/heroku-buildpack-python#440 for the Python buildpack, but others may need a warning).

schneems commented 7 years ago

To get rid of this codon needs to provide us a durable store.

dmathieu commented 7 years ago

What do you mean "a durable store"? You want a second, separate and unpurgeable cache? Not being able to fully clear a cache really looks like a hack, and should be removed. Introducing a second cache system to mitigate that hack would really be a bad solution.

schneems commented 7 years ago

What do you mean "a durable store"

A cache is not persistent, it can be lossy. A cached value implies a value that can be re-generated, it's not the canonical representation of the data.

By "durable" I mean something that endures. It is is not lossy. A safe place to put canonical data.

In this case, the only representation of this data that we have is in the build cache. I didn't like it when we implemented it, I don't like it now, but it's what we've got.

Here's the history:

About 4-5 years ago Ruby came to build and said "we need a durable store". They talked about it and said "no" because it was too much engineering overhead. But it was decided to "use the cache anyway and treat it like a durable store", because it effectively is.

So we did. It was one of those "temporary hacks" that was supposed to get replaced when build "had time" in the future to build something if it "was needed".

Using the cache has worked well. To my knowledge, in the last 5 years there's been no cases of losing data unexpectedly, or having data that's not "cleared" in the cache cause any kind of deploy problems. Basically there's not ever been a reason to move it other than it seems bad to keep canonical data in a cache.

If we get rid of this functionality it will break a lot about how the buildpack works. We heavily rely on this tactic not just for storing ruby versions, but for other things as well. For example we need a way to determine if a deploy is the first deploy or not, we use the cache for this. We need to store prior versions of bundler used. We need to store prior stack name so we can migrate data on stack change.

We rely on this functionality. If it goes away, we need to replace it with something compatible.

dmathieu commented 7 years ago

@jbyrum ^

heroku / heroku-repo

repo:purge_cache doesn't purge all files #70