Leafwing-Studios / cargo-cache

Cache your Cargo build files
Apache License 2.0

Remove unused dependencies before saving cache #24

Closed benfrankel closed 2 weeks ago

benfrankel commented 1 month ago

Every time cargo-cache falls back on an older cache, it may pull in cargo cache files that are no longer needed, e.g. for an older version of a dependency after a cargo update. The stale cargo cache files get mixed in with the fresh cargo cache files produced by cargo during the workflow job, which progressively balloons the cache size each time this happens (until the next Rust update comes along and all the caches automatically reset because they can't fall back).
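To make the growth concrete, here is a toy model of the behaviour described above. It is only a sketch: the crate names and versions are made up, and each cache entry is counted as one unit rather than by its real size on disk.

```python
def saved_cache(restored, used, prune):
    """Return the set of entries written back to the cache after one job."""
    # Without pruning, stale entries from the restored cache are saved again.
    return set(used) if prune else set(restored) | set(used)


def simulate(runs, prune):
    """Track how many entries the cache holds after each job in `runs`."""
    cache = set()
    sizes = []
    for used in runs:
        cache = saved_cache(cache, used, prune)
        sizes.append(len(cache))
    return sizes


# Hypothetical dependency sets for three jobs; "dep-a" is updated before each one.
runs = [
    {("dep-a", "1.0.0"), ("dep-b", "0.3.0")},
    {("dep-a", "1.0.1"), ("dep-b", "0.3.0")},
    {("dep-a", "1.0.2"), ("dep-b", "0.3.0")},
]
```

In this model `simulate(runs, prune=False)` grows by one entry per update, while `simulate(runs, prune=True)` stays flat.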

Increasing cache size means increasing download and upload time, and quickly running out of the 10GB cache storage space provided by GitHub. A user would have to recognize that this issue is occurring and know to manually delete all the caches periodically to work around this.

This was mitigated for a particularly bad case in https://github.com/Leafwing-Studios/cargo-cache/issues/22, but the root cause has not yet been fixed.

The ideal solution would be to fall back on an older cache, run the workflow job, then delete any cargo cache files that went unused during the workflow job before saving the cache.
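A minimal sketch of that idea, assuming file timestamps are a good enough signal for "used during the job": record a timestamp before the build, then delete anything that was neither read nor written since. The function name `sweep_unused` is hypothetical, and access times are not updated reliably on every filesystem or mount option, so that assumption would need checking on the CI runners.

```python
import os


def sweep_unused(root, stamp):
    """Delete files under `root` not accessed or modified since `stamp`.

    `stamp` is a time in seconds since the epoch, taken before the build.
    Returns the list of removed file paths.
    """
    removed = []
    # Walk bottom-up so emptied directories can be removed after their files.
    for dirpath, _dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)
            # Treat a file as used if it was read or written after the stamp.
            if max(info.st_atime, info.st_mtime) < stamp:
                os.remove(path)
                removed.append(path)
        # Drop directories that are now empty, but keep the root itself.
        if dirpath != root and not os.listdir(dirpath):
            os.rmdir(dirpath)
    return removed
```

In a job this would mean taking `stamp = time.time()` right after the cache is restored, running the build, and calling `sweep_unused("target", stamp)` just before the cache is saved.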

There's an unstable cargo feature that may help with this: https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#manual-garbage-collection-with-cargo-clean.

TimJentzsch commented 1 month ago

I'm not sure if this can easily be integrated in this action. Ideally, this would need to run in a post step, at the end of the workflow.

But the composite action format we currently use doesn't allow defining custom post steps, AFAIK.

To support this we would probably need to implement a fully custom action, which would give us more flexibility, but also increase complexity (and make it harder to audit the action).

On the bright side: This feature doesn't require direct support from our action. In theory you can manually configure a step at the end of your job to run this command, and it will reduce the cache size.

Perhaps we can prototype how this would influence the cache size in bevy_quickstart or some other repository :)

BD103 commented 1 month ago

Hey, look! I can contribute to this conversation!

Today I've been working on cargo-sweep, an action that integrates with the cargo-sweep tool. I've been using it to automatically delete files that haven't been accessed during the workflow. It does exactly what @TimJentzsch mentioned, and is pretty effective in my testing.

TimJentzsch commented 1 month ago

This is interesting. Since it runs in a post step, we could integrate it directly into cargo-cache, probably with an option to disable it.

It still makes the action harder to audit (an additional action and a cargo binary), but it might be worth it to make the caching more effective.

BD103 commented 1 month ago

Yeah, disabled by default is what I was thinking. I'm having some trouble right now, but I can probably get it finished by this weekend and open a PR. :)

BD103 commented 4 weeks ago

I released cargo-sweep as v1.0.0 and am willing to maintain it. Would you all be interested in integrating it with this action?

TimJentzsch commented 4 weeks ago

I think it makes sense. I'd do it like this:

BD103 commented 3 weeks ago
  • Installs to separate directory (not target)

I'm good with all those points, but where do you suggest it be installed instead?

TimJentzsch commented 3 weeks ago

Where exactly doesn't really matter, but it would enable use of a separate cache :)

Of course, I'm not sure how difficult it would be to generate the cache key for it to work correctly...
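One possible shape for such a key, as a rough sketch: build it only from the things that determine the installed binary, so that changes to the project's own dependencies never invalidate it. The function name `tool_cache_key`, the choice of inputs, and the example values are all assumptions for illustration, not something either action does today.

```python
def tool_cache_key(os_name, arch, tool, version):
    """Build a cache key for a tool binary that is cached separately.

    The key deliberately ignores the project's lockfile: the binary only
    needs to be rebuilt when the platform or the tool version changes.
    """
    parts = [os_name, arch, tool, version]
    # Reject empty parts so that a missing input cannot silently produce
    # a key that collides with another configuration.
    if not all(parts):
        raise ValueError("every key component must be non-empty")
    return "-".join(parts)
```

For example, `tool_cache_key("Linux", "X64", "cargo-sweep", "0.7.0")` gives `Linux-X64-cargo-sweep-0.7.0`, where the version number is just a placeholder.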

TimJentzsch commented 3 weeks ago

Maybe it can be an optional parameter to your action. Also, isn't the default installation path somewhere in ~/.cargo? Maybe we can leverage that.

BD103 commented 2 weeks ago

Closed by #28.