rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0

Tracking Issue for garbage collection #12633

Open ehuss opened 1 year ago

ehuss commented 1 year ago

Summary

Original proposal: https://hackmd.io/@rust-cargo-team/SJT-p_rL2
Implementation: #12634
Documentation: https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#gc
Issues: https://github.com/rust-lang/cargo/labels/Z-gc

The -Zgc flag enables garbage collection for deleting old, unused files in cargo's cache.
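As a rough illustration of what "deleting old, unused files" could mean, here is a minimal sketch of age-based cleanup driven by last-use records. It is not cargo's implementation: the `LastUse` type, its `touch` and `collect` methods, the example paths, and the 90-day threshold are all assumptions made up for this example.

```rust
use std::collections::HashMap;
use std::time::{Duration, SystemTime};

/// Hypothetical in-memory stand-in for a record of when each cache
/// entry was last used. Not cargo's actual data structure.
#[derive(Default)]
struct LastUse {
    entries: HashMap<String, SystemTime>,
}

impl LastUse {
    /// Record that `path` was used at time `now`.
    fn touch(&mut self, path: &str, now: SystemTime) {
        self.entries.insert(path.to_string(), now);
    }

    /// Remove and return (sorted) every entry whose last use is more
    /// than `max_age` before `now`.
    fn collect(&mut self, now: SystemTime, max_age: Duration) -> Vec<String> {
        let mut stale: Vec<String> = self
            .entries
            .iter()
            .filter(|(_, used)| {
                now.duration_since(**used)
                    .map_or(false, |age| age > max_age)
            })
            .map(|(path, _)| path.clone())
            .collect();
        stale.sort();
        for path in &stale {
            self.entries.remove(path);
        }
        stale
    }
}

fn main() {
    let day = Duration::from_secs(24 * 60 * 60);
    let start = SystemTime::UNIX_EPOCH;
    let mut cache = LastUse::default();
    // Hypothetical entry names, for illustration only.
    cache.touch("old-crate-0.1.0", start);
    cache.touch("new-crate-1.0.0", start + 80 * day);
    // Pretend "now" is day 100 and anything unused for 90 days is stale.
    let removed = cache.collect(start + 100 * day, 90 * day);
    println!("would delete: {removed:?}");
}
```

The sketch keeps recording (`touch`) separate from deletion (`collect`), so the first can exist without the second.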

Unresolved Issues

Future Extensions

No response

About tracking issues

Tracking issues are used to record the overall progress of implementation. They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions. A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature. Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

epage commented 12 months ago

For when we get to target/:

From @bjorn3 at https://hachyderm.io/@bjorn3/111047792430714997

The incr comp cache is GCed on every compilation by only copying artifacts that were used for the current compilation session to the new incr comp cache dir and then removing the old incr comp cache dir entirely. In effect it is a maximally eager semi-space garbage collector. This is not suitable for cargo at all because alternating between cargo build -p foo and cargo build -p bar is guaranteed to rebuild things every time as each build would delete the artifacts of the other build.
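The alternating-build problem described above can be shown with a toy model. This is only a sketch: the `Cache` type, the `build` function, and the artifact names are hypothetical, and a set of names stands in for a real cache directory.

```rust
use std::collections::BTreeSet;

/// Hypothetical model of a cache directory: just the set of artifact
/// names it currently holds.
type Cache = BTreeSet<&'static str>;

/// Run one build against `old`. Only artifacts needed by this build are
/// carried into the new cache; everything else is dropped with the old
/// one. Returns the new cache and the artifacts that had to be rebuilt.
fn build(old: &Cache, needed: &[&'static str]) -> (Cache, Vec<&'static str>) {
    let mut new = Cache::new();
    let mut rebuilt = Vec::new();
    for &artifact in needed {
        if !old.contains(artifact) {
            rebuilt.push(artifact); // cache miss: must be rebuilt
        }
        new.insert(artifact);
    }
    (new, rebuilt)
}

fn main() {
    let foo = ["shared", "foo"];
    let bar = ["shared", "bar"];
    let mut cache = Cache::new();
    // Alternate between the two packages, as in the scenario above.
    for (name, needed) in [("foo", &foo), ("bar", &bar), ("foo", &foo), ("bar", &bar)] {
        let (next, rebuilt) = build(&cache, needed);
        println!("build -p {name}: rebuilt {rebuilt:?}");
        cache = next;
    }
}
```

In this model every build after the first still rebuilds its own package, because the previous build discarded it; only the artifact both builds share survives.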

epage commented 12 months ago

Quick scan of brew

One complaint that came up was: "'brew cleanup has not been run in 30 days, running now' ... and then proceeds to run an interminable process in the middle of you attempting to do something else." (Mastodon)

epage commented 12 months ago

From https://hachyderm.io/@nathanhammond@mastodon.social/111048319933010803

Also: global vs. local? Just keep all things local. Errant cache hits pulling from a non-scoped global cache are a disaster. Errant cache hits from the same local cache are a "whoops." For shared artifacts (e.g. a singular tar downloaded from the registry), global is fine, and should just grow unbounded. (Seriously: nobody cares.)

pnpm, bun, npm, yarn: package cache is intended to be shared global immutable, unbounded growth.

I'm assuming our global package cache is to help with CI, but we should probably explicitly document our priority use cases.

mcclure commented 11 months ago

Is -Zgc the final planned CLI interface?

To my mind clearing the global cache is a "verb" so it being something like "cargo clean-cache" would be discoverable but "-Zgc" (to what command?) would be less so.

In addition, it would be nice to have the option of fully deleting the global cache through the CLI in addition to just GCing it (you may say "but you just delete ~/.cargo", but if one is using Windows it is not intuitively obvious whether a cache would be stored in \Users\username, %APPDATA%, or one of a couple other locations)

epage commented 11 months ago

Unstable cargo features require two things

-Zgc is the flag for enabling support for this unstable feature but it isn't the interface for the unstable feature (in some cases, we do make them overlap but that isn't that common).

#12634 has an interface, but my plan is to avoid bikeshedding that when I review it, just so we get the infrastructure in. Then we can worry about both the infrastructure and the CLI as we work towards stabilization.

> In addition, it would be nice to have the option of fully deleting the global cache through the CLI in addition to just GCing it (you may say "but you just delete ~/.cargo", but if one is using Windows it is not intuitively obvious whether a cache would be stored in \Users\username, %APPDATA%, or one of a couple other locations)

#3289 is the more specific issue for that. Depending on the controls offered, this is usually supported directly or by saying things like "reduce cache size to 0B".
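To make the "reduce cache size to 0B" idea concrete, here is a minimal sketch of size-limited eviction where a limit of 0 empties the cache. The `Entry` type, the `shrink_to` function, and the least-recently-used ordering are assumptions for illustration, not a description of cargo's interface or behavior.

```rust
/// Hypothetical cache entry: a name, its size in bytes, and a last-use
/// tick (larger means more recent).
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    name: &'static str,
    size: u64,
    last_use: u64,
}

/// Evict least-recently-used entries until the total size is at most
/// `limit` bytes. Returns the names of evicted entries, oldest first.
fn shrink_to(entries: &mut Vec<Entry>, limit: u64) -> Vec<&'static str> {
    // Sort newest first so the oldest entry sits at the end and can be popped.
    entries.sort_by(|a, b| b.last_use.cmp(&a.last_use));
    let mut total: u64 = entries.iter().map(|e| e.size).sum();
    let mut evicted = Vec::new();
    while total > limit {
        match entries.pop() {
            Some(e) => {
                total -= e.size;
                evicted.push(e.name);
            }
            None => break,
        }
    }
    evicted
}

fn main() {
    let mut entries = vec![
        Entry { name: "a", size: 400, last_use: 1 },
        Entry { name: "b", size: 300, last_use: 3 },
        Entry { name: "c", size: 200, last_use: 2 },
    ];
    // A partial limit evicts only the oldest entries.
    println!("limit 500: evicted {:?}", shrink_to(&mut entries, 500));
    // A limit of 0 bytes behaves like "delete everything".
    println!("limit 0:   evicted {:?}", shrink_to(&mut entries, 0));
}
```

With a single size control like this, a separate "delete the whole cache" command becomes a special case of the same operation.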

mcclure commented 11 months ago

Thank you for explaining, I apologize for my confusion.

ehuss commented 6 months ago

In https://rust-lang.zulipchat.com/#narrow/stream/246057-t-cargo/topic/Stabilizing.20global.20cache.20tracking/near/422500781 I am proposing to stabilize just the recording of the cache data as a first step. This doesn't enable automatic or manual gc.

airstrike commented 4 months ago

I would wager most rust users associate the words "garbage collection" with memory rather than cached files that have gone stale. It's unfortunate that the term is being overloaded here

ssokolow commented 4 months ago

> I would wager most rust users associate the words "garbage collection" with memory rather than cached files that have gone stale. It's unfortunate that the term is being overloaded here

It depends. Some of us are familiar with the git gc command.

epage commented 4 months ago

The feature is in development and how we present it to the user is not yet decided. In #13060, we are exploring how to present it in the CLI, including looking at prior art from other tools.

cessen commented 4 months ago

Thanks for redirecting me to the naming discussion here @epage

I think(?) one of the distinguishing characteristics of garbage collection implementations (whether they be for memory, git, nix, etc.) is that they remove things that are "unreachable" in some sense, and thus can be confidently disposed of as not used. That particular characteristic is specifically not true of this feature, as discussed in #13176.

Having said that, in practice I'm skeptical that calling this feature "garbage collection" is actually going to confuse people. Nevertheless, it does seem like one of those "might as well be more accurate" kind of situations, so I'd suggest calling it "cache cleaning" or similar.

ehuss commented 1 month ago

I have proposed to stabilize the automatic side of this feature in https://github.com/rust-lang/cargo/pull/14287.