rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0

Support `cargo clean --workspace` #14720

Open xxchan opened 2 days ago

xxchan commented 2 days ago

Problem

We have a large workspace (~500k LOC). The target dir bloats quickly, but cargo clean forces us to recompile all dependencies (we have >1000 dependencies, and some of them take minutes to compile).

Recompiling dependencies is unnecessary, and it wastes time and energy.

We only want to clean the build artifacts of "our own code". Those artifacts also happen to take up most of the space:

1.1G   │ ┌── risingwave
1.4G   │ ├── build
343M   │ │   ┌── s-h14437pi5v-13pt08j-dwag002k0abj4ylcfstlvyjeo
343M   │ │ ┌─┴ risingwave_stream-2qi109ipc00fa
361M   │ │ │ ┌── s-h1442qartc-12uu8de-1oe9soksc9wjl6inun2bsr3ts
361M   │ │ ├─┴ risingwave_expr_impl-2y2nvlr6oosas
384M   │ │ │ ┌── s-h1443fexwf-1wq0ljy-0ixglmmttspe0nfoa0e2nexap
384M   │ │ ├─┴ risingwave_frontend-00fb3jeuoq12s
3.4G   │ ├─┴ incremental
320M   │ │ ┌── librisingwave_connector-59acd07e4d1dce73.rlib
381M   │ │ ├── librisingwave_storage-26c2772e948fe4df.rlib
430M   │ │ ├── librisingwave_meta-e1e3edc30da6a88b.rlib
601M   │ │ ├── librisingwave_frontend-5241cdc2e280b9c9.rlib
1.1G   │ │ ├── librisingwave_stream-0896a913c629f6c3.rlib
1.1G   │ │ ├── risingwave-393af56edf3487b4
 47G   │ ├─┴ deps
 53G   ├─┴ debug
 70G ┌─┴ target

Current workarounds

Crates in my workspace have a common prefix, like risingwave-xxx. So I could just find & delete:

> find target -name '*risingwave*' -type d -exec rm -rf {} + -o -name '*risingwave*' -delete
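Since the command above deletes immediately, it can help to preview the matches first. Below is a minimal dry-run sketch; `preview_clean` is a hypothetical helper name, and the prefix argument is whatever your crates share (risingwave in my case).

```shell
#!/bin/sh
# Hypothetical helper: list what the find-and-delete workaround would
# remove, without deleting anything.
#   $1 = target directory, $2 = common crate prefix
preview_clean() {
  target_dir=$1
  prefix=$2
  # -prune stops find from descending into a matching directory, so a
  # matching directory is listed once instead of once per nested file.
  find "$target_dir" -name "*${prefix}*" -prune -print
}
```

For example, `preview_clean target risingwave | xargs du -sh` gives a rough idea of how much space each match holds before anything is removed.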

A more cargo native solution (mentioned by @weihanglo):

cargo metadata | \
  jq -r '.workspace_members[]' | \
  xargs -n1 cargo clean -p

But this turns out to be much slower than the brute-force way above. @weihanglo says we should make -p in cargo clean aware of glob syntax as well, which might make it faster. But I think something like --workspace-members is more straightforward, and I'm not sure whether there are other cases where we'd want a pattern-based clean.

Proposed Solution

Add a --workspace-members flag to cargo clean, which is essentially:

cargo metadata | \
  jq -r '.workspace_members[]' | \
  xargs -n1 cargo clean -p

(but more efficient)
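Until such a flag exists, one way to approximate it is to pass every workspace member to a single cargo clean call, since -p may be given multiple times. This is only a sketch: the function names are hypothetical, it assumes cargo and jq are on PATH, and it relies on `cargo metadata --no-deps` reporting only workspace members in `.packages`. I have not measured whether it closes the gap with the find approach.

```shell
#!/bin/sh
# Sketch: clean all workspace members with one `cargo clean` invocation.
# Function names are hypothetical; assumes `cargo` and `jq` are installed.

# With --no-deps, `.packages` contains only the workspace members.
workspace_member_names() {
  cargo metadata --no-deps --format-version 1 | jq -r '.packages[].name'
}

# Read package names from stdin (one per line) and print a single line
# of the form "-p name1 -p name2 ...". Empty lines are skipped.
clean_flags() {
  flags=""
  while IFS= read -r name; do
    [ -n "$name" ] || continue
    flags="$flags -p $name"
  done
  printf '%s\n' "${flags# }"
}

clean_workspace_members() {
  flags=$(workspace_member_names | clean_flags)
  # Refuse to run with no packages: a bare `cargo clean` wipes everything.
  [ -n "$flags" ] || { echo "no workspace members found" >&2; return 1; }
  # Word splitting is intentional here: crate names contain no whitespace.
  # shellcheck disable=SC2086
  cargo clean $flags
}
```

Selecting by bare name can be ambiguous if a dependency shares a name with a workspace member; in that case a full package ID spec would be needed instead.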

Notes

Originally posted at https://x.com/yayale_umi/status/1848921059011268944

epage commented 2 days ago

As for what to name the flag, we generally call it --workspace. Most commands that have that flag have a different default package selection policy. However, we've already broken from that with cargo update, and cargo clean would closely match cargo update's behavior, so maybe the precedent is strong enough that we don't need to worry about it.

With the cargo update --workspace precedent, personally I don't have too much of a concern with adding this flag. @weihanglo thoughts?

However, I'm not too clear on the use case. Why do you need to clear things out? I'm assuming the next build will cause these files to just come back, meaning there are no significant savings, yet the next build will be slower (because the incremental build cache is gone).

Also, I wonder if there is a way for the incremental cache to be smaller. I didn't see any existing open issues about its size.

weihanglo commented 2 days ago

> I don't have too much of a concern with adding this flag. @weihanglo thoughts?

I don't either, though it might be worth considering the interaction with per-user caches (https://github.com/rust-lang/cargo/issues/5931) at the same time.

> But this turns out to be much slower than the brute force way above

Just note that cargo clean has room to improve its performance: https://github.com/rust-lang/cargo/issues/10552.

xxchan commented 2 days ago

> However, I'm not too clear on the use case. Why do you need to clear things out?

I remember the target dir growing to something like 100 or 200G. After a clean, the usage goes down a lot. I can come back later to show what takes up the space when it bloats again.