rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0
12.5k stars 2.37k forks source link

Cargo crawls subdirectories even when given `--manifest-path` #14240

Open bradzacher opened 1 month ago

bradzacher commented 1 month ago

Problem

At work we have a rather large monorepo - of which rust (currently) makes up a very small part. Some quick numbers using cloc --vcs git -- ~305k files (~45m LOC) of which ~400 files (~80k LOC) are rust.

We've noticed that the time for cargo check --workspace --manifest-path ./Cargo.toml is quite slow - disproportionately slow compared to the sum of running check on each.

Here are the examples from running the commands locally

$ time cargo check -p crate1
$ time cargo check -p crate2
...
Sum: 32.377 total

$ time cargo check --workspace --manifest-path ./Cargo.toml
4.26s user 43.64s system 59% cpu 1:21.14 total

$ time cargo check --manifest-path ./Cargo.toml -p crate1 -p crate2 ...
3.89s user 43.02s system 58% cpu 1:20.16 total

Doing some investigation - in the last two cases it looks like cargo is scanning the entire repo. When inspecting what files that cargo is reading I can see it accessing files and folders not included in those those defined within the workspace's members array that contain no rust code.

Running sudo fs_usage | grep cargo I can see it accessing:

These two cases were actually what first triggered us to investigate we had reports from several engineers about slowness - we noticed two things: 1) the more bazel builds they do, the slower cargo got - and it looks like it's because more builds = more files in the bazel output folders = more crawl time. 2) the more npm packages that were installed, the slower cargo got - similarly more installs = more node_modules = more crawl time.

This is not great for us because it means that certain operations can take 3 times longer than they should - and it impacts IDE startup times too as rust-analyzer runs cargo check --workspace --message-format=json-diagnostic-rendered-ansi --manifest-path /absolute/path/to/Cargo.toml as part of its startup.

Steps

No response

Possible Solution(s)

I'd love a way to tell cargo "please don't scan anything - all you should care about is located right here in the members list" so that we can avoid scanning irrelevant folders.

Alternately - making cargo respect .gitignore during its scan would also be good as it would mean that folders like node_modules and the bazel folders would be skipped entirely.

Notes

All of our dependencies are sourced from crates.io We do not use any git links or patches We do have workspace interdependencies

Please let me know if there are any log files, etc that you would like to help investigate - happy to provide whatever info I can to you.

Version

cargo 1.78.0 (54d8815d0 2024-03-26)
release: 1.78.0
commit-hash: 54d8815d04fa3816edc207bbc4dd36bf18014dbc
commit-date: 2024-03-26
host: aarch64-apple-darwin
libgit2: 1.7.2 (sys:0.18.2 vendored)
libcurl: 8.6.0 (sys:0.4.72+curl-8.6.0 system ssl:(SecureTransport) LibreSSL/3.3.6)
ssl: OpenSSL 1.1.1w  11 Sep 2023
os: Mac OS 14.5.0 [64-bit]
epage commented 1 month ago

Huh, not too sure what this would be. cargo check--manifest-path Cargo.toml vs cargo check shouldn't be too different. When looking for Cargo.toml, we only walk up the directories, not back down again.

Could you give us an idea of

Even better if you could also run cargo check && CARGO_LOG_PROFILE=true cargo check and open the traces in chrome://tracing and let us know what the hot section is (docs). If possible, this would be best done with 1.79 as we tweak the traces for better information with each release.

bradzacher commented 1 month ago

Your general directory hierarchy

it's all over the place, really - the repo is a mess that's grown organically over time. here's a rough idea of the mess

- /
  - Cargo.toml (workspace)
  - (bazel folders symlinked in the repo root)
    - bazel_out/
    - bazel_bin/
    - bazel_canva/
  - web/ (the frontend codebase
    - node_modules/
    - crate1/ [[rust]] (some WASM rust)
      - Cargo.toml
  - tools/ (a lot of internal tooling)
    - crate2/ [[rust]]
      - Cargo.toml
    - crate3/ [[rust]]
      - Cargo.toml
    - ...
    - js1/
      - node_modules/
    - js2/
      - node_modules/
    - ...
  - backend_service1/ (a backend microservice)
  - backend_service_rust1/ [[rust]] (a backend microservice, written in rust)
    - Cargo.toml

What your workspace.members looks like

it's just a list of folders (i.e. example based on the above)

members = [
  "web/crate1",
  "tools/crate2",
  "tools/crate3",
  "backend_service_rust1",
  ...
  "path/to/crate12", # there are only 12 crates in our monorepo thus far
]
bradzacher commented 1 month ago

Trace (with 1.78): trace-1720750556589970.json

image

I'm not sure how to interpret the trace so here it is in its raw form. The bulk of the time is spent in "Name prepare_target, Category cargo::core::compiler::fingerprint"

bradzacher commented 1 month ago

Trace (with 1.79): trace-1720751045123202.json

image

Looks to be much the same

Xuanwo commented 1 month ago

How about making cargo respect .gitignore, or alternatively, introducing .cargoignore when managing workspaces?

epage commented 1 month ago

@bradzacher Thanks for that context! It looks like only one specific prepare_target call is taking up all of that time. If you re-run with cargo check && CARGO_LOG_PROFILE=true CARGO_LOG_PROFILE_CAPTURE_ARGS=true cargo check then the traces will include the Package ID of the package that is causing this. From that we can try to understand what is different about this package that it is hitting this case in cargo. In particular, if there are is a build script and what directives it emits, symlinks inside its directory, an odd location on disk, etc.

@Xuanwo I don't even know what file walk this is to evaluate an answer like that. One theory is a build.rs that gives a bad path for for re-run-if-changed or we have some bug with how we are walking the directory structure for one of the rebuild-detection checks we do.

weihanglo commented 1 month ago

How about making cargo respect .gitignore, or alternatively, introducing .cargoignore when managing workspaces?

@Xuanwo it already does. The presence of exclude and include fields would affect whether .gitignore is respected.

weihanglo commented 1 month ago

Some more questions:

bradzacher commented 1 month ago

Is there any Cargo.toml containing odd package.include that refer to relative parent directory like ..?

No. There is no package.include defined in any Cargo.toml

Is there any symlink under a member package pointing to directories outside the package root, for example to the workspace root?

Two crates contain a node_modules which have a symlinks (as pnpm does symlink installs) The symlink is from the node_modules back to the root node_modules. I.e. from tools/crate2/node_modules/typescript to ../../../node_modules/.pnpm/typescript@5.4.5/node_modules/typescript (note ../../.. resolves to the repo root)

Other than that - no links at all

Any builds script rerun-if-changed has relative paths?

There are two: one outputs cargo:rerun-if-changed=tools/crate2/folder the other outputs this (where ../../ resolves to the repo root)

cargo:rerun-if-changed=../../tools/crate3/some/file.foo
cargo:rerun-if-changed=../../
bradzacher commented 1 month ago

If you re-run with cargo check && CARGO_LOG_PROFILE=true CARGO_LOG_PROFILE_CAPTURE_ARGS=true cargo check then the traces will include the Package ID of the package that is causing this

I reran it and the profile says that tools/crate3 (the one with the relative rerun-if-changed path) is the one that's responsible for that massive block.

bradzacher commented 1 month ago

And it only just clicked that OFC the fact that it's dumping cargo:rerun-if-changed=../../ is what's causing cargo to watch the entire repo and crawl everything.

epage commented 1 month ago

I wonder if we should track how long rerun-if-changed took and produce a warning if its above a certain threshold. Unsure how often something like this would come up if it'd be worth it.

EDIT: Something cheaper than a warning is to just log the file walk we do for rerun-if-changed and then someone using the log profiling can find it.

bradzacher commented 1 month ago

So that rerun-if-changed came from tonic. TL;DR cos we build with both cargo and bazel - we need to make the two work in sync. Bazel runs everything from the repo root. So when we generate code from proto files using tonic we need to tell it to resolve the proto paths relative to the repo root (or else tonic would generate different code when run via bazel or via cargo).

So the tonic config includes this config:

        .compile(
            /* protos:   */ &["../../tools/crate3/some/file.foo"],
            /* includes: */ &["../../"],
        )?

Tonic automatically emits cargo:rerun-if-changed directives - hence it emitted cargo:rerun-if-changed=../../.

Thankfully tonic's builder includes emit_rerun_if_changed(bool) - so we can disable its default and manually emit a sane one instead (cargo:rerun-if-changed=tools/crate3/some/file.foo)

bradzacher commented 1 month ago

After making the above change (adding .emit_rerun_if_changed(false) and adding a manual emit) things are looking MUCH better:

image trace-1720761384226608.json

$ time cargo check --workspace --manifest-path ./Cargo.toml
1.66s user 0.33s system 74% cpu 2.666 total

πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰πŸŽ‰

That's more like it

weihanglo commented 1 month ago

Nice!

The follow-up question then will be: Should paths from rerun-if-changed be normalized?

I believe that could also resolve this issue because .gitignore is already respected. Note that it is discouraged for a rerun-if-changed to refer to paths outside package root, as it breaks the self-contained property of a Cargo package.

A normalization would happen here if we decide to go that route.

bradzacher commented 1 month ago

Following on from that - it would be good to have an error when you break encapsulation like this. I would suspect that in the vast majority of cases breaking encapsulation is not the intended result.

Forcing someone to emit like cargo::rerun-if-changed-allow-outside-package-folder=true or something to silence the error would help people catch accidental cases like this and provide huge perf wins!

Alternately - being able to lint for this (eg via clippy) would be good - but likely hard-to-impossible unless clippy can scan the target/debug/foo/output file for problems.

epage commented 1 month ago

We can't error because of compatibility. We are hesitant about doing things related to build scripts on Edition boundaries because of cases like tonic where you delegate your build script written in one edition to another package which could be written in a different edition.

As for a warning, we avoid those right now unless there is a use actionable way to disable it without changing their behavior. We'll soon have lint control which will would be a way to do so.

bradzacher commented 1 month ago
$ cargo check --workspace --manifest-path Cargo.toml
$ time cargo check --workspace --manifest-path Cargo.toml

before:

 3.89s user 42.38s system 58% cpu 1:18.62 total

after:

 0.15s user 0.12s system 84% cpu 0.319 total