rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0

`cargo vendor` pulls large amounts of unrelated crates. #11929

Closed WhyNotHugo closed 1 year ago

WhyNotHugo commented 1 year ago

Problem

When I run cargo vendor, it downloads a huge number of crates that are not dependencies of my codebase. The total size of the vendor directory is 230M, but the unnecessary dependencies account for over 182M (slightly under 80%).

cargo tree confirms that none of these are [transitive] dependencies for my project.

Most of these seem to be winapi and its subtree of dependencies. However, my code has nothing Windows-related and probably won't even run on Windows. I suspect that cargo assumes by default that it does, and that some transitive dependency depends on one of these (although I'm not sure why cargo tree won't list them).

Steps

  1. git clone https://git.sr.ht/~whynothugo/shotman
  2. cd shotman
  3. cargo vendor

Possible Solution(s)

Does cargo have a way of determining which targets my code actually builds on? This would be the ideal approach, so it can just ignore unnecessary dependencies. I don't think my code would build on Windows, since it uses some basic things like file descriptors and other Unix-only features.

Otherwise, being able to explicitly specify "this won't build on Windows" would be useful to avoid downloading all the dependencies that are specific to that target.

Notes

Since cargo tree doesn't list these crates as dependencies, there's also no obvious way for me to even figure out where they're coming from.

Version

> cargo version --verbose
cargo 1.68.2
release: 1.68.2
host: x86_64-alpine-linux-musl
libgit2: 1.5.0 (sys:0.16.0 vendored)
libcurl: 8.0.1 (sys:0.4.59+curl-7.86.0 system ssl:OpenSSL/3.1.0)
os: Alpine Linux 3.18_alpha20230208 [64-bit]

epage commented 1 year ago

Related: #7058

WhyNotHugo commented 1 year ago

Good reference, thanks. They're slightly different scenarios, but hopefully a common fix can address both.

WhyNotHugo commented 1 year ago

Any ideas how to make these show up in cargo tree?

epage commented 1 year ago

I wasn't sure, so I ran cargo tree -h

 --target <TRIPLE>       Filter dependencies matching the given target-triple (default host platform). Pass `all` to include all targets.

Looks like --target all will show it for you
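
For reference, one way to list the crates that only show up with --target all is to diff the two cargo tree outputs. The Python sketch below assumes that cargo tree --prefix none prints one "name vX.Y.Z" entry per line. The helper names crate_names, foreign_only, and cargo_tree are made up for illustration and are not part of cargo.

```python
import subprocess


def crate_names(tree_output: str) -> set:
    """Extract crate names from `cargo tree --prefix none` output."""
    names = set()
    for line in tree_output.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "[build-dependencies]".
        if not line or line.startswith("["):
            continue
        # Assumption: each remaining line starts with "<name> v<version>".
        names.add(line.split()[0])
    return names


def foreign_only(host_output: str, all_output: str) -> list:
    """Crates that appear only when every target is considered."""
    return sorted(crate_names(all_output) - crate_names(host_output))


def cargo_tree(*extra: str) -> str:
    """Run `cargo tree` in the current directory and return its stdout."""
    cmd = ["cargo", "tree", "--prefix", "none", *extra]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
```

Run from the project root, foreign_only(cargo_tree(), cargo_tree("--target", "all")) should return the crates that are pulled in only for non-host targets, which for this issue would presumably include winapi.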

WhyNotHugo commented 1 year ago

Thanks!

WhyNotHugo commented 1 year ago

It seems that even if all my top-level dependencies are defined as [target.'cfg(unix)'.dependencies], transitive dependencies which are marked as [target.'cfg(windows)'.dependencies] are still included in the output of cargo tree and cargo vendor.

weihanglo commented 1 year ago

Going to close this as a duplicate of #7058. They seem best discussed as a whole. Let us know if this is wrong and we can consider reopening.

WhyNotHugo commented 1 year ago

Sure, hopefully a common solution will fix both issues.

eslerm commented 10 months ago

hi @weihanglo o/

I believe this issue is broader than single-platform vendored packages, which is what #7058 covers.

If you make a simple hello world package with a dependency on clap, running cargo vendor will pull in a bunch of clap's dependencies that the hello world base package won't actually use.

This issue seems to be described on https://wiki.ubuntu.com/RustCodeInMain:

> It’s a simple matter of running cargo vendor where your on the top-level directory. Sadly, it’s not possible to exclude irrelevant dependencies during vendoring yet, so you might want to automate that step and add some post-processing to remove voluminous, unused dependencies, and/or the C code for some system libraries that could be statically linked.

Possibly an analysis of the AST could determine which dependencies the base package actually uses, to debloat vendored packages.

weihanglo commented 10 months ago

@eslerm Cargo knows nothing about the Rust AST, so this is out of scope for Cargo right now.

eslerm commented 8 months ago

@weihanglo could this issue please be re-opened as unresolved?

I understand that this feature might not be added. By having an accurate issue, distros with vendored Rust packages can be pointed here :pray:

epage commented 8 months ago

They can still be pointed here.

Re-opening miscommunicates our intent to the community and makes it harder to track our backlog.

WhyNotHugo commented 8 months ago

See https://github.com/rust-lang/cargo/issues/7058

eslerm commented 8 months ago

Thanks everyone.

#7058 has a narrower scope and doesn't fully overlap with this one, but resolving it would help immensely. (A lot of bandwidth is being spent vendoring unnecessary Windows crates for Linux software.)

My hunch is that a secondary tool will be needed to lint the results of cargo vendor.
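
As a rough illustration of what such a secondary tool might report, the Python sketch below walks a vendor directory and lists the entries whose crate name is not in a given set of needed crates, along with their size in bytes. The function names and the needed set are assumptions for illustration. The version-suffix regex is only a heuristic for the name-version directory naming that cargo vendor uses when it has to disambiguate versions. The sketch only reports; it deletes nothing, and whether removing these directories keeps the build working is not established here.

```python
import re
from pathlib import Path

# Heuristic: matches a trailing "-<major>.<minor>.<patch>..." suffix on a
# vendored directory name (used to disambiguate multiple crate versions).
_VERSION_SUFFIX = re.compile(r"-\d+\.\d+\.\d+.*$")


def vendored_crate(dir_name: str) -> str:
    """Strip a trailing version from a vendor directory name, if present."""
    return _VERSION_SUFFIX.sub("", dir_name)


def unneeded_dirs(vendor: Path, needed: set) -> list:
    """Return (directory name, size in bytes) for crates not in `needed`."""
    report = []
    for entry in sorted(vendor.iterdir()):
        if entry.is_dir() and vendored_crate(entry.name) not in needed:
            # Sum the sizes of all regular files below this directory.
            size = sum(f.stat().st_size for f in entry.rglob("*") if f.is_file())
            report.append((entry.name, size))
    return report
```

The needed set could come from the host-only cargo tree output, so that unneeded_dirs(Path("vendor"), needed) shows which vendored crates are candidates for filtering and how much space they take.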

A colleague pointed out that an AST-based approach would be complicated, as different cfg() parameters would need to be accounted for, and that projects might need to be patched to use resolver = "2" so that cargo won't look at cfg'd-out dependencies. There's a related discussion about this that I'll edit into this comment when I find it.

edit: https://poignardazur.github.io/2021/02/15/rust-wishlist-better-cfg/

eslerm commented 5 months ago

GNOME Snapshot removed over 200M of unnecessary vendored crates with https://github.com/coreos/cargo-vendor-filterer

https://gitlab.gnome.org/GNOME/snapshot/-/commit/e52debce683338bce43a6150edb07e8c30efc617