Open ehuss opened 3 years ago
I also feel not right to replicate the logic. Love to know the downsides about reuse UnitGraph
. Could you explain more?
The main problem is that UnitGraph is based on compilation units, whereas cargo tree
wants the dependencies between packages. There are dependencies in UnitGraph that need to be filtered out.
For example, all the intra-package dependencies would need to be removed. Also, UnitGraph has multiple roots for a package (for example, if the package has multiple bins). Another example is build scripts, where the "run" unit depends on all transitive build scripts being run. Those are not real dependencies to be displayed. And conversely there is information missing from UnitGraph (like which dependencies are build-dependencies or dev-dependencies). That information is lost since the Resolve is discarded after the graph is computed.
Another problem is that the Graph built for cargo tree
has a fundamentally different structure, and changes based on command-line options. All that code for build the graph will probably need to stay to handle all of that.
A concern is that attempting to share the code will end up needing more code to handle the impedance mismatch, and will be harder to maintain.
So, I think this is a fairly hard problem, mostly trying to balance the maintainability of the code. I don't really know what the right solution is.
Hello. I would like to pick this task up. I will timebox 1 week for this, before I start my new job. If I haven't made any progress by then, I will park it and let someone else have a go.
(for context: I am currently using a copy-pasta of cargo::ops::tree::graph::Graph
in my cargo-quickbuild prototype, and its proc-macro handling is causing me to calculate the wrong sets of features for some transitive dependencies. This causes fingerprint mismatches, so cargo build
then recompiles large swathes of the crate graph and ruins all my fun)
What should I be aiming for here? One possible approach might be:
UnitGraph
to a cargo::ops::tree::graph::Graph
cargo::ops::tree::build()
with the one that goes via UnitGraph, and make sure its tests still pass.This is quite an invasive approach, and probably isn't feasible in 1 week. I know I could just try to fix my immediate problem as a one-off, but it feels like it will be a game of whack-a-mole (see also https://github.com/rust-lang/cargo/issues/10651). Having a fuzzer would let me feel much more confident in the correctness of any fix I write).
I wonder if we could write the UnitGraph -> Graph conversion function from (1) (but not its inverse), and then write a fuzzer that sets up a random cargo workspace, and then checks that cargo::ops::tree::build()
(with a very specific set of command-line options) matches the output from the UnitGraph -> Graph conversion function.
I haven't quite put my finger on what a random workspace generator would look like, or whether it would be able to cover all of the cases that we care about.
Does anyone have any thoughts, or should I just have a crack at it and see how far I get?
Looks like a big plan! Previously I was asking using unit graph instead, but the reply from ehuss made me wonder it may be really challenging.
I would recommend experimenting on what you list in the first step, and then assert against the current output first. Fuzzer/protest can be added after we're all happy with the change.
I cannot guarantee things will get merged, though it is still great to explore any possibility to reduce code complexity and discrepancy between cargo-tree
and other build commands :)
Thanks for your feedback. I will be working in the open on #11379, if you want to follow along (it's not very interesting at the moment though).
I would recommend experimenting on what you list in the first step, and then assert against the current output first. Fuzzer/protest can be added after we're all happy with the change.
Yeah, it looks like there are already quite a lot of cargo tree
integration tests, so I should definitely get those passing before I think about fuzz tests.
I cannot guarantee things will get merged, though it is still great to explore any possibility to reduce code complexity and discrepancy between cargo-tree and other build commands :)
That's fine - I need to build something like this for cargo quickbuild
anyway, so it won't be wasted effort.
I wonder is this why I got such strange output from cargo tree -d --format "{p} {f}"
for such Cargo.toml:
[dependencies]
async-compression = "=0.3.15"
nmea = "=0.2.0"
[build-dependencies]
bindgen = { version = "0.64.0", default-features = false, features = ["runtime"] }
memchr v2.7.2 alloc,default,std
└── async-compression v0.3.15 default
└── foo v0.1.0 (/private/tmp/foo)
memchr v2.7.2 alloc,std
└── nom v7.1.3 alloc,std
├── cexpr v0.6.0
│ └── bindgen v0.64.0 runtime
│ [build-dependencies]
│ └── foo v0.1.0 (/private/tmp/foo)
└── nmea v0.2.0 default,std
└── foo v0.1.0 (/private/tmp/foo)
according to cargo tree
each runtime dependency async-compression
and nmea
will use memchr
with different feature sets.
This is how it caculates duplicates currently, so by the implementation a dependency with a different feature set is considered a duplicate.
Problem There are some situations where
cargo tree
will mark a package as a duplicate(*)
when it probably shouldn't (and shows the wrong features with--no-dedupe
). This arises when a package is built twice with the same features, but with different dependencies.A real-world example is when looking at the following:
Running
cargo tree -f '{p} {f}'
results in:The problem is the diesel-issue → diesel_migrations → migrations_macros → migrations_internals. It has a
(*)
to indicate that it is duplicated. However, migrations_internals is built twice, and this is deceiving because the second migrations_internals has a dependency on diesel with no features.Running with
--no-dedupe
is even worse, because it shows the wrong features fordiesel
under migrations_internals.Steps The following is an example in Cargo's testsuite format:
Possible Solution(s) The issue is that
cargo tree
doesn't have the same logic that was added in #8701 to accommodate dependencies that are the same, but link to different dependencies.I have toyed with the idea of changing
cargo tree
to use theUnitGraph
computed for a normal build instead of trying to recreate how some of these things are computed. There are some complexities and downsides to that approach, but I continue to feel that trying to replicate this logic in multiple places is not a good idea.Notes
Output of
cargo version
: cargo 1.54.0-nightly (44456677b 2021-06-12)