Cargo should understand distinctions between public/private dependencies

sfackler commented 9 years ago

Current report

Dependencies for crates today can be separated into two categories: those that are implementation details and those whose types are present in the public API of the crate. Cargo does not support distinguishing these two cases today, but it can often be crucial for dependency management.

If a dependency is an implementation detail (i.e. no types are exposed) then it can be easily duplicated in the dependency graph without worry (other than concerns like binary size). In other words, Cargo's safe to have multiple semver-incompatible versions of this dependency because they won't talk with one another.

If a dependency is part of the public API, however, then it's critical that any crate which can share this type it needs to all be of the same version. In other words, Cargo's logic of allowing multiple semver-incompatible versions is almost always guaranteed to go awry.

If Cargo were to understand a distinction between these private/public (or internal/exposed) dependencies then it could solve problems like:

Provide much nicer resolution error messages rather than relying on the compiler to figure this out.
Restructure the crate DAG to ensure that dependencies which work with multiple major versions end up working with other crates that only require one major version.
Ensure that crate resolution never picks too many versions of a crate and instead forces common dependencies to all have the same version.
Original report

Cargo is overeager to pull in multiple copies of a crate

Say we have a crate with the following Cargo.toml:

[package]
name = "foo"
version = "0.1.0"
authors = ["Steven Fackler <sfackler@gmail.com>"]

[dependencies]
postgres = "0.9"
r2d2_postgres = "0.9"

r2d2_postgres version 0.9.3 depends on postgres version 0.10, while version 0.9.2 and lower depend on postgres version 0.9. Cargo would ideally pick r2d2_postgres version 0.9.2 and postgres version 0.9.6, but it instead picks r2d2_postgres version 0.9.3, which pulls in postgres version 0.10, as well as postgres version 0.9.6. This ends up totally blocking any use of these dependencies.

This issue rather significantly impedes crates' abilities to upgrade their dependencies.

It seems like Cargo's resolution logic should have the property that if there exists a resolved dependency graph with no more than one copy of each dependency, it should never pull in more than one copy of a dependency. Unfortunately, questions like "does such a resolved dependency graph exist" are NP-complete, but I suspect that it's tractable in practice (we don't expect crates to be randomly permuting their dependencies every minor version, for example).

In the meantime before someone has a chance to integrate a SAT solver into Cargo, I think there are some things we can do to help users correct Cargo when it fails to resolve things "properly". For example, cargo update could warn whenever it has to pull in multiple copies of a crate, identifying the portion of the object graph that forced that to happen. This should provide users an easier starting point to poking the graph into shape with cargo update --precise.

sfackler commented 9 years ago

The SAT solver reference was not a joke btw - the yum package manager uses a SAT solver to do its job.

alexcrichton commented 9 years ago

Yeah I've been worried about this in the past as well. Right now the resolver is kinda just a pile of heuristics and then it returns the first solution it always finds (despite there being many, in situations such as this).

It may be the case that we could avoid a full SAT solver by applying smarter heuristics for the time being (e.g. we'd need some sort of "scoring system" for any solution from a SAT solver anyway), and that may also keep the resolution stage speedy.

alexcrichton commented 9 years ago

cc @wycats, curious on your thoughts about this

sfackler commented 9 years ago

Yeah, the scoring system is what I'm most unsure about. It should prefer newer versions of libraries to older versions and try to minimize the number of copies of dependencies, but how those should interplay and factor into the solver seem like pretty nontrivial questions.

SSheldon commented 8 years ago

I also noticed that, with a setup like:

[package]
name = "foo"

[dependencies]
bar = "*"
libc = ">= 0.1, < 0.3"

[package]
name = "bar"

[dependencies]
libc = "0.1"

cargo will pull in multiple copies of libc at version 0.1 and 0.2, even though the requirements can be satisfied with just 0.1. (Same issue happens with a wildcard dependency.)

Is that the same issue as here? It seems related but like a fix might be different.

alexcrichton commented 8 years ago

Yeah I think it's basically the same issue as this.

SSheldon commented 8 years ago

I think this issue really limits the usefulness of dependencies in crates and makes life hard for library authors. Particularly, it makes it dangerous to use any types in your public API that are defined by another library.

To use these types in your public API, users must add the other library as one of their dependencies. The hard part, however, is that if these dependencies are at different versions it won't compile.

For example, my objc crate works fine with either libc 0.1 or 0.2. I described the dependency as such in my Cargo.toml and assumed that being more widely compliant would be best so that the dependency doesn't conflict with other libraries that require a specific version. However, what's happened is that cargo always chooses to satisfy that dependency with 0.2, even when the user is using libc 0.1, resulting in a conflict and compiler error.

The result is that, currently, you're forced to choose a specific version for your crate and make the user use it, as well. If a dependency releases a new version, allowing use of it requires dropping support for the old version and publishing a breaking change, splitting your users and meaning the older versions stop receiving updates. This is frustrating for users, because they must ensure all their dependencies use the same version of a crate and they will have to update their dependencies more often as there are more frequent breaking changes.

Not to get melodramatic, but I think easing these multiple version issues will be important for rust and cargo to be pleasant to use in the future. I'm hoping this issue is on the radar of folks smarter than me and is being thought about, sorry if this turned into a bit of a rant.

huonw commented 8 years ago

@SSheldon I believe people should be able to force the use of specific versions with cargo update -p NAME --precise VERSION, fwiw. (Not as nice as changing cargo's behaviour, but it does mean things shouldn't be fundamentally breaking/require dropping support.)

SSheldon commented 8 years ago

@huonw yeah, it can be avoided if you're careful with your Cargo.lock. Though libraries are recommended to not commit that, so each new user would have to be required to do this.

Maybe another that'd help here would be a way to say in your Cargo.toml, for example, "I want to use the objc crate and I want it to use libc 0.1"? Currently I don't know of any way to do that besides overriding the Cargo file of your dependency.

graydon commented 8 years ago

I've been playing with hooking the Z3 SMT solver up to rust recently and realized this bug exists in cargo, so I sketched out a little semver-solver using it. I know it's a bit like breaking a walnut with a tank (Z3 is a sort of intense 14MB C++ library) but since Z3 (as of v4.4.1) also contains a numerical optimizer it's really trivial to express a mixed constraint-solving / optimization problem with it. It's just a bit of plumbing, maintaining a mapping from packages to Z3 terms, package-versions to integers, and expressing version constraints as integer inequalities on the terms.

Code's here: https://github.com/graydon/z3-rs/blob/master/tests/semver_tests.rs#L158-L267 Crate's here: https://crates.io/crates/z3/

sfackler commented 8 years ago

That's awesome!

erickt commented 8 years ago

SUSE's ZYpp uses libsolv. Yum might also switch to it in DNF. It's only 1.2MB of C, so it's smaller than Z3 :)

wycats commented 8 years ago

For what it's worth, the problem of supporting duplicates makes the exact constraints (and their weights) a little bit more complicated than the usual approach to this problem.

In most versions of this problem, dependency "conflicts" simply fail to resolve, but we have another option available to us. That said, over time, I've increasingly come to believe that we were too willing to use that solution in Cargo, and we'll probably ratchet down our usage of allowed-duplicates somewhat.

graydon commented 8 years ago

Yeah, I think cargo casually permitting multiple copies of a dependency in a project is ... a dangerous default. It needs to be permitted (and we engineered the symbol mangling scheme to explicitly support it) but it's also relatively likely that any given instance of it is a bug. A potentially-very-subtle bug, like "two instances of a library that both think they are in charge of talking to OS subsystem / native library foo"

I think it should probably default to not-working, and require opting in. Though it might also break the ecosystem today if you set that to be the case. I'd be interested to run the experiment on the existing package set, see how many of them can't build with their existing dependency declarations, and if so how much work it'd take / how many duplicate-opt-ins or dependency-range-adjustments to fix.

wycats commented 8 years ago

@graydon I think it's a little more subtle than that.

"Shared" dependencies (dependencies whose types are used in other pub types, or which are transported to other crates through type inference) must not be duplicated.

"Internal" dependencies (dependencies whose types are only used internally) can be safely duplicated (but you might not want to for other reasons, like binary size).

We've talked some about how packages could say that a dependency was "internal", which would then permit duplication and error out if the type was actually made public. Even if we could perfectly detect this, it would still be important to explicitly opt in, because switching from an "internal" to "shared" dependency would be a breaking change to downstream crates.

graydon commented 8 years ago

Hm. I think either we're talking past each other, or I disagree with your interpretation here, or something's changed I don't understand in the problem space. As far as I know (and as it was initially designed), if crate X exports type T, two copies of X (with the same or different versions, doesn't matter) can both be imported independently in the same composite crate w/o any linkage or type system problems. Their types should be given independent, disjoint identities in any composition of them.

The risk is only, as you say, in binary-size and (imo more seriously) violating uniqueness assumptions each copy of X might assume, such as "I'm the one that calls the init routine on a C library I'm linked against" or "I'm in charge of handling sigusr1" or whatever.

If I'm wrong about this, something has .. significantly changed in the design. What part do you think goes wrong?

wycats commented 8 years ago

Here's a simple example of the problem I'm talking about:

a time crate exports pub struct Time { ... }
a format_time crate depends on time and uses time::Time in its signatures
a fancy_time crate provides helpers for producing a time::Time

Now let's say another crate (datetime) depends on format_time and fancy_time. If it tries to pass a time::Time provided by fancy_time into format_time, they better be talking about the identical type.

It's possible for datetime to see two different time::Times and that part will work fine, but one of the primary goals of package naming and versioning (perhaps the primary goal) is to make it possible for humans to compose packages like this and have any hope of things working.

In this case, both fancy_time and format_time are treating the time dependency as a shared dependency, which is the most direct way I know of expressing this constraint ("they better both be talking about identical types").

graydon commented 8 years ago

Ok, that's .. definitely an instance where you'd want non-duplication. And I can think of others! I mentioned some above. But it's not a hard requirement of any of the parts in the system that "public types" implies "can't duplicate", right? I'm just trying to get clear on what you're arguing. Last time I discussed this (in the context of linkage), I was assured that cargo passed a disambiguating sort of name/ver/source triple of metadata to rustc for each crate in a project, such that two copies of the same symbolic-named crate wouldn't clash at metadata-load or link time. I hope that's still true, even if I think it should not be the default behavior.

To be clear on what I'm arguing:

I think probably most users think shared resolution is already happening, consider duplication a bug
I think you could run the experiment on crates.io to see how badly things break defaulting to shared
If there's not much breakage, consider making it default
If there's a lot of breakage:
- I suspect it's because of semver misuse as much as anything
- I suspect downstream users will also be sympathetic to fixing semvers / opting in to duplication in cases they can't work around it
- If it's too painful to set as a default, I think it should at very least become a noisy warning. Binary size alone makes it worth mentioning, and I think there are serious and subtle bugs that come along with it.

sfackler commented 8 years ago

I was assured that cargo passed a disambiguating sort of name/ver/source triple of metadata to rustc for each crate in a project, such that two copies of the same symbolic-named crate wouldn't clash at metadata-load or link time.

Yep. Cargo will do that so there won't be any linker errors, but you will run into compile errors like "Expected type foo::Bar but got type foo::Bar" when the two versions of libfoo end up interacting.

graydon commented 8 years ago

Right. Exactly what I'd expect! Good.

Also deeply confusing to any user, no doubt, who didn't ask for it. Hence why I'd make it default to not happen :)

wycats commented 8 years ago

But it's not a hard requirement of any of the parts in the system that "public types" implies "can't duplicate", right?

No part of the system mandates this, but if connected parts of the dependency graph get ahold of same-named types from different versions of the same crate, you'll end up with dependency graphs that resolve but can't compile.

That's possible today, but a long-term goal of mine for Cargo is to make resolves-but-does-not-compile errors impossible (or at least as close to impossible as we can manage).

I think probably most users think shared resolution is already happening, consider duplication a bug

Right, but where sharing is not required, because a crate is used purely internally, allowed duplication is a very nice property to be able to opt into. In my opinion, "this crate is being used purely internally" is the nicest way to express what you mean in a way that Cargo/Rust can help you with. It will also mean that future mistakes (or contributions!) that change the assumption of "internal only" have a prayer of being understood as take mistakes they are.

If there's not much breakage, consider making it default

I totally agree that we should consider making it the default. I think we may be in violent agreement. The main thing I'm arguing is that the opt-in should be of the form "I'm using this crate internally" and not "allow duplication", which is an easy knob to smash on, but doesn't communicate the real constraint very well.

I think there are serious and subtle bugs that come along with it.

I strongly believe that we can decouple the bugs caused by inappropriately sharing dependencies from concerns like binary size (which are also very real).

graydon commented 8 years ago

if connected parts of the dependency graph get ahold of same-named types from different versions of the same crate, you'll end up with dependency graphs that resolve but can't compile

Can you spell out a (further) example of this, just for the sake of my understanding? I'm not sure what exactly it means and I suspect it's important!

Sorry for all the back and forth, I really appreciate you illuminating the topic. The set of ways this stuff can go wrong (not to mention set of ways users need legitimate escape hatches / variable behavior) is .. extensive!

wycats commented 8 years ago

Can you spell out a (further) example of this, just for the sake of my understanding?

A recent example of this was a seemingly-minor change to the libc crate: We made a "minor change" (iirc, making the char typedef unsigned by default on some platforms) which reasonably qualified as a bugfix. But the libc crate was used by a huge number of other low'ish level crates, including use inside of other public structs.

Duplication meant that there were now multiple versions of libc::c_char floating around as people starting to upgrade their libraries to the new dependencies. If a struct containing a 0.1.x c_char was used with a library expecting a 0.2.x c_char, it quite often resulted in a compilation error.

The real problem here was that all of these crates were sharing c_char, but Cargo was (implicitly, in the current semantics) treating them as internal dependencies, allowing them to be freely duplicated.

I agree with you: the default should be "shared" if we can get away with it, and people should opt in to "internal" only if they are willing to make that commitment and expressly want to allow duplication as a conflict resolution strategy. At the same time, we should provide at-least-lints that help detect dependencies that are declared internal but are actually shared.

graydon commented 8 years ago

Oh ok, this is just an example of duplication gone wrong. I thought you meant "resolve but don't compile" meaning "resolve to a shared version, but don't compile". I think that can also happen if, for example, the semver constraints are wrong. I was wondering if you meant that. Anyway, yeah, I think this is all violent agreement :)

wycats commented 8 years ago

Oh ok, this is just an example of duplication gone wrong. I thought you meant "resolve but don't compile" meaning "resolve to a shared version, but don't compile"

Ah, in this case I just meant: "the dependency resolver succeeds, but the resulting dependency graph fails to compile", which is a general class of errors I think we can eventually avoid (with some work).

wycats commented 8 years ago

Anyway, yeah, I think this is all violent agreement :)

Yay! :tada:

erickt commented 8 years ago

@kmcallister apparently just implemented a pure rust sat solver. Maybe it could be used here?

graydon commented 8 years ago

Sadly, that's just an interface to an external SAT solver (or a family of them -- those that speak DIMACS input format), not a solver itself. It's similar to the thing I built to talk to Z3, except running out-of-process and just solving, not optimizing. Z3 also speaks DIMACS input format, but again, that format doesn't express optimization problems so it's kinda not ideal (you have to run the solver in a loop crawling towards an optimal solution). The semver test I put in the Z3 repo offloads all the optimizing to the solver itself.

nixpulvis commented 8 years ago

Even if the real solution to this is hard, can we make some warnings or something in common cases. I just wasted like 4 hours not realizing I had 2 versions of serde in my project causing a trait not implemented error.

schuster commented 8 years ago

Is there something an outsider like myself can do to help move this issue forward? Perhaps finding an appropriate solver to use internally, or running the shared-only experiment on crates.io that @graydon suggested, or even just implementing a warning like @nixpulvis mentioned?

sfackler commented 8 years ago

I certainly think a warning when updating the dependency graph would be appropriate at least in the short term. I'd probably phrase it as a "by the way" kind of message rather than "OH NO" kind of message since there are may cases where multiple versions can and do coexist peacefully. It pops up pretty frequently with things like bitflags and byteorder.

Further experiments with solvers and objectives also seems like some work that would be great as well. A cargo-update2 or whatever binary crate that could be used in place of normal cargo update would be a great way to experiment with this kind of thing.

schuster commented 8 years ago

Great! I'll start on the warning message to get familiar with the codebase. Once that's done, I can start looking at some of the more complex parts, like using external solvers.

viraptor commented 8 years ago

While waiting for cargo itself get something like that, I submitted a PR to cargo-tree which at least shows such issues (and where they come from). A quick debug tool for people affected by this issue.

rix0rrr commented 8 years ago

It would be even more awesome if Cargo could detect that a crate publicly exposes types from a dependent crate. Let's call a dependency that re-exports part of the crate a "public dependency".

If so, then the versions of the depended-upon package should be the same between all packages that publicly depend on it, in the part of the dependency tree that is bounded by the absence of public dependencies on that package (either because the dependency is private, or because there's no dependency at all).

That would solve the diamond dependency problem in the situation where multiple crates depend on a shared package, let's say time, and they expose part of that crate's API in their own API. Cargo would then be able to select an appropriately matching single version across all packages that need to share data structures from a crate, yet be free to pick a different version in a different part of the dependency tree.

Of course, the shared version that is picked is subject to the union of all version constraints.

gyscos commented 8 years ago

More than just versions, the feature set of each crate is another property that better identical when shared. Though in this case, cargo should just give an error/warning message, as we probably shouldn't randomly add/remove features just to fit the dependency graph...

joepie91 commented 8 years ago

Just pitching in here after a somewhat lengthy discussion about nested and flat dependency models in #rust-beginners, that eventually led me to this thread.

Assuming that there is no better solution to make two otherwise non-compatible versions of a library interact with each other - I'm very new to Rust, so I have no idea if there is - @rix0rrr's suggestion of analyzing for exposed dependencies seems like a good idea to me.

The ability to internally use conflicting versions of dependencies greatly helps to build a modular ecosystem, as it removes the exponential increase of "dependency conflict risk" for every dependency you add to your project - something that has historically led to the emergence of monolithic libraries and frameworks (and a lot of wasted effort as a result).

I understand that Rust isn't quite as 'loosely defined' as eg. JavaScript and so it can be hard to implement a dependency model like that of NPM directly, but making a distinction between 'public' and 'internal' dependencies seems like it'd provide the best of both worlds.

Aside from this, another topic discussed on IRC was how to deal with compatible structs across otherwise incompatible versions of dependencies. This idea already have been implied in the posts above, but I'm going to spell it out anyway, if only for the benefit of other beginners stumbling upon this thread:

Say that there is a dependency called awesome_time that exports a Time struct, and also provides a from_unix_timestamp method. Now say that there is an awesome_time@1 and awesome_time@2 version - the major version bump is due to the signature for from_unix_timestamp having changed, but the Time struct is still the same.

Obviously if different libraries use different major versions of awesome_time, they will still be dealing with the same Time struct signature, and the user will want to be able to exchange Time structs between different versions of awesome_time. The solution in this case would be to not include the Time struct into the awesome_time library at all, but rather publish it separately as an independently versioned package (time_struct), and making both awesome_time versions depend on it.

Combined with the suggestion of analyzing "public/exposed" vs. "internal" dependencies, this could result in the following conclusion:

time_struct is an exposed dependency, and only one version may exist.
awesome_time@1 and awesome_time@2 are internal dependencies, and can safely coexist.

In this case, even if "exposed dependency conflicts" were explicitly disallowed, things would still work out fine.

This would not cover the case of time_struct gaining a major version bump, but that would be a relatively rare case - and can perhaps be covered by analyzing where dependencies are exposed, and allowing conflicting exposed dependencies as long as they exist in isolated parts of the application.

I've rambled on a bit and this might be too complex to implement in the short term (if at all possible - again, I'm very new to Rust, so the constraints of the language and compiler are not entirely clear to me yet), but I figured it'd be worth bringing up.

alexcrichton commented 8 years ago

@joepie91 thanks for the writeup! That all sounds like it's aligned with the plan for an eventual set of public dependencies and private dependencies, which this issue is sort of a misleading placeholder for at this point :)

joepie91 commented 8 years ago

@alexcrichton Hehe, indeed. Unfortunately, it does not appear that GitHub has a way to have a "current thread status summary" feature, short of shoehorning it into the initial comment. That might be a good idea, though, even if just as a brief indicator that the focus of the thread has shifted?

Or perhaps branch off into a new issue specifically about the distinction between exposed/internal dependencies? For what it's worth, I do think "exposed/internal" would be clearer terminology than "public/private", as the latter might imply either secrecy (like a "private upload") or the ability/need to explicitly define it (like a "public member").

alexcrichton commented 8 years ago

@joepie91 good point, I've updated the title and description.

brson commented 8 years ago

@alexcrichton Are you actively working on this P-high bug?

alexcrichton commented 8 years ago

@brson not currently, I'll remove from P-high for now.

cc @wycats

jeffparsons commented 8 years ago

Just throwing in my 2c worth of bad experience with this issue:

A couple of days ago, I ended up building multiple versions of hyper, because iron re-exports a bunch of its types, but I also wanted a macro that is only exposed by hyper itself. I unwittingly used a newer version of hyper than the one my version of iron was re-exporting from, and ended up spending half an hour chasing a spurious type mismatch because I was trying to mix types from different versions of hyper.

I initially assumed that I had misunderstood the semantics of re-exports, because rustc was effectively telling me "these are different types" without understanding the context of why they were different types. I thought I was going crazy.

kornelski commented 8 years ago

Before a proper solver is integrated, could this be partially solved with a heuristic? or deduplication after resolving?

Currently for me it fails on a trivial case where there is only one "conflicting" package and >= 0.2.0 and 0.4.0 are resolved to two different versions.

alexcrichton commented 8 years ago

@pornel I do believe so, yes, but it may be pretty non-trivial to implement (haven't thought too deeply)

gwillen commented 8 years ago

I'm just chiming in because I recently discovered that cargo will silently compile multiple versions of the same crate into the same binary, and this feels weird and scary to me, and I see that a warning was proposed but rejected.

I'm concerned that it doesn't seem like people are necessarily all on the same page with respect to @graydon's objection, which I (a total Rust outsider, I grant you) completely agree with -- it's not necessarily safe to link a crate twice EVEN if the linkage is private, and if it's not safe, the resulting bugs will be very subtle at best.

I like @graydon 's example of "two instances of a library that both think they are in charge of talking to OS subsystem / native library foo", which is disastrous EVEN if there's no type clash for the linker to detect because both copies are private. The best example I could come up with was a logging crate -- suppose that it by default opens a logfile whose name is, say, the name of the binary it's in, plus some other stuff (the string ".log", the current time, the pid, etc.) Then two private copies of the crate will open the same file, and the resulting behavior will be the worst kind of totally inexplicable, and probably someone will tear their hair out for weeks trying to debug it.

The pull request for the rejected warning mentioned that the sys crates are "highlanders" -- they enforce non-duplication -- and that servo has a tool for catching duplicate crates. This seems like evidence to me that some important cases require avoiding this currently-unavoidable behavior.

jethrogb commented 7 years ago

So there's been some breakage this weekend with the release of openssl-sys 0.9.0, due to the requirement that native libraries are linked by only one crate. Any package that depends on git2 ^0.4 no longer builds out of the box due to two versions of openssl-sys being pulled in. Can we get a SAT solver already pretty please? :)

sfackler commented 7 years ago

Some interesting discussion of these issues: https://research.swtch.com/version-sat

gwillen commented 7 years ago

A question about the openssl-sys case -- do we actually need a SAT solver to fix it, or would ad-hoc adjustment to use the highest version of the dependency have been sufficient? (That is, did the package depending on the lower version specifically exclude the higher?) It sounds from @sfackler's link like other systems may perform this ad-hoc adjustment (and that it is in fact necessary to solve the non-NP-complete case correctly.)

jimmycuadra commented 7 years ago

Any progress on this in 2017? It just came up again for one of Ruma's users who ended up with three different versions of a public dependency in their program and resulted in confusing error messages. Does this issue fit into any of the 2017 roadmap issues? Better error messages from the compiler, which make it clearer that there are conflicts between versions, could go a long way in helping new users figure out what the problem is before there is a more formal solution to public/private dep management with Cargo.

Mark-Simulacrum commented 7 years ago

I believe https://github.com/rust-lang/rfcs/pull/1977 is being relatively actively worked on.

rust-lang / cargo

Cargo should understand distinctions between public/private dependencies #2064

Current report

Original report