rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0
12.76k stars 2.42k forks source link

"cargo install" apparently ignores "Cargo.lock" as opposed to "cargo build" #7169

Open df5602 opened 5 years ago

df5602 commented 5 years ago

I've encountered a strange issue today.

Background: The percent-encoding crate recently made a new major release. Knowing that stuff could break I held off on updating the crate because I haven't yet had time to check what changed.

Problem When I compile my binary crate using cargo build (with and without --release) everything builds fine. However, if I compile my binary using cargo install --path . -f, compilation fails with an error related to percent-encoding (the breaking change warranting the major version bump I assume..). In the output I see that it compiled percent-encoding v2.0.0 instead of percent-encoding v1.0.1 as specified in Cargo.lock. (Note: The Cargo.toml doesn't specify a specific version.)

The crate is the following, if you want to try at home: https://github.com/df5602/bingers

Steps

  1. $ cargo clean
  2. $ cargo build
    ...
    Compiling percent-encoding v1.0.1
    ...
    Finished dev [unoptimized + debuginfo] target(s) in 48.32s
  3. $ cargo build --release
    ...
    Compiling percent-encoding v1.0.1
    ...
    Finished release [optimized] target(s) in 1m 25s
  4. $ cargo install --path . -f
    
    Installing bingers v0.1.0 (/path/to/bingers)
    Updating crates.io index
    Compiling percent-encoding v2.0.0
    Compiling bingers v0.1.0 (/path/to/bingers)
    error[E0432]: unresolved import `percent_encoding::QUERY_ENCODE_SET`
    --> src/tvmaze_api.rs:10:45
    |
    10 | use percent_encoding::{utf8_percent_encode, QUERY_ENCODE_SET};
    |                                             ^^^^^^^^^^^^^^^^ no `QUERY_ENCODE_SET` in the root

error: aborting due to previous error



I can work around it by specifying an explicit version in `Cargo.toml`, but the observed behaviour was surprising to me...

**Notes**

Output of `cargo version`:

`cargo 1.36.0 (c4fcfb725 2019-05-15)`

OS: Mac
est31 commented 2 years ago

It's extremely annoying when there is chains of dependencies each doing a pre-release, and depending on the pre-release of the following one. Because then, basically any delay in API breaking upgrades along that chain causes breakages for users at the end, unless they have a checked in lockfile. The rust-windowing crates used to do this a few years ago and it was extremely annoying for me as downstream. The only remedy is using = but the entire chain has to do that.

Maybe this is a documentation issue, but it's one of those cases that are easy to get wrong and easy to cause lots of downstream damage, so one can argue that it's more than that.

djc commented 2 years ago

Please, everyone, let's not confuse these two issues:

Commenting more with your experience or how this is bad for making robust binary releases on crates.io or for supply chain security does not help at this point, as those issues are well understood by the Cargo team. See earlier comments from a team member in https://github.com/rust-lang/cargo/issues/7169#issuecomment-613034396 and https://github.com/rust-lang/cargo/issues/7169#issuecomment-539233845. If someone wants Cargo to make progress on this issue, please make a credible proposal trading off the subtleties for this issue. Your proposal should likely have a way to (a) avoid a flag day and enable both carefully phasing this in, changing the default at a later date and (b) a way to allow maintainers to opt in to the new behavior sooner, and maybe also (c) just some hints in Cargo output when one of the unexpected thing happens.

est31 commented 2 years ago

@djc you are right, the last discussion was, while still in the general "how to avoid breakage for users" setting, offtopic.

My proposal is upthread, in the second paragraph of this comment, it got only one response as far as I can tell. The basic idea is to have some mechanism of crates.io running a CI for crates that updates their lockfiles. Not the ones in the .crate file, as that would invalidate the hash, but a separately maintained one that cargo can download.

The advantage is that non-breaking security updates would still arrive in the ecosystem, while minimizing breakage.

Building each binary crate each day would be a massive and mostly wasteful effort, but maybe the build/check frequency could change with a crate's popularity. A crate that gets downloaded less than one time per week does not need daily updates, and I'd wager this is most crates. On the other hand, my cargo-udeps crate gets hundreds of downloads per day (probably most of it non-caching CI usage), so giving it more regular updates than some crate that gets 1 download per week makes sense. There are crates with even more daily downloads.

Designing a CI system is a complicated major effort, but Rust already runs CI on crates regularly through crater, as well as through docs.rs.

There would also be the need for some kind of notification mechanism if there is a build failure. Basically replacing user reports with automated ones, but at least then it doesn't create such a bad situation for all users.

The feature is a large design space and probably deserves its own RFC. E.g. one could think about running the test suite, or not running it, or running it in a custom way, or using custom operating systems like Windows, or maybe withholding updates of special crates as they are known to be broken but are being worked at and one wants to enable updates while this occurs...

jonhoo commented 2 years ago

Might it make sense to add a boolean ignore-lockfile field to [install] in the Cargo configuration file that defaults to true but that users can change to false? And that could potentially have its default flipped at an edition boundary?

Note that this wouldn't help publishers at all, but it would at least allow users to choose their desired behavior so that adoption can happen piecemeal.

masaeedu commented 1 year ago

Not sure if this experience report is useful to anyone, but as someone not heavily involved in the Rust ecosystem and just trying to install a useful tool via cargo install --git ..., this was very puzzling to debug.

EDIT: My bad, I didn't see the earlier comment re: "Commenting more with your experience ... does not help at this pont ..." since it was hidden. Apologies for the noise.

epage commented 9 months ago

This came up in passing in Office Hours and a case come up that seemed intriguing. For commands centered on the cargo workflow or for casual programs, the current default makes sense. For programs more generally used, like ripgrep, they have other release channels in addition to crates.io that they have to do a new release for anyways when new dependencies come out that affect users.

bjorn-ove commented 9 months ago

For commands centered on the cargo workflow or for casual programs, the current default makes sense.

What do you mean by this? The current default will sometimes cause compile errors and other issues when new versions of dependencies are brought in. Is this a good idea for non-advanced users?

RalfJung commented 9 months ago

The current default will cause compile errors

"may cause" would be more accurate. Such issues are rather rare. I can't remember the last time I ran into this.

intgr commented 9 months ago

Such issues are rather rare.

Pick one:

🤔

RalfJung commented 9 months ago

Don't just ignore the arguments mentioned upthread for why a --locked default can cause issues. That's not the way to have a proper discussion.

This is a non-trivial tradeoff, and outright dismissing the other side's arguments and non-existent isn't going to help anyone.

bjorn-ove commented 9 months ago

For me, the surprising behaviour is not really when I install from crates.io. It is when I run cargo build or cargo run on a local repository, it isn't the same as when I run cargo install --path .. It just feels counter intuitive.

epage commented 9 months ago

Sorry, that was not clear. Maybe better as "the existing trade offs apply for the cargo install side". More so my point was that the problem I was introducing is neutral for cargo plugins and casual releases but applies for all other tools.

casey commented 8 months ago

I ran into this today. The issue wasn't the build breaking, but cargo install unexpectedly downloading and building newer versions of dependencies when it should have been using cached build artifacts. I was very confused by the default behavior of cargo install.

I think it would be better if cargo install defaulted to using the lockfile, and a new flag was added that enabled the current behavior, something like cargo install --update-dependencies.

I ran into this again today, when a user reported that cargo install just was failing due to an update to a dependency which introduced a build failure. I was able to fix it for the latest version by pinning the dependency to the last working version. However, this issue persists in 95 old versions of just, and backporting the fix to all of them is infeasible, so they will continue to be broken indefinitely.

There are definitely tradeoffs here, and they've been well enumerated in this thread. I don't mean to comment again without new information, but, given the amount of issues that have been caused by not respecting the lockfile by default, I feel like the tradeoffs are strongly in favor of respecting the lockfile by default.

If you search GitHub for "cargo install --locked", you'll find 6.2k results, and if you search for "cargo install --locked" path:*.md, restricting the search to markdown files, which are probably documentation, you'll find 2k results. This suggests that crate authors document cargo install --locked as the preferred way to install their crates, as well as extensively using cargo install --locked in installation scripts. This negates much of the benefits of not respecting the lockfile by default, since it seems to be very common practice to opt-out of the default behavior.

What needs to happen to drive this issue forward? I don't want to trivialize the issue, but I feel like it would be a net positive to simply change the default back to the pre cargo 1.37 behavior.

I think that the main benefit to not respecting lockfiles is for downstream packagers, e.g., linux distros and package managers like homebrew, and that those downstream packagers are in a good position to evaluate the tradeoffs appropriately, and opt-in to not respecting the lockfile. Especially since those downstream packagers often have continuous integration pipelines that check for build failures, and that they are likely to be able to report the issue to the correct project, i.e., the broken dependency, and not the parent crate, which is where issues are currently usually reported.

plwalsh commented 8 months ago

@casey Searching the markdown usage was brilliant, and very telling. Thank you for your well-formed comment.

RalfJung commented 8 months ago

If you search GitHub for "cargo install --locked", you'll find 6.2k results, and if you search for "cargo install --locked" path:*.md, restricting the search to markdown files, which are probably documentation, you'll find 2k results. This suggests that crate authors document cargo install --locked as the preferred way to install their crates, as well as extensively using cargo install --locked in installation scripts. This negates much of the benefits of not respecting the lockfile by default, since it seems to be very common practice to opt-out of the default behavior.

It suggests that some crate authors do that. But to actually put that number into perspective we'd need some sort of baseline. Is this 5% of crate authors, 50%, 95%? We have no way of knowing.

bjorn-ove commented 8 months ago

I just noticed this blog post from August 2023 where the guidance on how to deal with Cargo.lock changed.

May be relevant

https://blog.rust-lang.org/2023/08/29/committing-lockfiles.html

taiki-e commented 8 months ago

If you search GitHub for "cargo install --locked", you'll find 6.2k results, and if you search for "cargo install --locked" path:*.md, restricting the search to markdown files, which are probably documentation, you'll find 2k results. This suggests that crate authors document cargo install --locked as the preferred way to install their crates, as well as extensively using cargo install --locked in installation scripts. This negates much of the benefits of not respecting the lockfile by default, since it seems to be very common practice to opt-out of the default behavior.

It suggests that some crate authors do that. But to actually put that number into perspective we'd need some sort of baseline. Is this 5% of crate authors, 50%, 95%? We have no way of knowing.

FWIW, all "cargo install" in markdown files is 52.2k ("cargo install --locked" is 2k): https://github.com/search?q=path%3A*.md+%22cargo+install%22&type=code

Also note that these search results miss the cases where there are other flags (or spaces) between "cargo" and "install", and "install" and "--locked". (e.g., my CLI projects recommends "cargo +stable install ... --locked")

casey commented 8 months ago

I opened a PR proposing to change cargo install to respect lockfiles by default. Copying my comment there:

Rendered

Currently, cargo install does not respect lockfiles by default, which causes breakages when dependencies release breaking but semver-compatible versions. This RFC proposes to change cargo install to respect lockfiles by default.

There's been a lot of discussion on https://github.com/rust-lang/cargo/issues/7169 without much progress, so I thought it would be productive to open an RFC proposing changing the default behavior of cargo install to respect lockfiles by default, to provide a concrete proposal that can be discussed.

Even if this RFC is ultimately rejected, it will provide some forward progress on the issue, since we can then focus on an alternative to changing the default behavior.

I think the last point is key. If the RFC is rejected, we can then all think hard about a solution to the issue aside from changing the default behavior, perhaps introducing a --unlocked flag, and making one of --locked and --unlocked mandatory.

charles-r-earp commented 8 months ago

At the very least, Cargo and crates-io could provide more of a hint to users to use --locked when the package has a lockfile.

ckcr4lyf commented 8 months ago

FWIW, I do think using locked packages should be the default behavior.

As a maintainer of some small binary packages, I know the versions of dependencies used work exactly as expected, at least (in admittedly limited) the cases I've tried / intended for them to be used.

Regradless of the fact the "real semver" should "never break" - let's be real, it is very possible, and it is a pretty bad experience if now my package can't be installed because of a dependency. If I am not mistaken, in Node.JS, you can use a shrinkwrap (similar to lockfile) to force pinned dependencies, to ensure users effectively have "reproducible builds" - including bugs if any.

Furthermore, somewhat based on Hyrums Law , maybe my program unintentionally relies on a bug in a dependency (at least viewed as a bug by the dependency's author) - and if they decide to fix it in a patch version then it could break my application.

There's also the extreme case of "protestware", e.g. author intentionally introduces a breaking change in patch version. Now sure, this would eventually get caught by cargo mods (I am not entirely sure whom), but if I want to protect my users, I would want my lockfile to be respected.

jonhoo commented 8 months ago

One important aspect of this conversation is the knock-on effects of this kind of change. While it's true that this will lead to fewer failed builds due to accidental semver violations, it will also make it more likely for semver violations to not be caught, and thus to proliferate. In other words, what we're trading off is reduced user friction in the face of semver violations (build more likely to succeed out of the box) against more outdated dependencies and lower awareness of semver violations (both instances of them and the general concept).

I think any decision needs to start with an acknowledgement of this tradeoff, and then an explicit rationale for choosing one set of benefits + costs over those of the other options.

Personally, I think this is a case where the occasional user frustration is unfortunate, but better than the alternative. I'd be sad to see the semver violation learning opportunity disappear, and I'd be sad to see users end up (unknowingly) with older dependencies than are available at install time.

chrisduerr commented 8 months ago

Detecting semver violations really shouldn't be up to your users.

ckcr4lyf commented 8 months ago

I'd be sad to see the semver violation learning opportunity disappear

I'd argue the opportunity is still there - but violations would be discovered by maintainer of a package when they do a dependency update, rather than the end user.

As an example, in a node project I had used [rolling-rate-limiter]() , and locked the version. As a maintainer, I can use npm (Node's cargo , kind of) to view outdated dependencies, and if I want, update them and release a new patch version of my project. This would automatically update semver compatible dependencies due to my explicit action .

In doing so, I discovered a breaking change in their patch version (see: https://github.com/peterkhayes/rolling-rate-limiter/issues/64) , raised the issue, and the maintainer acknowledged the error, and I think they did learn something.

epage commented 8 months ago

At the very least, Cargo and crates-io could provide more of a hint to users to use --locked when the package has a lockfile.

Packages published with bins always have a lockfile.

We do suggest --locked when a dependency is found that isn't compatible with the toolchain you are using. I think it would be a good quality of life improvement to also suggest --locked if there is a compilation error. Even better if we can limit the suggestion to non-local dependencies. I also suspect we should print the project's homepage URL on error to help connect users with where to go for support.

est31 commented 8 months ago

I still stand by my idea from four years ago of an online service that maintains a lockfile (as the default behaviour, obviously there should be overrides for both latest lockfile and --locked): https://github.com/rust-lang/cargo/issues/7169#issuecomment-614343884

It doesn't need to immediately update the crates.io maintained Cargo.lock at every single update, but if it makes an attempt, it only updates the crates.io maintained Cargo.lock if some basic checks succeeded, like cargo check. This is not a full replacement of running the testsuite or manual verification, but it's way better than what we have now where builds are breaking. Don't let perfect be the enemy of the good :).

If after an attempt to build, the build fails, it maybe can show a warning to everyone running cargo install that the last Cargo.lock update on 2023-12-03 failed so it still uses the version from 2023-11-15, with a link to the build failure log and the request to maybe contact maintainers about it.

We already run crates.io wide compilations in multiple settings like rustdoc and crater. Admittedly, unlike those two earlier methods it would build at a higher cadence, but for costs we only care about the average cadence of updates. crates that see regular downloads and recent releases can get treated differently than those with few downloads, security fixes differently from normal ones.

If the default cadence was 6 weeks, and the top 10% by downloads get 2 weeks, and the top 5% get 1 day, we'd still have an average close to 6 weeks, so roughly the same as the cost we already spend on crater, if we built every crate, but we'd only run this on the minority of binary crates.

What would also be useful would be ways for crate authors to manually re-trigger the update process, say after some fix was released, and another way to revert to the previous Cargo.lock if there was some breakage.

We can also fizzle out updates google play style to exponentially increasing numbers of users, resembling a sigmoid curve, that way if there is some other breakage, it's not immediately broken for everyone.

est31 commented 8 months ago

[a] way to revert to the previous Cargo.lock if there was some breakage.

one can make a version of this without crates.io maintained Cargo.lock: right now, binary authros are fully dependent on the dependency maintainers [sic] to release a fixed version of a crate. What if binary authors could issue (automatically expiring) version pins for a dependency of their binary crate? You can already now do something like that by depending on the dependency directly in your Cargo.toml in a way that it is pinned, but it's a bit unwieldy and requires an entire release.

chrisduerr commented 8 months ago

@est31 Having a lockfile that automatically gets updated is just the worst of both worlds. The reason lockfiles are great because it ensures the output binary behaves exactly the same, not just that the output compiles. If you updated them automatically you'd just run into more esoteric issues that are essentially the same.

If I'm uploading a binary to crates.io, I want people to run exactly what I released. I do not want any dependencies updated, even if those are "security" updates. That should be up to me.

ssokolow commented 8 months ago

If I'm uploading a binary to crates.io, I want people to run exactly what I released. I do not want any dependencies updated, even if those are "security" updates. That should be up to me.

You are aware, I hope, that there are already a lot of people who have a low opinion of distribution channels that aren't Linux distro packages because of how much responsibility they place on the upstream maintainer to respond promptly to security advisories.

(i.e. "Look at how badly Docker failed on ensuring a steady supply of security updates! Flatpak and statically linked languages like Go and Rust are evil and should not be allowed!")

...and Cargo.lock doesn't have the dodge Flatpak does, that the things which are most likely to need security updates are in the runtime, not the package.

I don't think there's a simple solution to reconcile the two conflicting needs here.

Not to mention that we don't know how places that focus on giving the downstream the tools to protect themselves (eg. lib.rs) would react to packages with stale lockfiles if something like this doesn't include a mechanism for allowing Crates.io to overrule Cargo.lock pins for packages when the maintainer is going through a bad bit and isn't in a state of mind to push an update that would have just gone through without a Cargo.lock.

We could easily wind up with packages getting tarred with a bad reputation because they don't meet some externally imposed standard for how quickly security fixes must be rolled out... heck, imposing unreasonable demands for prompt security response on hobby projects was one of the things that sparked a lot of fear in the earlier drafts of the E.U.'s new rules for safer software infrastructure.

chrisduerr commented 8 months ago

You are aware, I hope, that there are already a lot of people who have a low opinion of distribution channels that aren't Linux distro packages because of how much responsibility they place on the upstream maintainer to respond promptly to security advisories.

Yes, that includes myself. But this just makes it worse, not better. If you're asking me, then this does nothing (or close to nothing) for security but has significant impact on reliability. You can't just cargo update some random abandoned project and pretend that suddenly makes it secure and up-to-date.

And it's not like cargo has a good way of updating binaries installed through it anyway. So the people most likely to be using outdated dependencies, even if upstream has already pushed a new lockfile, are the most vulnerable. Not the ones just installing it.

We could easily wind up with packages getting tarred with a bad reputation because they don't meet some externally imposed standard for how quickly security fixes must be rolled out... heck, imposing unreasonable demands for prompt security response on hobby projects was one of the things that sparked a lot of fear in the earlier drafts of the E.U.'s new rules for safer software infrastructure.

Updating the lockfile changes almost nothing about this. Security issues are introduced in binaries regularly, not just libraries. And it's also possible that a breaking change is required to fix them. If people don't update their software, it's not safe to use, that's just the nature of the beast.

ssokolow commented 8 months ago

You can't just cargo update some random abandoned project and pretend that suddenly makes it secure and up-to-date.

As a maintainer of some projects I don't yet feel are ready to push to crates.io, I find that GitHub Dependabot is quite good at offering me dependency security fix PRs that pass all my GitHub Actions runs while only doing the kind of semver bumping that cargo install does implicitly.

(So long as the dependency in question actually offers a minor version update for Dependabot to generate a security-update PR from, of course.)

Updating the lockfile changes almost nothing about this. Security issues are introduced in binaries regularly, not just libraries. And it's also possible that a breaking change is required to fix them. If people don't update their software, it's not safe to use, that's just the nature of the beast.

That smells like a "Perfect solution or status quo. There's no such thing as an incremental improvement." argument to me.

...especially in projects like mine, where I'm generally cautious enough about how I design my things that any vulnerabilities are far more likely to be in my dependencies than in how I'm gluing them together.

In the real world, people install unmaintained software. It's just human psychology that "I need this solved now" is a more significant psychological input than "there might be a hidden flaw in this which could bite me later".

If anything, the current cargo install behaviour could be considered safer because, not only does it provide some level of protection against stale dependencies, it's more likely to turn "unmaintained" into "adding a speed-bump for users who might otherwise blindly install unmaintained software".

charles-r-earp commented 8 months ago

At the very least, Cargo and crates-io could provide more of a hint to users to use --locked when the package has a lockfile.

Packages published with bins always have a lockfile.

We do suggest --locked when a dependency is found that isn't compatible with the toolchain you are using. I think it would be a good quality of life improvement to also suggest --locked if there is a compilation error. Even better if we can limit the suggestion to non-local dependencies. I also suspect we should print the project's homepage URL on error to help connect users with where to go for support.

It is nice that cargo does include a hint to try using --locked when install fails, but I was suggesting that this be a warning upfront. And in the case of crates-io, it suggests to do cargo add or add the package to the manifest, which is of course a current issue. So it would be nice if when that was fixed, in addition to suggesting to install the crate, it could hint that the package has a lock file that can be used via --locked.

casey commented 7 months ago

Since concrete instances of issues stemming from not respecting lockfiles are well understood and documented, it would be useful to collect information on concrete instances of the benefits of respecting lockfiles by default.

It would be very helpful to know:

For either of these to be evidence that not respecting lockfiles is beneficial, it isn't enough for a crate a dependency to have a bug or security vulnerability, and the semver-compatible version that is selected by cargo install to not have the bug or security vulnerability. Crates often use only a subset of the features of dependencies, a bug or security vulnerability might exist in a dependency that cannot be encountered when using the compiled binary. So, it must be possible to encounter the bug when using the dependent.

casey commented 7 months ago

Other things which would be useful to know:

ckcr4lyf commented 6 months ago

There's also the extreme case of "protestware", e.g. author intentionally introduces a breaking change in patch version. Now sure, this would eventually get caught by cargo mods (I am not entirely sure whom), but if I want to protect my users, I would want my lockfile to be respected.

Commenting on this point a bit more with the whole xz backdoor thing - if my binary package relies on a lib that got social engineered into someone being able to introduce a backdoor (and publish it as patch version), the current default cargo behavior would end up "upgrading" to the backdoored file if one of my users did cargo install without --locked.

I think this is another good case for locked to be the default, if the author of a project has a lockfile in place.

ssokolow commented 6 months ago

I still think studies need to be done to calculate how often the current behaviour pushes out security updates faster than the dev updates their lock files vs. how often it prevents an exploit from being rolled out.

The xz backdoor thing is certainly dramatic, but we don't want to succumb to what I'll call argumentum ad dramatum for now. Reasoning like that leads to people being so scared of dying in an airplane that they discount how you're much more likely to die in the car on the way to the airport.

Darkcashi86 commented 5 months ago

Como encuentro una dirección

olson-dan commented 2 months ago

I saw that there was a related RFC but it seems this issue is still somewhat active. I ran into significant problems with this trying to use cargo install from a docker container build step against an older Rust version.

It is my belief that the current behavior is broken. Having done some small spelunking it appeared that it was only chosen back in 1.39 because there was a desire not to change away from the previous behavior. At this point it's clearly correct to invert the default and provide an --unlocked option, but since this is controversial I want to propose an alternative:

cargo install should default to the --locked behavior when specifying a specific version with the --version argument. This gives users just running it from command line to install software access to both the security fixes and semver breakages they know and love, while giving people trying to install an older version for any reason some slim hope of a compile that succeeds.

I want to briefly mention that the problem of building an older version is compounded by MSRV which is not respected by cargo when choosing the specific dependency to build against. That's a separate issue. I only mention it because the specific task of building anything against an older Rust version is a major pain point with the tooling right now, and making this small change to cargo install will go a long way towards helping.