rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0
12.59k stars 2.39k forks source link

Cargo package silently ignores patch entries when additional binaries are present #11181

Open leighmcculloch opened 1 year ago

leighmcculloch commented 1 year ago

Problem

Running cargo package --no-verify successfully packages crates in a workspace using the [patch.crates-io] entries when validating the Cargo.toml file.

Running the same command fails in a workspace that has a crate with [[bin]] in its Cargo.toml file, or an additional binary in a src/bin/... file because the [patch] entries get silently ignored.

Steps

  1. git clone https://github.com/stellar/rs-soroban-env

  2. cd rs-soroban-env

  3. git fetch origin pull/530/head:pr530

  4. git checkout pr530

  5. cargo package --no-verify -p soroban-env-host # will run without contacting crates.io.

    ❯ cargo package --no-verify -p soroban-env-host
    Packaging soroban-env-host v0.0.6 (/Users/leighmcculloch/Code/rs-soroban-env/soroban-env-host)
  6. git checkout HEAD~1 # go back one commit

  7. cargo package --no-verify -p soroban-env-host # will fail after contacting crates.io and not finding a workspace crate dependency not published.

    ❯ cargo package --no-verify -p soroban-env-host
      Packaging soroban-env-host v0.0.6 (/Users/leighmcculloch/Code/rs-soroban-env/soroban-env-host)
       Updating crates.io index
    error: failed to prepare local package for uploading
    
    Caused by:
     failed to select a version for the requirement `soroban-env-common = "^0.0.6"`
     candidate versions found which didn't match: 0.0.5, 0.0.4, 0.0.3, ...
     location searched: crates.io index
     required by package `soroban-env-host v0.0.6 (/Users/leighmcculloch/Code/rs-soroban-env/soroban-env-host)`

The only difference between the two commits is the removal of the additional binary.

Possible Solution(s)

This problem seems similar to:

Cargo package should use the patch entries when validating the toml regardless of whether there are [[bin]] or additional binaries present.

Notes

This is an important capability because it allows a workspace to package its crates that have interdependencies without those crates being published, by allowing those crates to specify [patch.crates-io] path entries.

Version

cargo 1.64.0 (387270bc7 2022-09-16)
release: 1.64.0
commit-hash: 387270bc7f446d17869c7f208207c73231d6a252
commit-date: 2022-09-16
host: aarch64-apple-darwin
libgit2: 1.4.2 (sys:0.14.2 vendored)
libcurl: 7.79.1 (sys:0.4.55+curl-7.83.1 system ssl:(SecureTransport) LibreSSL/3.3.6)
os: Mac OS 12.5.1 [64-bit]
leighmcculloch commented 1 year ago

Note that the reproductions steps above depend on soroban-env-common v0.0.6 not being published, but I am actually just about to publish it. I can circle back later and update the reproduction steps with a different repo demonstrating the same problem.

ehuss commented 1 year ago

Cargo package should use the patch entries when validating the toml regardless of whether there are [[bin]] or additional binaries present.

Unfortunately I think the intended behavior is the opposite of this. When generating the Cargo.lock for publishing, it needs to generate one that will work for users fetching from crates.io. Cargo intentionally removes [patch] from Cargo.toml. Without the patch, if a dependency points to a package that doesn't exist, then it will fail.

Without [[bin]] I believe cargo publish would fail since crates.io would reject the missing dependency.

It does seem like it should maybe validate that the upload will succeed even without a [[bin]] even if --no-verify is specified. I'm not certain because --no-verify does imply that cargo shouldn't really be checking if the package is actually valid. If there is a [[bin]], I don't think there is an alternative because it has to generate a valid Cargo.lock.

It's not clear to me what your underlying use case is. If you could maybe say more about why you are running cargo package in the first place, we could better understand what your workflow is and perhaps direct towards something different.

leighmcculloch commented 1 year ago

As I understand there are two places that the Cargo.toml comes into play when packaging:

  1. When initially loading/validating the tool file. This happens before the package is created.
  2. When verifying the package. This happens after the package is created with its [patch] entries removed.

Is my understanding accurate of these ☝🏻?

I believe that the [patch] entries should only be omitted for that second step, where the completely built package is being verified that it will build once published.

I believe that for that first step Cargo should be behaving like it does for any other command: It should be reading the manifest to learn about the package that it is going to take into the job it has to do.


About my specific use case:

Tl;dr: I package a group of crates that live in a workspace and have interdependencies on each other. I package them first, then publish them later.

The long version is that I package them first before publishing as part of building a directory-source to use as source-replacement. Once the directory-source is constructed I use it as source-replacement for crates.io and run package verification. The script/workflow is here: https://github.com/stellar/actions/blob/main/.github/workflows/rust-publish-dry-run.yml. I do this to verify interdependent workspace crates before publishing.

ehuss commented 1 year ago

Is my understanding accurate of these ☝🏻?

There are a few steps in-between. https://doc.rust-lang.org/cargo/commands/cargo-package.html#description has a user-level description, but in general it is:

  1. Load the workspace (like any other command), which involves parsing all the Cargo.toml files in the workspace. However, this only does basic validation (it does nothing with dependencies and patches).
  2. Rebuilds Cargo.toml to strip out things like [patch].
  3. If there are any binaries, it runs the resolver against the stripped-down Cargo.toml (using the existing Cargo.lock if it exists) to build a stripped-down Cargo.lock that contains the entries only needed for that package.
  4. Creates the .crate file.
  5. Extracts the .crate file and tries to verify that it can build. This uses the stripped-down Cargo.toml without patches (and the rebuilt Cargo.lock if there are bins).

Cargo has to be able to run the resolver to generate the Cargo.lock file, and that must run against the stripped-down Cargo.toml without patches so that the Cargo.lock is correct for publishing.

leighmcculloch commented 1 year ago

If there are any binaries, it runs the resolver against the stripped-down Cargo.toml (using the existing Cargo.lock if it exists) to build a stripped-down Cargo.lock that contains the entries only needed for that package.

Does this only apply if the binaries are additional binaries? Or also if there only one target and it's a binary? Because I don't see the problem for crates that only have a single binary as output. I only see the issue with crates where the binaries are in addition to a lib.

using the existing Cargo.lock if it exists

I'm seeing the problem behavior even when a Cargo.lock exists and doesn't need recreating.

ehuss commented 1 year ago

Does this only apply if the binaries are additional binaries?

It is added if there are any binaries (or example binaries).

Because I don't see the problem for crates that only have a single binary as output.

It shouldn't matter whether or not there is a library. If you can put together a reproduction that exhibits that difference, I can take a look.

I'm seeing the problem behavior even when a Cargo.lock exists and doesn't need recreating.

Cargo doesn't know whether or not it needs updating.