rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0
12.58k stars 2.39k forks source link

"unable to get packages from source" due to SourceId hash collision #12233

Open jmpesp opened 1 year ago

jmpesp commented 1 year ago

Problem

Attempting to compile https://github.com/jmpesp/omicron/tree/update_crucible_and_propolis results in

$ cargo build
error: failed to download `internal-dns v0.1.0 (https://github.com/oxidecomputer/omicron?branch=main#bd6c6280)`

Caused by:
  unable to get packages from source

Caused by:
  failed to find internal-dns v0.1.0 (https://github.com/oxidecomputer/omicron?branch=main#bd6c6280) in path source
note: this is an unexpected cargo internal error
note: we would appreciate a bug report: https://github.com/rust-lang/cargo/issues/
note: cargo 1.68.2 (6feb7c9cf 2023-03-26)

Steps

This happens every time I cargo build the linked branch.

Possible Solution(s)

Adding the following trace:

diff --git a/src/cargo/core/package.rs b/src/cargo/core/package.rs
index 40ba9cdf8..d961a347c 100644
--- a/src/cargo/core/package.rs
+++ b/src/cargo/core/package.rs
@@ -688,6 +688,7 @@ impl<'a, 'cfg> Downloads<'a, 'cfg> {
         let source = sources
             .get_mut(id.source_id())
             .ok_or_else(|| internal(format!("couldn't find source for `{}`", id)))?;
+        log::trace!("downloads::start_inner for {:?}: input {:?} selected source {:?}", id, id.source_id(), source.source_id());
         let pkg = source
             .download(id)
             .with_context(|| "unable to get packages from source")?;

output the following related message for internal-dns just before the unexpected cargo error (indenting added by me):

[2023-06-05T21:13:02Z TRACE cargo::core::package] downloads::start_inner

for PackageId { name: "internal-dns", version: "0.1.0", source: "https://github.com/oxidecomputer/omicron?branch=main#bd6c6280" }:

input SourceId { inner: SourceIdInner { url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("github.com")), port: None, path: "/oxidecomputer/omicron", query: None, fragment: None }, canonical_url: CanonicalUrl(Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("github.com")), port: None, path: "/oxidecomputer/omicron", query: None, fragment: None }), kind: Git(Branch("main")), precise: Some("bd6c62807fbb2ea4fdae0b11c301124936ea41a2"), name: None, alt_registry_key: None } }

selected source SourceId { inner: SourceIdInner { url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("github.com")), port: None, path: "/oxidecomputer/omicron", query: None, fragment: None }, canonical_url: CanonicalUrl(Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("github.com")), port: None, path: "/oxidecomputer/omicron", query: None, fragment: None }), kind: Git(Branch("main")), precise: Some("7f772b32a0cd02dac075669cb2ece41d1cc1ddf2"), name: None, alt_registry_key: None } }

cargo can't find the internal-dns package in 7f772b32a0cd02dac075669cb2ece41d1cc1ddf2 because it didn't exist there! It was added in a later commit.

Strangely, it's searching a checkout of 7f772b32a0cd02dac075669cb2ece41d1cc1ddf2 for the corresponding package source https://github.com/oxidecomputer/omicron?branch=main#bd6c6280.

sources is backed by the following HashMap:

/// A [`HashMap`] of [`SourceId`] to `Box<Source>`.
#[derive(Default)]
pub struct SourceMap<'src> {
    map: HashMap<SourceId, Box<dyn Source + 'src>>,
}

I think this is caused by the Hash and Eq impls for SourceId not including the precise field:

https://github.com/rust-lang/cargo/blob/383a68e7948520f4ccb355b027ff14fe6a7bf248/src/cargo/core/source/source_id.rs#L499-L500

https://github.com/rust-lang/cargo/blob/383a68e7948520f4ccb355b027ff14fe6a7bf248/src/cargo/core/source/source_id.rs#L592-L600

This would cause inserts of sources with different Omicron revs to clobber each other.

The following test fails:

diff --git a/src/cargo/core/source/source_id.rs b/src/cargo/core/source/source_id.rs
index 4064364d5..cb96b20dc 100644
--- a/src/cargo/core/source/source_id.rs
+++ b/src/cargo/core/source/source_id.rs
@@ -905,6 +905,14 @@ mod tests {
         assert_eq!(formatted, "sparse+https://my-crates.io/");
         assert_eq!(source_id, deserialized);
     }
+
+    #[test]
+    fn precise_comparison() {
+        let left = SourceId::from_url("git+https://github.com/oxidecomputer/omicron?branch=main#bd6c6280").unwrap();
+        let right = SourceId::from_url("git+https://github.com/oxidecomputer/omicron?branch=main#7f772b32").unwrap();
+
+        assert_ne!(left, right);
+    }
 }

 /// Check if `url` equals to the overridden crates.io URL.

with

running 1 test
thread 'core::source::source_id::tests::precise_comparison' panicked at 'assertion failed: `(left != right)`
  left: `SourceId { inner: SourceIdInner { url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("github.com")), port: None, path: "/oxidecomputer/omicron", query: None, fragment: None }, canonical_url: CanonicalUrl(Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("github.com")), port: None, path: "/oxidecomputer/omicron", query: None, fragment: None }), kind: Git(Branch("main")), precise: Some("bd6c6280"), name: None, alt_registry_key: None } }`,
 right: `SourceId { inner: SourceIdInner { url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("github.com")), port: None, path: "/oxidecomputer/omicron", query: None, fragment: None }, canonical_url: CanonicalUrl(Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("github.com")), port: None, path: "/oxidecomputer/omicron", query: None, fragment: None }), kind: Git(Branch("main")), precise: Some("7f772b32"), name: None, alt_registry_key: None } }`', src/cargo/core/source/source_id.rs:914:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
test core::source::source_id::tests::precise_comparison ... FAILED

failures:

failures:
    core::source::source_id::tests::precise_comparison

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 58 filtered out; finished in 0.00s

Notes

No response

Version

$ cargo version --verbose
cargo 1.68.2 (6feb7c9cf 2023-03-26)
release: 1.68.2
commit-hash: 6feb7c9cfc0c5604732dba75e4c3b2dbea38e8d8
commit-date: 2023-03-26
host: x86_64-unknown-linux-gnu
libgit2: 1.5.0 (sys:0.16.0 vendored)
libcurl: 7.86.0-DEV (sys:0.4.59+curl-7.86.0 vendored ssl:OpenSSL/1.1.1q)
os: Ubuntu 22.04 (jammy) [64-bit]
jmpesp commented 1 year ago

My apologies, I've pushed commits to that branch since I opened this issue. The commit that fails to build is https://github.com/jmpesp/omicron/commit/8155cb2ad393a0dc40d887e16abe66d6519e42aa.

weihanglo commented 1 year ago

Thanks for the detailed report and bug tracing! And sorry for the extremely late reply.

Unfortunately I don't have any helpful answer here 😞. Here is another issue https://github.com/rust-lang/cargo/issues/7497 proposing Cargo to include actual commit hash when referencing a Git dependency, but from another angle — they want Cargo knows different refs are actually on the same commit. If that helps.