rust-lang / cargo

The Rust package manager
https://doc.rust-lang.org/cargo
Apache License 2.0
12.79k stars 2.42k forks source link

`Manifest::dependencies()` always have crates.io SourceIds even if manifest does not #13369

Open nc7s opened 9 months ago

nc7s commented 9 months ago

Problem

A minimal example:

fn main() -> anyhow::Result<()> {
    use cargo::{
        core::{manifest::EitherManifest, SourceId},
        util::toml::read_manifest,
        Config,
    };
    let config = Config::default()?;
    let source_id = match config.get::<Option<String>>("source.crates-io.replace-with") {
        Ok(Some(replacement)) => config
            .get::<Option<String>>(&format!("source.{replacement}.registry"))
            .ok()
            .flatten(),
        _ => None,
    }
    .map(|registry| SourceId::from_url(&registry))
    .unwrap_or_else(|| SourceId::crates_io_maybe_sparse_http(&config))?;

    let (EitherManifest::Real(manifest), _) = read_manifest(
        &std::path::PathBuf::from("Cargo.toml").canonicalize()?,
        source_id,
        &config,
    )?
    else {
        panic!("virtual manifest");
    };
    dbg!(
        manifest.summary().source_id().url().as_str(),
        manifest.dependencies()[0].source_id().url().as_str()
    );
    Ok(())
}

Then, with source replacement in place:

$ cargo run
[src/main.rs:26] manifest.summary().source_id().url().as_str() = "sparse+https://mirrors.sjtug.sjtu.edu.cn/crates.io-index/"
[src/main.rs:26] manifest.dependencies()[0].source_id().url().as_str() = "https://github.com/rust-lang/crates.io-index"

Steps

No response

Possible Solution(s)

No response

Notes

No response

Version

cargo 1.75.0 (1d8b05cdd 2023-11-20)
release: 1.75.0
commit-hash: 1d8b05cdd1287c64467306cf3ca2c8ac60c11eb0
commit-date: 2023-11-20
host: aarch64-apple-darwin
os: Mac OS 14.2.1 [64-bit]
epage commented 9 months ago

So not touched this part too much but I believe source replacement is a feature within the resolver, rather than being a part of loading the manifest.

nc7s commented 9 months ago

So not touched this part too much but I believe source replacement is a feature within the resolver, rather than being a part of loading the manifest.

Sure, that's why I hand-wrote the loading part. But the Manifest is initialized with a replaced SourceId (or non-"standard" for that matter), shouldn't its Dependencys follow that?

weihanglo commented 9 months ago

IIUC, the actual SourceId of a Dependency will get resolved during querying the dependencies, which matches what epage just said.

(In PackageRegistry, query -> ensure_load -> load -> self.source_config.load)

https://github.com/rust-lang/cargo/blob/3e1a2ddc3ea274ec26631bb6922ff4b255b90b08/src/cargo/core/registry.rs#L638-L639

nc7s commented 9 months ago

Though kinda inconsistent? For the Manifest to be (parsed and) initialized a SourceId must be provided, then suddenly Dependencys are "resolved", not to the same provided but something else.

To be consistent, either the Manifest should be "resolved" without user provided SourceId, or the Dependency should inherit that from its originating Manifest.

epage commented 9 months ago

Something to keep in mind is that the primary target of cargo-the-lib is cargo-the-bin. It isn't specifically designed as a general use library and you will likely run into impedance mismatches with how you are wanting to use it, like that.

Generally, we recommend people using cargo metadata (via cargo_metadata crate) to get package data. If you want low-level package information, we provide cargo-util-schemas.

weihanglo commented 9 months ago

Though kinda inconsistent?

I don't have an answer for this. Letting Dependencys get resolved with replaced source implies Manifest will be aware of more settings in .cargo/config.toml, which leads to non-trivial change in Cargo. That being said, the use case of yours might be a good data point of https://github.com/rust-lang/cargo/issues/4614, if you can share what you want to solve with such a library.

nc7s commented 9 months ago

Thanks for the advice on cargo_metadata and friends, will explore using them.

When opening this issue, my thoughts were more on a correctness (and consistency?) perspective. Though with the primary target of "the lib" being "the bin", it's probably not worth it to dig further down.

Letting Dependencys get resolved with replaced source implies Manifest will be aware of more settings in .cargo/config.toml

Unfamiliar with the internals, my assumption that Dependency could be resolved without further reading settings on disk, seems wrong.

the use case of yours might be a good data point of https://github.com/rust-lang/cargo/issues/4614

It's more like a nuance - I use source replacement, a tool I use didn't respect that, I wrote some code for that then found out a side use case still doesn't, and tracing that down led to this issue. Making Manifest standalone doesn't seem useful here.

The tool is debcargo, used to package Rust crates in Debian. Its primary use case, packaging, involves downloading the target crate into registry cache, extracting its .crate, and setting it up to packaging standards. Its inability to follow source replacement was "fixed" by feeding the proper SourceId. The side use case, listing a proper build order for dependencies, requires to download also them. This is where Manifest::dependencies() comes into play and ultimately introduces the reported inconsistency: they are downloaded from replaced source and into its cache, but paths built from Dependencys point to the default, i.e. crates-io, which don't exist.

It seems possible to work around this problem with a few lines to manually set "correct" SourceId for the Dependencys.