mozilla / cargo-vet

supply-chain security for Rust
Apache License 2.0

Regenerating imports while not connected to the internet #494

Closed Guswall closed 1 year ago

Guswall commented 1 year ago

Hi! I have an issue when trying to regenerate imports. I get two error messages: (1) a failure to find the cargo registry, and (2) a DNS error from trying to download a crate from crates.io.

We work offline with our own crates registry specified to replace crates-io in our cargo config file.

I don’t have a thorough understanding of cargo vet, but from quickly looking through the code, it seems that when files are missing from the cache, cargo vet uses crates.io as a hardcoded URL to download crates from. We generally use the locked/frozen flags when using cargo vet, but those flags are not available when regenerating imports. Is there some way to specify the registry cargo vet should use when regenerating imports, or to use the local cargo cache instead of trying to download any missing crates?

I’ve tried to fetch the crates vet wants to download, but to no avail. It might be that the bigger issue is the fact that the cargo registry was not found and that the crates were not found in cache, but I was not able to glean the problem from the error message.

Any help is appreciated!

mystor commented 1 year ago

cargo-vet always attempts to look things up on the real crates.io, including when downloading crates and reading metadata, no matter what you have configured as a replacement in your config file. This is done in order to ensure that the crates you are auditing are the real crates, to keep audits shareable and relevant to everyone, rather than just for your custom registry.

Off the top of my head, I don't think that cargo vet regenerate imports should actually be trying to download crates.io sources; instead, it seems more likely to me that it's trying to hit the crates.io API in order to fetch publisher information for imported wildcard audits and trust entries. The URLs which would be failing to load in that case would look something like https://crates.io/api/v1/crates/cargo-vet, whereas a download would have a URL more like https://crates.io/api/v1/crates/cargo-vet/1.0.0/download, and should generally only be happening when trying to compute audit suggestions.
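To tell the two kinds of failing request apart when reading an error log, the URL shapes mentioned above can be checked mechanically. The following is only a minimal sketch: `RequestKind` and `classify` are hypothetical names invented for illustration, not part of cargo-vet, and the matching relies solely on the two example URL shapes quoted in this comment.

```rust
/// Kind of crates.io request, inferred purely from the shape of the URL.
/// (Hypothetical helper type for illustration; not part of cargo-vet.)
#[derive(Debug, PartialEq)]
enum RequestKind {
    /// `/api/v1/crates/<name>`: crate metadata, e.g. publisher information.
    Metadata,
    /// `/api/v1/crates/<name>/<version>/download`: crate source download.
    Download,
    /// Anything that matches neither shape.
    Other,
}

/// Classify a URL by comparing it against the two shapes quoted in the thread.
fn classify(url: &str) -> RequestKind {
    const PREFIX: &str = "https://crates.io/api/v1/crates/";
    let Some(rest) = url.strip_prefix(PREFIX) else {
        return RequestKind::Other;
    };
    // Split the remaining path into segments, ignoring a trailing slash.
    let parts: Vec<&str> = rest.trim_end_matches('/').split('/').collect();
    match parts.as_slice() {
        [name] if !name.is_empty() => RequestKind::Metadata,
        [name, version, "download"] if !name.is_empty() && !version.is_empty() => {
            RequestKind::Download
        }
        _ => RequestKind::Other,
    }
}

fn main() {
    let urls = [
        "https://crates.io/api/v1/crates/cargo-vet",
        "https://crates.io/api/v1/crates/cargo-vet/1.0.0/download",
    ];
    for url in urls {
        println!("{url} -> {:?}", classify(url));
    }
}
```

If the failing URL in the error message classifies as `Metadata`, the request is most likely the publisher lookup described above rather than a source download.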

cargo-vet is not designed to work against any registry other than the real crates.io, in order to keep audits shareable, so the easiest way forward is probably to use it when connected to the internet for local modifications, adding audits, and updating imports, while continuing to run with --locked on CI which doesn't require any internet access. Some features, like wildcard audits and trusted entries, inherently require internet access when being updated.

Guswall commented 1 year ago

> Off the top of my head, I don't think that cargo vet regenerate imports should actually be trying to download crates.io sources; instead, it seems more likely to me that it's trying to hit the crates.io API in order to fetch publisher information for imported wildcard audits and trust entries.

Ah, that makes sense.

> Some features, like wildcard audits and trusted entries, inherently require internet access when being updated.

We have some ideas for getting around this, potentially by just manually editing the file we are trying to import from.

Thanks for your help!