Open Gaelan opened 3 years ago
Omit the `Cargo.lock` from `.crate` files for crates that only contain libraries.
This is already how Cargo works. The lock is only included if the package has a binary or example. I suspect the projects you looked at might have had some examples?
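As a rough way to check this for a given package, the sketch below looks for a `Cargo.lock` inside a downloaded `.crate` file. It assumes the file is a gzip-compressed tarball whose entries sit under a single `name-version/` directory; `has_lockfile` is an illustrative name, not part of Cargo.

```python
import tarfile


def has_lockfile(crate_path):
    """Return True if the .crate archive contains a top-level Cargo.lock.

    Assumption: entries are laid out as "<name>-<version>/<path>", so the
    package's own lock file is exactly two path components deep.
    """
    with tarfile.open(crate_path, "r:gz") as tar:
        for name in tar.getnames():
            parts = name.split("/")
            if len(parts) == 2 and parts[1] == "Cargo.lock":
                return True
    return False
```

Calling `has_lockfile` on a library-only crate and on one that ships examples should show the difference described above.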
Aha, you're right: `rand_core.crate`, which has no examples, is reproducible. (Interestingly, `libc.crate` also doesn't have examples, but isn't reproducible because `vxworks/mod.rs` is marked as executable in the git repository but not in the official crate file. Weird.)
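A mismatch like the executable bit is easy to miss when only comparing hashes. The following is a minimal sketch for listing files whose permission bits differ between two `.crate` archives, assuming both are gzip-compressed tarballs; `member_modes` and `mode_differences` are hypothetical helper names.

```python
import tarfile


def member_modes(crate_path):
    """Map each regular file in the archive to its permission bits."""
    with tarfile.open(crate_path, "r:gz") as tar:
        return {m.name: m.mode & 0o777 for m in tar.getmembers() if m.isfile()}


def mode_differences(crate_a, crate_b):
    """Return {name: (mode_a, mode_b)} for files present in both archives
    whose permission bits differ."""
    modes_a = member_modes(crate_a)
    modes_b = member_modes(crate_b)
    return {
        name: (modes_a[name], modes_b[name])
        for name in sorted(modes_a.keys() & modes_b.keys())
        if modes_a[name] != modes_b[name]
    }
```

Printing the modes with `oct()` makes a `0o644` versus `0o755` difference easy to spot.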
That still leaves the question of how to handle this. The current practice of shipping non-version-controlled `Cargo.lock` files for libraries with examples isn't great: you might get lucky and get something meaningful if the developer did their packaging from the same checkout they'd been working in, but that goes out the window if the packaging is done on another machine (or in CI). I think the options now are:

- `Cargo.lock`
- `Cargo.lock`, even if examples are present, if it is listed in `.gitignore`
(Apologies if I've filed this in the wrong place; happy to move it wherever it makes sense.)
There's been some interest lately in ensuring that code uploaded to crates.io is the same as the code in the repository on GitHub.
Aside on why this is worth doing
Most people who are interested in a crate's code will look at its GitHub (or similar) repository, not the code uploaded to crates.io (citation needed, but I know I do this and I assume most others do too). This means that the code that's actually running is looked at by comparatively few people, so authors of malicious crates can make their vulnerabilities less likely to be discovered. This has happened in practice with [the `event-stream` NPM package](https://cnorthwood.medium.com/todays-javascript-trash-fire-and-pile-on-f3efcf8ac8c7). Comparing the published crate to the GitHub source ensures that any malicious code must be visible when people go looking for it.

This doesn't need to be done by Cargo, of course, but Cargo's current method of generating crate files makes it difficult for any tool to do this.
Ideally, `.crate` files would be bit-for-bit reproducible. If that were the case, this would be as simple as downloading the `.crate` file, cloning the source, running `cargo package`, and comparing hashes. #8864 made it most of the way there, but it fails in practice (with at least the crates I tested, the latest versions of `hyper` and `rand`), because the `Cargo.lock` file in the uploaded crate differs from the newly generated one. The crates follow the official guidance to omit the file (because they're libraries), so my Cargo generates a new one on the fly, including any new versions of dependencies released since the crate was uploaded. Therefore, there's a mismatch.

I see a few solutions here:
- Omit the `Cargo.lock` from `.crate` files for crates that only contain libraries. I assume this file is never actually read unless the crate in question was passed directly to `cargo install`? If so, this seems like the best way forward.
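The comparison step described above (hash the published `.crate`, hash the one produced by `cargo package`, and see what differs) can be sketched as follows. This is only an illustration under stated assumptions: both files are already on disk, both are gzip-compressed tarballs, and the function names are made up for this example. It does not download anything or run Cargo.

```python
import hashlib
import tarfile


def sha256_file(path):
    """SHA-256 of a file on disk, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def member_digests(crate_path):
    """Map each regular file inside the archive to the SHA-256 of its contents."""
    digests = {}
    with tarfile.open(crate_path, "r:gz") as tar:
        for m in tar.getmembers():
            if m.isfile():
                data = tar.extractfile(m).read()
                digests[m.name] = hashlib.sha256(data).hexdigest()
    return digests


def compare_crates(published, rebuilt):
    """Return (identical, differing_names).

    identical is True only when the two archives are byte-for-byte equal.
    differing_names lists files that are missing from one side or whose
    contents differ; it can be empty even when identical is False, for
    example if only metadata such as file modes differs.
    """
    if sha256_file(published) == sha256_file(rebuilt):
        return True, []
    a = member_digests(published)
    b = member_digests(rebuilt)
    differing = sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))
    return False, differing
```

For the crates mentioned above, one would expect `differing_names` to contain only the `Cargo.lock` entry, which is what makes omitting it for library-only crates attractive.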