Firstyear / obs-service-cargo

OBS Source Service and utilities for Rust software packaging
Mozilla Public License 2.0

Vendor archives should be idempotent between service runs #64

Closed. marcosbc closed this issue 4 months ago

marcosbc commented 7 months ago

We are observing that the tarballs generated by obs-service-cargo are not idempotent: the archives are not bit-identical between runs even when the file contents are unchanged, so their checksums differ. Idempotent archives would be useful to us, because re-running the source service on identical file contents would then not cause the file to be stored again in our repositories.

This also happens when using other compression formats (we have tested xz and gz).

For example, executing this service twice gives different results even if the file contents of the resulting archive are the same:

In this example we used the tarsum script (you can find it here) to calculate the checksum of each individual file inside the archive. As you can see, the checksums are identical in both cases, so the actual contents of the archives are the same.
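The observation above (same per-file checksums, different archive checksum) can be reproduced in isolation. The following is only a minimal sketch in Python: it is not the tarsum script, whose implementation is not shown in this thread, and the helper names `build_archive` and `member_digests` as well as the file names are made up for illustration. It builds two in-memory tar archives with identical contents but different modification times, then compares them both ways.

```python
import hashlib
import io
import tarfile


def build_archive(files, mtime):
    """Build an uncompressed tar in memory; `files` maps name -> bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime  # the only thing that differs between the two runs
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def member_digests(archive):
    """Return {member name: SHA-256 of its contents}, ignoring all metadata."""
    digests = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                digests[member.name] = hashlib.sha256(data).hexdigest()
    return digests


# Hypothetical vendored files, used only for this demonstration.
files = {
    "vendor/foo/src/lib.rs": b"pub fn foo() {}\n",
    "vendor/foo/Cargo.toml": b"[package]\n",
}
first = build_archive(files, mtime=1_700_000_000)
second = build_archive(files, mtime=1_700_000_060)

same_contents = member_digests(first) == member_digests(second)
same_archive = hashlib.sha256(first).digest() == hashlib.sha256(second).digest()
print(same_contents, same_archive)  # prints "True False"
```

The per-file digests match because they cover only file contents, while the archive digest also covers the tar headers, which record the modification time of every member.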

Note that in other plugins such as obs-service-node_modules, this does not seem to happen since a re-execution generates bit-identical archives.

uncomfyhalomacro commented 7 months ago

I wonder what caused the difference in checksums. cc @Firstyear do you have any ideas for what is causing this?

Firstyear commented 7 months ago

@marcosbc This seems like you have a very specific use case here - we don't have the time to investigate such a deep, complex, and frankly trivial issue. In this case I'm sorry to say but "PR's welcome" if you want to investigate and resolve this in this project or more likely, one of the dependencies we rely on.

JanZerebecki commented 4 months ago

You can debug this with diffoscope. Most likely the archive includes the mtime, user ID, group ID, or something similar; these should be set to 0. (For the mtime, the commit date of the source's git HEAD can be used instead of 0, if it is available.)
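As an illustration of the suggestion above, here is a minimal Python sketch of an archive writer that zeroes the varying metadata. It is not how obs-service-cargo is implemented; the function names `normalize` and `deterministic_targz` and the `vendor` archive prefix are assumptions made for this example. It also passes `mtime=0` to the gzip writer, because the gzip header carries its own timestamp field in addition to the per-file timestamps inside the tar.

```python
import gzip
import hashlib
import io
import os
import tarfile
import tempfile


def normalize(info):
    """tarfile filter: drop metadata that varies between runs or machines."""
    info.mtime = 0  # or a fixed source date, such as the HEAD commit time
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    return info


def deterministic_targz(src_dir, arcname="vendor"):
    """Return .tar.gz bytes that depend only on names, modes and contents."""
    tar_buf = io.BytesIO()
    # On Python 3.7+, tar.add() recurses into directories in sorted order,
    # so member order does not depend on filesystem listing order.
    with tarfile.open(fileobj=tar_buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        tar.add(src_dir, arcname=arcname, filter=normalize)
    out = io.BytesIO()
    # mtime=0 fixes the timestamp stored in the gzip header itself.
    with gzip.GzipFile(fileobj=out, mode="wb", mtime=0) as gz:
        gz.write(tar_buf.getvalue())
    return out.getvalue()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lib.rs")
    with open(path, "wb") as fh:
        fh.write(b"pub fn foo() {}\n")
    first = deterministic_targz(tmp)
    # Simulate a later run in which the file was touched but not changed.
    os.utime(path, (1_700_000_000, 1_700_000_000))
    second = deterministic_targz(tmp)

identical = hashlib.sha256(first).digest() == hashlib.sha256(second).digest()
print(identical)  # prints "True"
```

File modes are deliberately left untouched in this sketch, so archives built under different umasks could still differ; a stricter variant would normalize `info.mode` as well.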

This is necessary for reproducible builds (https://reproducible-builds.org/), and in the future it will be mandatory for openSUSE and SUSE distributions.

Firstyear commented 4 months ago

As previously stated, reproducibility has limited and questionable value, so we won't be spending our time on this. PR's welcome.

If such a policy is to be introduced at SUSE/OpenSUSE I would like to know who to contact about it.

Firstyear commented 4 months ago

And of course, as always. http://web.archive.org/web/20230517095028/https://blog.cmpxchg8b.com/2020/07/you-dont-need-reproducible-builds.html

Reproducible builds and why you don't need them, from computer security experts.

JanZerebecki commented 4 months ago

I'm the one responsible for the reproducible-by-default policy. You can contact me in this issue tracker, and if you tell me what you want to discuss I can suggest a more topical venue.

I thought Tavis Ormandy was writing that as sarcasm, but anyway he retracted it and it was also widely debunked. But let's ignore that for a moment. The archive copy you linked suggests an alternative solution to the security problem: "building the source code" "to get a trusted binary without reproducible builds". That means not sending any intermediary artifact but only the original source, which in the much smaller context of a source service might even work. Will you implement that then?

Firstyear commented 4 months ago

That we shouldn't waste time on reproducible builds that have no value.

Please cite where he retracted it.

I am extremely overworked as is, and have no time to spend on this. PR's welcome.

Firstyear commented 4 months ago

Please also note that if you PR support for reproducible builds, they will not be supported outside "best effort" and if we happen to change something for a new feature or development that breaks them, then we will prioritise the feature/improvement. Reproducibility is at the absolute bottom of our list of concerns.

JanZerebecki commented 4 months ago

The link you provided is 404 https://blog.cmpxchg8b.com/2020/07/you-dont-need-reproducible-builds.html but other posts there exist just fine, the burden of proof that Tavis still claims this is on you.

You are saying "reproducible builds that have no value", but you do not even provide an argument for that.

You didn't answer if you would implement the suggested solution in the article you linked, where the article claimed a solution is needed.

Do I understand you correctly in that this project will not provide any supply chain security nor a solution to the security problem of compromised builds?

msirringhaus commented 4 months ago

FWIW, it seems that cargo vendor itself touches all vendored files on each call (this also happens with cargo vendor --offline). This service always calls cargo vendor, I think, so the resulting archives will have a different hash each time. And yes, uid and gid are also set to those of the script caller. So I'm guessing this service would need to set all of those to 0 before compressing. I may take a look tomorrow to see whether this can be done easily.

Firstyear commented 4 months ago

> The link you provided is 404 https://blog.cmpxchg8b.com/2020/07/you-dont-need-reproducible-builds.html but other posts there exist just fine, the burden of proof that Tavis still claims this is on you.

[Image: Screenshot 2024-02-28 at 09 05 44]

We emailed Tavis about this. It's some broken blog links, not a retraction.

> Do I understand you correctly in that this project will not provide any supply chain security nor a solution to the security problem of compromised builds?

We do take supply chain security seriously. This is why we implemented things like cargo audit and work with the SUSE product security team to implement meaningful processes for addressing supply chain security issues.

Reproducible builds don't impact either of the topics you have just raised.

I'm not going to keep arguing with you on this matter. If the changes are submitted by a PR, I will review them and accept them, but I will not maintain the behaviour as a guarantee or a feature. Otherwise I will not spend any more of my time on this.

uncomfyhalomacro commented 4 months ago

See commit https://github.com/openSUSE/obs-service-cargo_vendor/commit/eb781fdf792d7f30b149fa6fb29d08f40b1fe7ba