grahamc / r13y.com

NixOS Reproducibility Checker
https://r13y.com
MIT License

Hash collection infrastructure #13

Open davidak opened 5 years ago

davidak commented 5 years ago

I wish this effort were distributed across more people (see also https://github.com/grahamc/r13y.com/issues/4). To make that possible, we need infrastructure for submitting package hashes. With that, anyone can build one or more packages and submit their hashes. In the best case, we have multiple hashes for every package.

(Nix could have an option to automatically submit the hash of every completed build.)

I would like the hash that most people got for a package (maybe call it the consensus hash) to be visible on https://nixos.org/nixos/packages.html, along with a single command to build the package yourself and compare the hash (and optionally submit it)!
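To make the "hash most people got" idea concrete, here is a minimal sketch in Python. The function name `consensus_hash` and the flat list of submitted hashes are assumptions for illustration only; nothing like this exists in r13y.com or Nix today.

```python
from collections import Counter


def consensus_hash(submitted_hashes):
    """Return (hash, share) for the most frequently submitted hash.

    `submitted_hashes` is a hypothetical list of hash strings that
    different people reported for the same package.
    Returns None when nothing has been submitted yet.
    """
    if not submitted_hashes:
        return None
    counts = Counter(submitted_hashes)
    top_hash, top_count = counts.most_common(1)[0]
    # The share shows how strong the agreement is, e.g. 2 of 3 builds.
    return top_hash, top_count / len(submitted_hashes)
```

Showing the share next to the hash would let a reader tell a package that everyone agrees on from one where the builds diverge.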

I want to discuss the architecture of such an infrastructure in this issue.

First, we need a data structure to organize the data. See https://github.com/grahamc/r13y.com/issues/12 for that.

We need a datastore for structured data. I have never developed with a NoSQL database, but that's probably the right tool for this task.

Then we can have an API to submit and retrieve data. A simple web interface with some stats and a way to view and search the data would also be nice.
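As a rough sketch of the submit/get idea, an in-memory stand-in for the datastore could look like the following. The class name `HashStore`, its method names, and the package key format are all hypothetical; a real service would sit behind an HTTP API and use a persistent database.

```python
from collections import defaultdict


class HashStore:
    """Hypothetical in-memory stand-in for the hash datastore."""

    def __init__(self):
        # package key -> list of (submitter, output_hash) tuples
        self._data = defaultdict(list)

    def submit(self, package, output_hash, submitter=None):
        """Record one reported hash; submitter may be None (anonymous)."""
        self._data[package].append((submitter, output_hash))

    def get(self, package):
        """Return a copy of all submissions recorded for a package."""
        return list(self._data.get(package, []))
```

Usage would be along the lines of `store.submit("hello-2.10", "sha256:...", "alice")` followed by `store.get("hello-2.10")`. Whether submissions are anonymous or named is left open here, as in the questions below.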

This might be similar to PGP keyservers, like

https://keys.openpgp.org/ https://gitlab.com/hagrid-keyserver/hagrid/

A basic implementation seems very simple, but a good design seems very hard.

Should a submitter be able to add their name, so they get some appreciation for their work, or should it be anonymous? If there is a name field, how do we prevent someone from using my name? Maybe it can be solved by using the e-mail address as a handle, then signing and verifying it with PGP? How can someone trust the server to have correct data?

What are the attack vectors? Someone could just submit random hashes, or submit the hash of a package compiled with malware multiple times, so that it appears in the DB more often than ours. Is web-of-trust a solution for such a problem? Are there other approaches? Here it gets really hard and academic!
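To illustrate the repeated-submission attack, here is a small Python sketch that counts at most one vote per submitter instead of one per submission. The function name `tally_per_submitter` and the `(submitter, hash)` tuple format are assumptions for illustration. It only helps if submitter identities are verified, and it does nothing against an attacker who registers many identities.

```python
from collections import Counter


def tally_per_submitter(submissions):
    """Count each submitter's most recent hash exactly once.

    `submissions` is a hypothetical iterable of (submitter, hash)
    tuples for a single package, in submission order.
    """
    latest = {}
    for submitter, output_hash in submissions:
        # A later submission from the same submitter replaces the earlier one.
        latest[submitter] = output_hash
    return Counter(latest.values())
```

With this tally, one person submitting a malicious hash a thousand times still contributes a single vote. The harder problem, deciding which identities deserve a vote at all, is where something like a web of trust would have to come in.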

But that's a problem other reproducible builds efforts also have, so we can collaborate to find a solution.

davidak commented 3 years ago

This might get solved with Trustix. I don't understand it yet.

https://build-transparency.org/ https://www.tweag.io/blog/2020-12-16-trustix-announcement/

RaitoBezarius commented 1 year ago

May I suggest https://github.com/sigstore/rekor as an alternative?