in-toto / apt-transport-in-toto

in-toto transport for apt

Explore possibility of splitting current /sources into /source and /binary #28

Open fepitre opened 3 years ago

fepitre commented 3 years ago

Topic: Explore the possibility of splitting the current /sources endpoint used by the transport into /source and /binary, in order to distinguish between the source package name and the names of the binary packages built from it. This would also be consistent with the current Debian snapshot API, whose entry points are /mr/package/... (well, not "source") and /mr/binary/... (see comment https://github.com/in-toto/apt-transport-in-toto/issues/28#issuecomment-758671645). From the user's point of view, it makes clear what exactly is being looked up and which (built) package it refers to. From the rebuilder software's point of view, it makes it possible to link the produced in-toto metadata to every binary package.

Original description: I'm trying to understand how debian-rebuilder-setup was originally intended to work. From what I can see, a rebuilder is supposed to POST to the visualizer (https://salsa.debian.org/reproducible-builds/debian-rebuilder-setup/-/blob/master/builder/srebuild#L454) in-toto metadata that is referenced by part of the buildinfo file name (https://salsa.debian.org/reproducible-builds/debian-rebuilder-setup/-/blob/master/visualizer/accumulator.py#L69), mostly the source package name, right? My question is: how is, for example, mypackage-dev, or any other package built from a given source's mypackage.buildinfo, supposed to be referenced?

lukpueh commented 3 years ago

cc @SantiagoTorres, @adityasaky, @kpcyrd and @KristelFung, who might be able to help here.

kpcyrd commented 3 years ago

Disclaimer: I haven't worked on the debian-rebuilder-setup since late 2018 and my summary might be out of date.

The visualizer is the public read-only api to fetch results from and the accumulator is the internal api that results are reported to. srebuild runs one build per source package/buildinfo file and generates an in-toto attestation that contains all binary packages built from that source package. I don't know how this integrates with apt-transport-in-toto, maybe somebody familiar with that part of the system can answer this. :)

The debian-rebuilder-setup project stalled because debian never defined how a rebuilder is supposed to ingest buildinfo files. I've spent some time thinking about this while working on rebuilderd; the architecture I'd currently suggest is:

1) download the source package index debian publishes
   1) use that data to derive a list of all binary packages in debian, grouped by source package
   2) do some string transformation on the Directory: field to generate a url for the corresponding buildinfo file on buildinfos.debian.net (the transformation step is unfortunately necessary due to an oversight, but has been reported in #debian-reproducible and people are aware of this)
2) run the build in the environment described in the buildinfo file
3) ignore the checksums in the buildinfo file and instead compare the resulting packages with the packages actually distributed by debian. You may need to download the binary package index or the binary package itself too.
4) generate in-toto metadata on a per-binary-package basis since that's easier to consume for apt
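The first step above (deriving binary packages grouped by source package) could be sketched roughly as follows. This is only an illustration, not code from debian-rebuilder-setup or rebuilderd: it assumes the index has already been downloaded and decompressed, it relies on the `Package`, `Binary`, `Version` and `Directory` fields of the Debian Sources index, and the sample stanzas and function names are made up. The `Directory:` to buildinfo URL transformation is deliberately left out, since its exact rule is not given here.

```python
from collections import defaultdict

# Two made-up stanzas in the Sources index format (hypothetical packages).
SAMPLE_SOURCES = """\
Package: mypackage
Binary: mypackage, mypackage-dev,
 mypackage-doc
Version: 1.0-1
Directory: pool/main/m/mypackage

Package: other
Binary: other
Version: 2.3-4
Directory: pool/main/o/other
"""


def parse_stanzas(text):
    """Yield each blank-line-separated stanza as a dict of field -> value."""
    for block in text.strip().split("\n\n"):
        fields = {}
        key = None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and key is not None:
                # Continuation line: append it to the previous field.
                fields[key] += " " + line.strip()
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        yield fields


def binaries_by_source(text):
    """Map (source name, source version) -> list of binary package names."""
    grouped = defaultdict(list)
    for stanza in parse_stanzas(text):
        names = [b.strip() for b in stanza.get("Binary", "").split(",") if b.strip()]
        grouped[(stanza["Package"], stanza["Version"])].extend(names)
    return dict(grouped)


if __name__ == "__main__":
    for (source, version), names in binaries_by_source(SAMPLE_SOURCES).items():
        print(source, version, "->", ", ".join(names))
```

Note that the `Binary` field only lists names; the versions of the binary packages would have to come from the binary package index, as mentioned in step 3.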

Note that srebuild is being replaced with debrebuild but the merge request to provide feature parity with srebuild is still being worked on and the latest released version of debrebuild can't actually run any builds yet.

fepitre commented 3 years ago

@kpcyrd thank you for your answer. That's mostly the latest update you gave me on IRC, and it is also why I've started helping 'josch' with the debrebuild tool. As you said, this is mostly how the people behind apt-transport-in-toto were thinking of dealing with the original Debian approach in debian-rebuilder-setup. As a current workaround, and to have a working in-toto POC, I simply symlink packages (https://github.com/fepitre/qubes-rebuilder-setup/blob/master/visualizer/visualizer.py#L149). Ideally I would love to add a database entry for the "package source" instead of creating useless symlinks. This is what motivated my original question here: I wanted to know the ideas behind it :)

lukpueh commented 3 years ago

> I don't know how this integrates with apt-transport-in-toto, maybe somebody familiar with that part of the system can answer this. :)

As far as apt-transport-in-toto is concerned the rebuilders are only file servers to fetch in-toto attestations (i.e. *.link files) from: https://github.com/in-toto/apt-transport-in-toto/blob/363a110c726f2fa6b0e3ae2da3477ce2cd1f4f40/intoto.py#L577-L579

fepitre commented 3 years ago

@lukpueh yes, I think this is clear from the transport's side, even if I would rather distinguish /sources from /binaries, where /binaries refers to built packages. The original debian-rebuilder-setup approach did not cover the case of installing a binary package with apt: for example, myawesomepackage1-dev would have no entry, whereas myawesomepackage would (assuming this is the source package name). In that case, there was no way to verify the in-toto metadata. So, as @kpcyrd said, this is rather a question of how to orchestrate the whole thing. From my side, I've already had some good results using the current state of the transport!
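To make the lookup gap concrete, here is a small sketch. All names and data are hypothetical and nothing here is taken from the transport or the visualizer: it assumes link metadata is stored under the source package name only, and shows how a binary-to-source index would let a binary package name resolve to it.

```python
# Hypothetical store keyed by source package name only.
links_by_source = {
    "myawesomepackage": "myawesomepackage.link",
}

# Hypothetical mapping of a source package to the binary packages it builds.
binaries = {
    "myawesomepackage": ["myawesomepackage1", "myawesomepackage1-dev"],
}


def build_binary_index(binaries_by_source):
    """Invert source -> [binaries] into binary -> source."""
    return {
        binary: source
        for source, names in binaries_by_source.items()
        for binary in names
    }


def find_link(binary_name, links, binary_index):
    """Return the link entry for a binary package, or None if unknown."""
    source = binary_index.get(binary_name)
    if source is None:
        return None
    return links.get(source)


if __name__ == "__main__":
    # Direct lookup by binary name finds nothing in a source-keyed store.
    print(links_by_source.get("myawesomepackage1-dev"))
    # Going through the binary -> source index resolves the entry.
    index = build_binary_index(binaries)
    print(find_link("myawesomepackage1-dev", links_by_source, index))
```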

lukpueh commented 3 years ago

> I would rather distinguish /sources /binaries where /binaries refers to built packages...

This sounds reasonable and worth a ticket somewhere.

fepitre commented 3 years ago

@lukpueh would you like me to rename this issue and rework its description, or to open a separate issue? I don't see this /source(s) and /binary(ies) split as mandatory, but it would be a nice improvement, and the ticket would be a place to discuss the possibilities and what to do. It would also have the advantage of being consistent with the current Debian snapshot API, whose entry points are /mr/package/... (well, not "source") and /mr/binary/....

lukpueh commented 3 years ago

Good idea to reuse this ticket, and yes, I'd appreciate if you could rework issue name and description accordingly.

fepitre commented 3 years ago

@lukpueh I've updated the issue. Any feedback on this possible change is welcome. Also, I'm completely happy to implement it! :)

lukpueh commented 3 years ago

Thanks for the update and for volunteering, @fepitre. I'd appreciate it if you went ahead and explored/implemented this. Do you already have an idea what the split would mean for the transport? As far as I understand, we currently only deal with binary installs. So all we'd need to do is update the endpoint URL. Or would you also add some case handling for apt source installs to use both endpoints?

At any rate, I think we should seize the opportunity and add some documentation about the expected API, maybe to the README, or even better, to a man page, which we should add for other reasons anyway.

fepitre commented 3 years ago

> Thanks for the update and for volunteering, @fepitre. I'd appreciate if you go ahead and explore/implement this. Do you already have an idea what the split would mean for the transport? As far as I understand, we currently only deal with binary installs. So all we'd need to do is update the endpoint URL. Or would you also add some case handling for apt source installs to use both endpoints.

For the transport, it would mean identifying the requested source package and binary packages. For example, in the case of RPM, there is a specific src.rpm package, which can be downloaded like any binary package. So in the case of Debian, most of what is needed in the current transport implementation would be to rewrite /sources to /binary. Then, for downloads of sources specifically, we could add all the source files used for the rebuild to the in-toto metadata.
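As a rough sketch of what the endpoint split could look like on the client side: the path layout `/<kind>/<name>/<version>/metadata`, the function name and the rebuilder URL below are assumptions made for illustration only. The actual API is exactly what this issue proposes to define and document.

```python
from urllib.parse import quote


def metadata_url(rebuilder, kind, name, version):
    """Build a hypothetical rebuilder URL for a source or binary package.

    The "/<kind>/<name>/<version>/metadata" layout is an assumption,
    not the documented API of the transport or of any rebuilder.
    """
    if kind not in ("source", "binary"):
        raise ValueError("kind must be 'source' or 'binary', got %r" % kind)
    return "{}/{}/{}/{}/metadata".format(
        rebuilder.rstrip("/"),
        kind,
        quote(name, safe=""),
        quote(version, safe=""),  # also escapes an epoch's ":" if present
    )


if __name__ == "__main__":
    base = "https://rebuilder.example"  # placeholder host
    # Binary installs would query the /binary endpoint ...
    print(metadata_url(base, "binary", "mypackage-dev", "1.0-1"))
    # ... while source downloads would query /source.
    print(metadata_url(base, "source", "mypackage", "1.0-1"))
```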