Open tkopecek opened 2 months ago
I assume by buildroot RPMs we're talking about all of the compilers, libraries, and other tools used during the build process that are installed from specific RPMs. I think it makes sense to list these out and relate them all to the SRPM package in the SBOM using the BUILD_TOOL_OF relationship type. The SRPM is the source package that defines how all the binary RPMs are built, and the relationship to the buildroot RPMs would then indicate that these RPMs are tools needed to turn that SRPM into the final binary packages.
We also use the BUILD_TOOL_OF relationship type to describe container images used during a multi-stage build (see https://github.com/RedHatProductSecurity/security-data-guidelines/pull/23).
What about architecture differences? If I have one SRPM, it would mean totally different buildroots for each of the architectures. Putting them all into one BUILD_TOOL_OF dependency list will create a container which is a) not reproducible, as these RPMs can't be installed in the same environment, and b) hiding the real arch differences.
True true, that would be a problem. The other issue we discussed yesterday was that if we were to produce one manifest (file) per architecture and list out only the arch-specific buildroot deps for the given RPMs, if the list of RPMs included a noarch package then it would have to specify the random arch that was used to build that package, so potentially a different set of build tools for different RPMs.
@tkopecek this might be a stupid question, but is there a reason we can't rebuild the noarch package for each arch so that its buildroot matches that of all the other arch-specific RPMs? I assume it's done to save resources to not have to build the same package N number of times (N = # of arches).
I think the only way to map binary RPMs to their buildroot is to link each of them to the specific buildroot RPMs, which as you say is a very large matrix of components... Producing separate manifests for the buildroot itself is also an option, but linking to those documents (using SPDX-Refs) is not something any tool understands today.
One other option would be to create a package object that represents the buildroot itself, something like:
{
"SPDXID": "SPDXRef-Package-Buildroot-x86-64",
"name": "example-buildroot-x86-64",
"versionInfo": "<some hash? do these envs have unique identifiers?>",
"downloadLocation": "NOASSERTION",
"filesAnalyzed": false
},
and then associate buildroot RPMs with that object, and create relationships of e.g. SPDXRef-Package-Buildroot-x86-64 to all of the binary RPMs.
Again though, I'm not sure this would be correctly interpreted by any tools parsing such a document...
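To make the buildroot-object idea concrete, here is a minimal Python sketch that assembles such a fragment. It is only an illustration under stated assumptions: the helper names (`spdx_id`, `buildroot_fragment`) are hypothetical, the use of CONTAINS to attach buildroot RPMs to the buildroot object and BUILD_TOOL_OF to point the buildroot at the binary RPMs is one possible choice rather than an agreed convention, and `versionInfo` is left out because the question of a unique buildroot identifier is still open.

```python
def spdx_id(prefix, name):
    """Build an SPDX identifier; only ASCII letters, digits, '.' and '-' are kept."""
    safe = "".join(
        c if (c.isascii() and c.isalnum()) or c in ".-" else "-" for c in name
    )
    return f"SPDXRef-{prefix}-{safe}"


def buildroot_fragment(arch, buildroot_rpms, binary_rpms):
    """Return packages and relationships for one per-arch buildroot object.

    Hypothetical helper: the relationship types used here are assumptions.
    """
    arch_id = arch.replace("_", "-")
    root_id = f"SPDXRef-Package-Buildroot-{arch_id}"
    packages = [
        {
            "SPDXID": root_id,
            "name": f"example-buildroot-{arch_id}",
            # versionInfo intentionally omitted: no agreed identifier yet
            "downloadLocation": "NOASSERTION",
            "filesAnalyzed": False,
        }
    ]
    relationships = []

    # Assumption: the buildroot object CONTAINS each RPM installed in it.
    for nvra in buildroot_rpms:
        pkg_id = spdx_id("Package", nvra)
        packages.append(
            {
                "SPDXID": pkg_id,
                "name": nvra,
                "downloadLocation": "NOASSERTION",
                "filesAnalyzed": False,
            }
        )
        relationships.append(
            {
                "spdxElementId": root_id,
                "relationshipType": "CONTAINS",
                "relatedSpdxElement": pkg_id,
            }
        )

    # Assumption: the buildroot as a whole is a BUILD_TOOL_OF each binary RPM.
    for nvra in binary_rpms:
        relationships.append(
            {
                "spdxElementId": root_id,
                "relationshipType": "BUILD_TOOL_OF",
                "relatedSpdxElement": spdx_id("Package", nvra),
            }
        )
    return {"packages": packages, "relationships": relationships}
```

With this shape the number of relationships grows with (buildroot RPMs + binary RPMs) per arch instead of their product, which is the main appeal over the full matrix.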
@tkopecek this might be a stupid question, but is there a reason we can't rebuild the noarch package for each arch so that its buildroot matches that of all the other arch-specific RPMs? I assume it's done to save resources to not have to build the same package N number of times (N = # of arches).
We can (in reality we do that), but we distribute just one of them. It is more a practical issue than a requirement. There is a rightful expectation that a noarch package should be the same no matter the underlying arch. Koji will build the noarch subpackage on all the arches and then compare them to check that they really have the same content. If not, the build fails. If yes, one is picked semi-randomly and used as part of the build. If we distributed multiple noarch packages with the same filename but different checksums (which they'll always have), it would create a lot of confusion and would also waste bandwidth/storage - there are of course a lot of other related issues, but these are the basic ones.
Example - the libreoffice build in Fedora: here you can see the list of noarch subpackages - each of them is listed only once in the final build. But the related build tasks for x86_64 and aarch64 contain different versions of these. Only at the end are they consolidated into one build with NEVRA uniqueness.
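The consolidation step described above can be sketched roughly as follows. This is not Koji's actual code, just an illustration of the logic: the function name `consolidate_noarch` and the input shape (a per-arch mapping of file path to content digest) are assumptions, and the "semi-random" pick is replaced here by taking the first arch in sorted order so the example is deterministic.

```python
def consolidate_noarch(builds_by_arch):
    """Compare one noarch subpackage across arches and pick a single copy.

    builds_by_arch: {arch: {file_path: content_digest}} (hypothetical shape).
    Returns the arch whose copy is kept; raises if the contents differ.
    """
    if not builds_by_arch:
        raise ValueError("no noarch builds to compare")

    arches = sorted(builds_by_arch)
    reference_arch = arches[0]
    reference = builds_by_arch[reference_arch]

    # Every other arch must have produced exactly the same content.
    for arch in arches[1:]:
        if builds_by_arch[arch] != reference:
            raise RuntimeError(
                f"noarch content differs between {reference_arch} and {arch}"
            )

    # All copies are equivalent, so any one can be kept. The choice still
    # matters for the SBOM: it determines which real buildroot gets recorded.
    return reference_arch
```

The returned arch is the piece of information the SBOM needs in order to record the real buildroot rather than a merely compatible one.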
Ok, so for the purposes of the SBOM, would it not be correct to create a separate SBOM for each architecture and include the noarch in the list as being built in the same env as all of the other arch-specific packages?
It could be, but that wouldn't be 100% true, as the noarch package would be generated in a "compatible" but not exactly the same environment.
But if all noarch packages built in all environments are compared and are considered the same, then wouldn't the build env of a package built on x86-64 be equivalent to that built on s390x? I.e. listing the build deps of the s390x env for a noarch package built on x86-64 would be the same, no?
It is equivalent but not the same. We are able to reconstruct said noarch package, but it would be from potentially different content. It can introduce e.g. some false positives. Imagine that an x86_64 dependency has some bundled library (which is not used to build said noarch package) which has a CVE. We can also build the noarch package on s390x, where the dependency doesn't contain such a library. Do we need to rebuild the noarch package or not? I don't think it is a big issue, but I'm leaning towards writing the real buildroot to the SBOM, not the compatible one, to avoid potential issues in the future.
So, let's leave the SRPM as an artifact without dependencies and do a matrix of deps for all produced binary/noarch RPMs. In the future we can potentially refer to an external SBOM describing the whole buildroot instead of putting everything into one file, which could be huge. At this point there is no tooling support for referenced files.
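A rough Python sketch of that matrix is below. The names (`build_dependency_matrix`, `spdx_ref`) and the input shapes are hypothetical, and the relationship type is a parameter because the thread mentions both BUILD_DEPENDENCY_OF and BUILD_TOOL_OF. Each produced RPM, including a noarch one, is linked only to the buildroot of the arch it was actually built on.

```python
def spdx_ref(nvra):
    """Turn an RPM NVRA into an SPDX identifier (ASCII letters, digits, '.', '-')."""
    safe = "".join(
        c if (c.isascii() and c.isalnum()) or c in ".-" else "-" for c in nvra
    )
    return f"SPDXRef-Package-{safe}"


def build_dependency_matrix(produced, buildroots, rel_type="BUILD_DEPENDENCY_OF"):
    """Link every produced RPM to the RPMs of the buildroot that built it.

    produced:   {rpm_nvra: arch of the buildroot that really built it}
    buildroots: {arch: [buildroot rpm nvra, ...]}
    The SRPM is deliberately absent: it carries no build dependencies here.
    """
    relationships = []
    for rpm, built_on in sorted(produced.items()):
        for dep in buildroots[built_on]:
            relationships.append(
                {
                    "spdxElementId": spdx_ref(dep),
                    "relationshipType": rel_type,
                    "relatedSpdxElement": spdx_ref(rpm),
                }
            )
    return relationships


# Usage: the noarch package records the x86_64 buildroot it really came from.
example = build_dependency_matrix(
    produced={
        "foo-1.0-1.x86_64": "x86_64",
        "foo-1.0-1.s390x": "s390x",
        "foo-doc-1.0-1.noarch": "x86_64",
    },
    buildroots={
        "x86_64": ["gcc-13-1.x86_64", "make-4.4-1.x86_64"],
        "s390x": ["gcc-13-1.s390x", "make-4.4-1.s390x"],
    },
)
```

The relationship count is the sum over produced RPMs of their buildroot size, which is why the file can get large for builds with many subpackages.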
I'm gathering all RPMs used in the buildroot and adding them as BUILD_DEPENDENCY_OF every RPM produced in the build architecture. It is a vast matrix for some RPMs. Is it the right way? Just to illustrate my thinking: 1) They could be build deps only of the SRPM (which is not technically right). 2) They could use some other relationship type. 3) Some build processes differ between SRPM and RPM builds (Koji) while others do everything in the same buildroot (Konflux). It would result in very different SBOMs, which is probably fine.