nexB / dejacode

Automate open source license compliance and ensure software supply chain integrity
https://dejacode.readthedocs.io
GNU Affero General Public License v3.0

Enhancement request: Retain transitive relationship between packages when importing SBOM #122

Open ghsa-retrieval opened 1 month ago

ghsa-retrieval commented 1 month ago

Is your enhancement request related to a problem? Please describe. DejaCode allows loading the packages associated with a product from an SBOM. A modern SBOM that fulfills requirements such as BSI TR-03183 has to list transitive dependencies: not just the immediate dependencies used by the product, but also the ones they depend on, and so on until all indirect dependencies are listed as well. When DejaCode imports the packages from the SBOM, it only associates them directly with the product; all information about the transitive relationships between packages is lost. The SBOM exported from DejaCode lists all dependencies as if they were direct dependencies of the product. As such, it is not an accurate SBOM and does not fulfill current regulations.

What are the benefits of the requested enhancement? The exported SBOM would provide accurate information. Currently, it does not reflect the actual status of the dependencies and as such is not an SBOM that fulfills regulations such as the Cyber Resilience Act or the NTIA Minimum Elements for a Software Bill of Materials pursuant to Executive Order 14028, as well as other national standards and technical guidelines.

Describe the solution you would like It is important that DejaCode retains the relationships between packages by storing the hierarchy in the context of the product. The SBOM exported for the product should retain the hierarchy that was provided when the SBOM was imported.

The indirect dependencies may seem fixed for a particular package version that is a direct dependency. However, there are use cases where some of the transitive dependencies are intentionally excluded, especially in Java software. Thus my recommendation would be to keep this mapping in the product and not in the package itself.
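To make the recommendation concrete, here is a minimal sketch of what a product-scoped mapping could look like. Everything in it is an assumption for illustration: `ProductDependencyGraph`, its methods, and the package URLs are hypothetical and do not reflect DejaCode's actual data model. Only the output shape (`ref` plus `dependsOn`) follows the CycloneDX `dependencies` array.

```python
from collections import defaultdict


class ProductDependencyGraph:
    """Hypothetical per-product store of package-to-package edges."""

    def __init__(self, product_name):
        self.product_name = product_name
        # Maps a package identifier to the identifiers it depends on directly.
        self.edges = defaultdict(set)

    def add_dependency(self, parent, child):
        self.edges[parent].add(child)
        # Register the child so that leaf packages get their own entry.
        self.edges.setdefault(child, set())

    def exclude(self, parent, child):
        """Drop one edge for this product only, e.g. a Maven-style exclusion."""
        self.edges[parent].discard(child)

    def to_dependencies(self):
        """Render the edges in the shape of a CycloneDX "dependencies" array."""
        return [
            {"ref": ref, "dependsOn": sorted(children)}
            for ref, children in sorted(self.edges.items())
        ]


def build(product_name):
    # Hypothetical package URLs, used only for this example.
    graph = ProductDependencyGraph(product_name)
    graph.add_dependency("product", "pkg:maven/example/lib-a@1.0")
    graph.add_dependency(
        "pkg:maven/example/lib-a@1.0", "pkg:maven/example/logging@2.0"
    )
    return graph


product_one = build("product-one")
product_two = build("product-two")
# Same package version, but product-two excludes one transitive dependency.
product_two.exclude(
    "pkg:maven/example/lib-a@1.0", "pkg:maven/example/logging@2.0"
)
```

Because the edges belong to the product, the two products can disagree about what `lib-a@1.0` pulls in without touching the package record. The sketch leaves the excluded package listed as an unreferenced entry; a real design would have to decide whether such packages are removed from the product as well.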

Additional notes Example data for an SBOM imported into DejaCode and the result after exporting it. Notice that the dependencies have turned into a flat list.

Imported SBOM:

Exported SBOM:

ghsa-retrieval commented 1 month ago

This is currently a major issue for me adopting DejaCode for real projects. I'd be willing to contribute / collaborate on this enhancement, but I'm currently unfamiliar with the codebase and this will likely require a significant design decision.

pombredanne commented 1 month ago

@ghsa-retrieval Thanks for the detailed report!

We have a few pending and in progress issues on the topic:

See also https://github.com/nexB/dependency-inspector/issues/2

Side note: While I personally see transitive dependencies as something not essential to SBOMs, I also recognize that they are useful for understanding which package introduced a dependency, which helps with remediation. Beyond remediation, whether a dependency is direct or indirect does not change anything wrt. licensing or security implications. I think it is a mistake to have made trees a requirement in upcoming regulations, as they are far from essential and will eventually be missing in VEX statements anyway.

pombredanne commented 1 month ago

Just to be clear: I mean that including all bundled packages in an SBOM is a must! Reporting how they relate to each other in a dependency tree should have been a nice-to-have, optional inclusion, as this brings only limited value.

ghsa-retrieval commented 1 month ago

@pombredanne It's great to hear that you are already working on this. Sorry for missing the existing tickets; we can close this one if you think it is a complete duplicate.

Including the hierarchy in the SBOM allows for filtering and, as you wrote, helps with remediation. It helps to identify whether the vendor is responsible for fixing the issue, because they use a vulnerable dependency directly and should update it, or whether the vendor needs to wait for the developer of one of their transitive dependencies to fix it.
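As an illustration of that remediation question, here is a minimal sketch that looks up every path from the product to a vulnerable package. It assumes the dependency data is a list of entries with `ref` and `dependsOn` keys, as in the CycloneDX `dependencies` array; the function name `paths_to` and the component names are hypothetical.

```python
def paths_to(dependencies, root, target):
    """Return every path from root to target as a list of refs."""
    graph = {entry["ref"]: entry.get("dependsOn", []) for entry in dependencies}
    found = []

    def walk(ref, path):
        if ref in path:  # guard against cyclic dependency data
            return
        path = path + [ref]
        if ref == target:
            found.append(path)
            return
        for child in graph.get(ref, []):
            walk(child, path)

    walk(root, [])
    return found


# Hypothetical SBOM with the hierarchy preserved.
nested = [
    {"ref": "app", "dependsOn": ["cli", "web-framework"]},
    {"ref": "cli", "dependsOn": []},
    {"ref": "web-framework", "dependsOn": ["parser"]},
    {"ref": "parser", "dependsOn": []},
]

# The same packages after the hierarchy has been flattened.
flat = [
    {"ref": "app", "dependsOn": ["cli", "parser", "web-framework"]},
    {"ref": "cli", "dependsOn": []},
    {"ref": "web-framework", "dependsOn": []},
    {"ref": "parser", "dependsOn": []},
]

nested_paths = paths_to(nested, "app", "parser")
flat_paths = paths_to(flat, "app", "parser")
```

With the nested data the only path runs through `web-framework`, so the fix has to come from that dependency. With the flat data `parser` looks like a direct dependency of `app`, and that distinction can no longer be made.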

How vulnerabilities should be reported still seems to be open to discussion. As far as I'm aware, VEX is not yet mandated and there are different competing or complementary approaches, e.g. CSAF. I'm not aware that VEX would flatten the dependency tree, though. It should just be additional information that references packages, either as a separate file or as part of the SBOM.

pombredanne commented 1 month ago

> Including the hierarchy in the SBOM allows for filtering and, as you wrote, helping with remediation. It helps to identify whether the vendor is the one responsible for fixing the issue, because they are using a vulnerable dependency that they should update or if the vendor needs to wait for the developer of one of their transitive dependencies to fix it

It helps, but is not essential. BTW the BSI link you posted https://www.bsi.bund.de/SharedDocs/Downloads/EN/BSI/Publications/TechGuidelines/TR03183/BSI-TR-03183-2.pdf?__blob=publicationFile&v=5 is awesome. I love that it references ScanCode licenses DB. But I am puzzled by their guidelines about reporting dependencies. For instance:

> The full description and recursive resolution of components and their dependencies is performed on each path at least up to and including the first component, which is outside the scope of delivery.

When I read this, I wonder how anyone would practically go about reporting a partial dependency tree, and what benefit there is in including packages that are not used in a product, "outside the scope of delivery."

I need to write a post on the dependencies topic!

You also wrote:

> How vulnerabilities should be reported still seems to be open to discussions. As far as I'm aware VEX is not yet mandated and there are different competing or complementing approaches, e.g. CSAF. I'm not aware that VEX would flatten the tree for the dependencies though? It should just be additional information that references packages, either as a separate file or part of the SBOM.

For plain exploitability (say CSAF or OpenVEX), there would be no tree as only vulnerabilities are reported AFAIK.

ghsa-retrieval commented 1 month ago

> It helps, but is not essential. BTW the BSI link you posted https://www.bsi.bund.de/SharedDocs/Downloads/EN/BSI/Publications/TechGuidelines/TR03183/BSI-TR-03183-2.pdf?__blob=publicationFile&v=5 is awesome. I love that it references ScanCode licenses DB. But I am puzzled by their guidelines about reporting dependencies. For instance:

I suppose it would not have been strictly necessary, but I think the intention is to have a better overview of all dependencies and supply chain risks for both the producer and consumer of the SBOM.

> The full description and recursive resolution of components and their dependencies is performed on each path at least up to and including the first component, which is outside the scope of delivery.

> When I read this, I wonder how anyone would practically figure out reporting a partial dependency tree, and what benefit there is to include packages that are not used in a product, "outside the scope of delivery."

I suppose the practical partial dependency trees would be filtered by depth (like the "n-level SBOM" or "transitive SBOM" in the BSI technical guideline). From my understanding, the statement about the "Delivery item SBOM" refers to dependencies that users may have to install themselves. For instance, you ship the software, but you require users to install another package or library that is needed for the software to run. In that case there is a dependency, but the dependency itself is not part of the delivered software. This wouldn't be relevant if you ship everything needed to run anyway, because then you would have included it in the SBOM already and the "Delivery item SBOM" would be equivalent to the "Complete SBOM".
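Filtering by depth could look roughly like the sketch below. It reflects my reading of an "n-level" cut rather than a definition taken from the guideline, and it assumes the same `ref` / `dependsOn` entry shape as a CycloneDX `dependencies` array; `limit_depth` and the component names are hypothetical.

```python
from collections import deque


def limit_depth(dependencies, root, max_depth):
    """Keep only components reachable from root within max_depth edges."""
    graph = {entry["ref"]: entry.get("dependsOn", []) for entry in dependencies}
    depth = {root: 0}
    queue = deque([root])
    # Breadth-first search records the shortest distance to each component.
    while queue:
        ref = queue.popleft()
        if depth[ref] == max_depth:
            continue  # do not descend below the cut-off
        for child in graph.get(ref, []):
            if child not in depth:
                depth[child] = depth[ref] + 1
                queue.append(child)
    # Keep only edges whose target survived the cut.
    return [
        {"ref": ref, "dependsOn": [c for c in graph.get(ref, []) if c in depth]}
        for ref in depth
    ]


# Hypothetical chain: app -> a -> b -> c
chain = [
    {"ref": "app", "dependsOn": ["a"]},
    {"ref": "a", "dependsOn": ["b"]},
    {"ref": "b", "dependsOn": ["c"]},
    {"ref": "c", "dependsOn": []},
]

top_level = limit_depth(chain, "app", 1)
two_levels = limit_depth(chain, "app", 2)
```

A cut like this only works if the hierarchy is available in the first place; on a flattened list every package sits at depth 1.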

> For plain exploitability (say CSAF or OpenVEX), there would be no tree as only vulnerabilities are reported AFAIK.

Yes, but that would be additional information to a given SBOM. The SBOM itself would not lose its hierarchy.