stackabletech / issues

This repository is only for issues that concern multiple repositories or don't fit into any specific repository

Improve SBOMs #580

Open lfrancke opened 6 months ago

lfrancke commented 6 months ago
### Tasks
- [ ] Add Pedigree information to our patched products
- [ ] Verify that product SBOMs are correct in that they e.g. list hadoop as the product and not hadoop-common or similar
- [x] Provide the dependency tree in the SBOM instead of a flat list of dependencies

dervoeti commented 5 months ago

Just as a note for later: in general, I think the "correct" way to create an SBOM is to generate it during the build, not afterwards (as we do right now with Syft). Some problems might arise when doing this, though. For example, HBase pulls in jackson-databind:2.4.0 via htrace-core4, which seems to be an uber JAR containing jackson-databind:2.4.0 as a shaded dependency. I can't find any information about the jackson-databind:2.4.0 dependency with Maven (via dependency:tree or cyclonedx-maven-plugin), while tools that analyze the built artifact (like Syft or ScanCode) are able to detect it. So the "reverse-analyzed SBOM" might still be the most pragmatic solution, unless we find a way to obtain that information before or during the build.
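To illustrate why a scan of the built artifact can see what Maven cannot: shaded JARs often still carry the `META-INF/maven/<groupId>/<artifactId>/pom.properties` files of the libraries bundled into them. Below is a minimal sketch of reading those entries, assuming the uber JAR kept them (not every shaded JAR does). The function names `embedded_maven_coordinates` and `build_demo_jar` are made up for this example, and real scanners such as Syft likely combine several more signals than this single one.

```python
import io
import zipfile


def embedded_maven_coordinates(jar_bytes):
    """Return sorted (groupId, artifactId, version) tuples found in a JAR.

    Assumption: bundled libraries left their pom.properties files under
    META-INF/maven/ inside the JAR. Libraries shaded without this
    metadata will not be found by this sketch.
    """
    found = set()
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        for name in jar.namelist():
            if not (name.startswith("META-INF/maven/")
                    and name.endswith("/pom.properties")):
                continue
            # Parse the simple key=value format, skipping comment lines.
            props = {}
            text = jar.read(name).decode("utf-8", errors="replace")
            for line in text.splitlines():
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, value = line.split("=", 1)
                    props[key.strip()] = value.strip()
            if {"groupId", "artifactId", "version"} <= props.keys():
                found.add(
                    (props["groupId"], props["artifactId"], props["version"])
                )
    return sorted(found)


def build_demo_jar(entries):
    """Hypothetical helper: build an in-memory JAR from {path: text}."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        for path, text in entries.items():
            jar.writestr(path, text)
    return buffer.getvalue()
```

Running `embedded_maven_coordinates` on the bytes of a real uber JAR would list every bundled library that kept its metadata, whether or not it shows up in `dependency:tree`.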

lfrancke commented 5 months ago

This particular bit of information just gets lost and is never recorded anywhere during building and publishing to Nexus. I'm afraid the best option (as stupid as it sounds) is to do the scan afterwards. It at least recovers some information, and I believe we have to live with the fact that perfect SBOMs for the Java ecosystem will be hard, if not impossible, to produce for the foreseeable future.

dervoeti commented 3 months ago

Update: Our solution for this will likely be to merge the information from both approaches. We use a custom tool to merge the information generated during the build (with tools like cyclonedx-maven-plugin) with the information obtained from scanning the built artifact (generated by Syft). The extra information we gain from this, compared to our current SBOMs, is the whole dependency tree plus maybe some components that cannot be detected after the build. This issue is in progress as part of https://github.com/stackabletech/issues/issues/614, since these improved SBOMs are the way we want to get this extra information into SecObserve.
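As a rough illustration of the merge idea, not the actual custom tool (which is not shown in this thread): the sketch below assumes both inputs are CycloneDX JSON documents already loaded as Python dicts, and that components can be matched by their `purl`. The function name `merge_sboms` and the matching strategy are assumptions made for this example. The build-time SBOM is treated as the base because it carries the dependency tree; components seen only by the scan are appended.

```python
def merge_sboms(build_sbom, scan_sbom):
    """Merge two CycloneDX-style dicts into one (illustrative sketch).

    Assumptions: build_sbom comes from the build (components plus the
    "dependencies" graph), scan_sbom comes from scanning the artifact,
    and a matching "purl" means "same component".
    """
    merged = {
        "bomFormat": "CycloneDX",
        "specVersion": build_sbom.get("specVersion", "1.5"),
        # Keep build-time components and the dependency tree unchanged.
        "components": list(build_sbom.get("components", [])),
        "dependencies": list(build_sbom.get("dependencies", [])),
    }
    seen = {c["purl"] for c in merged["components"] if c.get("purl")}

    for component in scan_sbom.get("components", []):
        purl = component.get("purl")
        if purl is None:
            # Without a purl there is nothing to match on, so keep it.
            merged["components"].append(component)
        elif purl not in seen:
            # Found only by the scan, e.g. a shaded library in an uber JAR.
            merged["components"].append(component)
            seen.add(purl)
    return merged
```

In this sketch, scan-only components are not wired into the dependency tree, since the scan does not say which component pulled them in. A real tool would have to decide how to represent that.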