containerbuildsystem / cachi2

GNU General Public License v3.0

Improve merge_syft_sbom.py script to handle rpm purls? #518

Open chmeliik opened 3 months ago

chmeliik commented 3 months ago

When the merge_syft_sbom.py script merges a cachi2 SBOM with a Syft SBOM that contains roughly the same RPMs, no de-duplication occurs. Would it be possible, and a good idea, to de-duplicate them?

Example merged SBOMs:

Perhaps the RPMs reported by Syft could be discarded if their NEVRA was also identified by cachi2.

pkg:rpm/centos/{NAME}@{VERSION}-{RELEASE}?arch={ARCH}&epoch={EPOCH}&...
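To make the NEVRA idea concrete, here is a rough sketch, not the actual merge_syft_sbom.py code. It assumes both SBOMs have already been loaded as lists of CycloneDX-style component dicts with a "purl" key, and that a missing epoch qualifier can be treated as 0. The helper names `rpm_nevra_key` and `drop_syft_rpm_duplicates` are made up for illustration.

```python
from urllib.parse import parse_qs, unquote


def rpm_nevra_key(purl):
    """Return (name, epoch, version-release, arch) for an rpm purl, else None."""
    if not purl.startswith("pkg:rpm/"):
        return None
    # Drop an optional "#subpath", then split the path from the "?qualifiers".
    path, _, query = purl.partition("#")[0].partition("?")
    # The last path segment is "name@version-release"; the vendor before it is ignored here.
    last_segment = path[len("pkg:rpm/"):].rpartition("/")[2]
    name, _, version_release = last_segment.partition("@")
    qualifiers = parse_qs(query)
    # Assumption: an absent epoch means the same thing as epoch 0.
    epoch = qualifiers.get("epoch", ["0"])[0]
    arch = qualifiers.get("arch", [""])[0]
    return (unquote(name), epoch, unquote(version_release), arch)


def drop_syft_rpm_duplicates(cachi2_components, syft_components):
    """Keep Syft components unless cachi2 already reported an RPM with the same NEVRA."""
    known = {rpm_nevra_key(c.get("purl", "")) for c in cachi2_components}
    known.discard(None)  # non-rpm components never count as duplicates
    return [
        c for c in syft_components
        if rpm_nevra_key(c.get("purl", "")) not in known
    ]
```

Other qualifiers (the trailing `&...` in the template above) are deliberately left out of the key, so two purls that differ only in those qualifiers would still be treated as the same RPM.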

Or maybe the deduplication should also take the vendor into account, so that if two RPMs have the same NEVRA but a different vendor, they would both be kept.

pkg:rpm/{vendor}/...

^ Ideally, pkg:rpm/fedora%20project from cachi2 should be considered equal to pkg:rpm/fedora from Syft, and the same goes for every vendor that Syft "normalizes" and cachi2 does not. That would require investigating how Syft does the normalization.
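A vendor-aware comparison key might look something like the sketch below. The alias table is hypothetical: the only entry is the "fedora project" / "fedora" pair mentioned above, and the real list would have to come from looking at how Syft actually normalizes vendors. Lower-casing the vendor and treating a missing epoch as 0 are assumptions as well, and `rpm_vendor_nevra_key` is an illustrative name, not an existing function.

```python
from urllib.parse import parse_qs, unquote

# Hypothetical alias table; only this one pair is taken from the issue text.
VENDOR_ALIASES = {"fedora project": "fedora"}


def rpm_vendor_nevra_key(purl):
    """Return (vendor, name, epoch, version-release, arch) for an rpm purl, else None."""
    if not purl.startswith("pkg:rpm/"):
        return None
    # Drop an optional "#subpath", then split the path from the "?qualifiers".
    path, _, query = purl.partition("#")[0].partition("?")
    # Everything before the last "/" is the vendor namespace (may be empty).
    namespace, _, last_segment = path[len("pkg:rpm/"):].rpartition("/")
    name, _, version_release = last_segment.partition("@")
    # Assumption: vendors compare case-insensitively after percent-decoding.
    vendor = unquote(namespace).lower()
    vendor = VENDOR_ALIASES.get(vendor, vendor)
    qualifiers = parse_qs(query)
    # Assumption: an absent epoch means the same thing as epoch 0.
    epoch = qualifiers.get("epoch", ["0"])[0]
    arch = qualifiers.get("arch", [""])[0]
    return (vendor, unquote(name), epoch, unquote(version_release), arch)
```

With a key like this, two RPMs that share a NEVRA but come from different vendors produce different keys and are both kept, while vendor spellings listed in the alias table collapse to one.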