anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0

Add ability to combine multiple SBOMs into a single file #1711

Open pentago opened 1 year ago

pentago commented 1 year ago

What would you like to be added: ability to combine multiple SBOMs into a single file.

Why is this needed: When scanning multiple images built as part of a single project, it would be amazing to be able to both generate separate SBOMs and combine them for easier delivery to targets such as DependencyTrack, without having to rely on additional tooling to do this work. I think combining is well within the tool's scope, especially since it can already generate SBOMs and convert between formats. Having the ability to combine multiple SBOMs is a natural progression.

Additional context: N/A

jlehman9 commented 1 year ago

This can be done with cyclonedx-cli if you're using that format, but I agree that built-in support would be nice.

wagoodman commented 11 months ago

Side note: this is mostly a dup of https://github.com/anchore/syft/issues/617 . That being said, there is still context here that is specific to DependencyTrack so we can keep this open here for now.

A good next step is to figure out how this might be done. I feel like a separate command would be useful:

syft merge SBOM1.spdx.json SBOM2.json -o json --file OUTPUT-SBOM.json

This way, merging source blocks into a list (as described in #617) could get its own specific considerations and UI helpers. (There are a few paths forward here.)

Happy for more thoughts here @pentago

jhoward-lm commented 11 months ago

There is also a tool called Hoppr that can merge CycloneDX SBOMs. It currently supports CDX spec versions 1.3 and 1.4, and support for 1.5 is forthcoming. It attempts to intelligently merge duplicate list items in the input SBOMs rather than aggregating. After installing with pip install hoppr, it can be invoked via the hopctl merge command:

hopctl merge --sbom sbom1.cdx.json --sbom sbom2.cdx.json --output-file hopctl-merged.cdx.json

kzantow commented 11 months ago

@jhoward-lm do you know what this tool does with the metadata component? This has been the main sticking point for implementing a feature like this in Syft, as we have a single source, which has information relevant to exporting to other SBOM formats as well as vuln matching in Grype

jhoward-lm commented 11 months ago

@kzantow The current generic logic for merging object and primitive types gives precedence to a value that is already present for that field in the merge result (based on the specified order of the input SBOM sources); otherwise, the field value is copied into the merged result.

List types are merged based on defined logic for identifying if two objects are duplicates of each other on a type-by-type basis (i.e. components, dependencies, and vulnerabilities have different logic for checking equality/uniqueness).
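To make the two rules above concrete, here is a minimal Python sketch of a "first value wins" merge with per-type list deduplication. It is not hoppr's actual implementation: the names `IDENTITY_KEYS`, `merge_value`, `merge_list`, and `merge_sboms` are hypothetical, and the identity keys chosen for components, dependencies, and vulnerabilities are assumptions made only for illustration.

```python
from typing import Any, Callable

# Hypothetical identity keys per list type; hoppr's real equality checks may differ.
IDENTITY_KEYS: dict[str, Callable[[dict], Any]] = {
    "components": lambda c: (c.get("name"), c.get("version")),
    "dependencies": lambda d: d.get("ref"),
    "vulnerabilities": lambda v: v.get("id"),
}


def merge_list(field: str, existing: list, incoming: list) -> list:
    """Append incoming items, merging those identified as duplicates."""
    key_fn = IDENTITY_KEYS.get(field)
    merged = list(existing)
    for item in incoming:
        if key_fn is not None and isinstance(item, dict):
            for index, current in enumerate(merged):
                if isinstance(current, dict) and key_fn(current) == key_fn(item):
                    # Duplicate found: merge field by field, earlier values win.
                    merged[index] = merge_value(field, current, item)
                    break
            else:
                merged.append(item)
        elif item not in merged:
            # No type-specific rule: fall back to plain equality.
            merged.append(item)
    return merged


def merge_value(field: str, existing: Any, incoming: Any) -> Any:
    """Merge `incoming` into `existing`, giving precedence to `existing`."""
    if existing is None:
        return incoming  # nothing merged yet for this field: copy it over
    if isinstance(existing, dict) and isinstance(incoming, dict):
        merged = dict(existing)
        for key, value in incoming.items():
            merged[key] = merge_value(key, merged.get(key), value)
        return merged
    if isinstance(existing, list) and isinstance(incoming, list):
        return merge_list(field, existing, incoming)
    return existing  # primitives: the previously merged value wins


def merge_sboms(*sboms: dict) -> dict:
    """Merge SBOM documents in the order given."""
    result: dict = {}
    for sbom in sboms:
        result = merge_value("", result, sbom)
    return result
```

Under these assumptions, the order of the arguments to `merge_sboms` decides which `timestamp` and `bom-ref` survive, which is why the two `hopctl merge` invocations in this comment produce different metadata.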

Full disclosure: I'm the primary developer of hoppr's merge capability 😄 Here's an example using input SBOMs generated with Syft and Trivy for the docker.io/bitnami/mongodb image.

Syft metadata
```json
"metadata": {
  "timestamp": "2023-08-09T11:57:21-05:00",
  "tools": [
    { "vendor": "anchore", "name": "syft", "version": "0.86.1" }
  ],
  "component": {
    "bom-ref": "1a2256ab6440784e",
    "type": "container",
    "name": "docker.io/bitnami/mongodb",
    "version": "sha256:259f8240b1574b148ddebb75e7a3f54b93a058ad071833cc94d43859dd2ff04c"
  }
}
```

Trivy metadata
```json
"metadata": {
  "timestamp": "2023-08-09T16:53:26+00:00",
  "tools": [
    { "vendor": "aquasecurity", "name": "trivy", "version": "0.44.0" }
  ],
  "component": {
    "bom-ref": "pkg:oci/mongodb@sha256:3b71d20f3f821e5a893207f4c25053ae1e7b5db6065bcbc2d9edf25f8856b42a?repository_url=index.docker.io%2Fbitnami%2Fmongodb\u0026arch=amd64",
    "type": "container",
    "name": "docker.io/bitnami/mongodb",
    "purl": "pkg:oci/mongodb@sha256:3b71d20f3f821e5a893207f4c25053ae1e7b5db6065bcbc2d9edf25f8856b42a?repository_url=index.docker.io%2Fbitnami%2Fmongodb\u0026arch=amd64",
    "properties": [
      { "name": "aquasecurity:trivy:DiffID", "value": "sha256:9deb87cfae77e0fa7fbe6b0aa4c9371f76ce8e3cd2df9541a0795b499dbd30de" },
      { "name": "aquasecurity:trivy:ImageID", "value": "sha256:af23b71071064b0be3f700c8d04fdcb438ad4f797dafc89257fde3fd810e1a45" },
      { "name": "aquasecurity:trivy:RepoDigest", "value": "bitnami/mongodb@sha256:3b71d20f3f821e5a893207f4c25053ae1e7b5db6065bcbc2d9edf25f8856b42a" },
      { "name": "aquasecurity:trivy:RepoTag", "value": "bitnami/mongodb:latest" },
      { "name": "aquasecurity:trivy:SchemaVersion", "value": "2" }
    ]
  }
}
```

With hopctl merge --sbom trivy.cdx.json --sbom syft.cdx.json:

Merged metadata
```json
"metadata": {
  "timestamp": "2023-08-09T16:53:26+00:00",
  "tools": [
    { "vendor": "aquasecurity", "name": "trivy", "version": "0.44.0" },
    { "vendor": "anchore", "name": "syft", "version": "0.86.1" }
  ],
  "component": {
    "type": "container",
    "bom-ref": "pkg:oci/mongodb@sha256:3b71d20f3f821e5a893207f4c25053ae1e7b5db6065bcbc2d9edf25f8856b42a?repository_url=index.docker.io%2Fbitnami%2Fmongodb&arch=amd64",
    "name": "docker.io/bitnami/mongodb",
    "version": "sha256:259f8240b1574b148ddebb75e7a3f54b93a058ad071833cc94d43859dd2ff04c",
    "scope": "required",
    "hashes": [],
    "licenses": [],
    "purl": "pkg:oci/mongodb@sha256:3b71d20f3f821e5a893207f4c25053ae1e7b5db6065bcbc2d9edf25f8856b42a?repository_url=index.docker.io%2Fbitnami%2Fmongodb&arch=amd64",
    "externalReferences": [],
    "components": [],
    "properties": [
      { "name": "aquasecurity:trivy:DiffID", "value": "sha256:9deb87cfae77e0fa7fbe6b0aa4c9371f76ce8e3cd2df9541a0795b499dbd30de" },
      { "name": "aquasecurity:trivy:ImageID", "value": "sha256:af23b71071064b0be3f700c8d04fdcb438ad4f797dafc89257fde3fd810e1a45" },
      { "name": "aquasecurity:trivy:RepoDigest", "value": "bitnami/mongodb@sha256:3b71d20f3f821e5a893207f4c25053ae1e7b5db6065bcbc2d9edf25f8856b42a" },
      { "name": "aquasecurity:trivy:RepoTag", "value": "bitnami/mongodb:latest" },
      { "name": "aquasecurity:trivy:SchemaVersion", "value": "2" }
    ]
  }
}
```

With hopctl merge --sbom syft.cdx.json --sbom trivy.cdx.json:

Merged metadata
```json
"metadata": {
  "timestamp": "2023-08-09T11:57:21-05:00",
  "tools": [
    { "vendor": "anchore", "name": "syft", "version": "0.86.1" },
    { "vendor": "aquasecurity", "name": "trivy", "version": "0.44.0" }
  ],
  "component": {
    "type": "container",
    "bom-ref": "docker.io/bitnami/mongodb@sha256:259f8240b1574b148ddebb75e7a3f54b93a058ad071833cc94d43859dd2ff04c",
    "name": "docker.io/bitnami/mongodb",
    "version": "sha256:259f8240b1574b148ddebb75e7a3f54b93a058ad071833cc94d43859dd2ff04c",
    "scope": "required",
    "hashes": [],
    "licenses": [],
    "purl": "pkg:oci/mongodb@sha256:3b71d20f3f821e5a893207f4c25053ae1e7b5db6065bcbc2d9edf25f8856b42a?repository_url=index.docker.io%2Fbitnami%2Fmongodb&arch=amd64",
    "externalReferences": [],
    "components": [],
    "properties": [
      { "name": "aquasecurity:trivy:DiffID", "value": "sha256:9deb87cfae77e0fa7fbe6b0aa4c9371f76ce8e3cd2df9541a0795b499dbd30de" },
      { "name": "aquasecurity:trivy:ImageID", "value": "sha256:af23b71071064b0be3f700c8d04fdcb438ad4f797dafc89257fde3fd810e1a45" },
      { "name": "aquasecurity:trivy:RepoDigest", "value": "bitnami/mongodb@sha256:3b71d20f3f821e5a893207f4c25053ae1e7b5db6065bcbc2d9edf25f8856b42a" },
      { "name": "aquasecurity:trivy:RepoTag", "value": "bitnami/mongodb:latest" },
      { "name": "aquasecurity:trivy:SchemaVersion", "value": "2" }
    ]
  }
}
```

Note, however, that the bom-ref generated by Syft does get overwritten with a <component name>@<component version> combination for hoppr's purposes. This helps identify duplicate components across multiple input SBOMs so they can be merged into a single resulting component.

This gave me an idea for hoppr, though. Maybe hoppr (and your use case too if this wouldn't misrepresent the relationship) could set the metadata top-level component in the merged output to the result it currently generates, and add the unmodified metadata components from all input SBOMs to its nested components list with a scope of excluded, e.g.

Merged metadata
```json
"metadata": {
  "timestamp": "2023-08-09T11:57:21-05:00",
  "tools": [
    { "name": "hoppr", "version": "1.9.2" },
    { "vendor": "anchore", "name": "syft", "version": "0.86.1" },
    { "vendor": "aquasecurity", "name": "trivy", "version": "0.44.0" }
  ],
  "component": {
    "type": "container",
    "bom-ref": "docker.io/bitnami/mongodb@sha256:259f8240b1574b148ddebb75e7a3f54b93a058ad071833cc94d43859dd2ff04c",
    "name": "docker.io/bitnami/mongodb",
    "version": "sha256:259f8240b1574b148ddebb75e7a3f54b93a058ad071833cc94d43859dd2ff04c",
    "scope": "required",
    "hashes": [],
    "licenses": [],
    "purl": "pkg:oci/mongodb@sha256:3b71d20f3f821e5a893207f4c25053ae1e7b5db6065bcbc2d9edf25f8856b42a?repository_url=index.docker.io%2Fbitnami%2Fmongodb&arch=amd64",
    "externalReferences": [],
    "components": [
      {
        "bom-ref": "1a2256ab6440784e",
        "type": "container",
        "name": "docker.io/bitnami/mongodb",
        "version": "sha256:259f8240b1574b148ddebb75e7a3f54b93a058ad071833cc94d43859dd2ff04c",
        "scope": "excluded"
      },
      {
        "bom-ref": "pkg:oci/mongodb@sha256:3b71d20f3f821e5a893207f4c25053ae1e7b5db6065bcbc2d9edf25f8856b42a?repository_url=index.docker.io%2Fbitnami%2Fmongodb\u0026arch=amd64",
        "type": "container",
        "name": "docker.io/bitnami/mongodb",
        "purl": "pkg:oci/mongodb@sha256:3b71d20f3f821e5a893207f4c25053ae1e7b5db6065bcbc2d9edf25f8856b42a?repository_url=index.docker.io%2Fbitnami%2Fmongodb\u0026arch=amd64",
        "properties": [
          { "name": "aquasecurity:trivy:DiffID", "value": "sha256:9deb87cfae77e0fa7fbe6b0aa4c9371f76ce8e3cd2df9541a0795b499dbd30de" },
          { "name": "aquasecurity:trivy:ImageID", "value": "sha256:af23b71071064b0be3f700c8d04fdcb438ad4f797dafc89257fde3fd810e1a45" },
          { "name": "aquasecurity:trivy:RepoDigest", "value": "bitnami/mongodb@sha256:3b71d20f3f821e5a893207f4c25053ae1e7b5db6065bcbc2d9edf25f8856b42a" },
          { "name": "aquasecurity:trivy:RepoTag", "value": "bitnami/mongodb:latest" },
          { "name": "aquasecurity:trivy:SchemaVersion", "value": "2" }
        ],
        "scope": "excluded"
      }
    ],
    "properties": [
      { "name": "aquasecurity:trivy:DiffID", "value": "sha256:9deb87cfae77e0fa7fbe6b0aa4c9371f76ce8e3cd2df9541a0795b499dbd30de" },
      { "name": "aquasecurity:trivy:ImageID", "value": "sha256:af23b71071064b0be3f700c8d04fdcb438ad4f797dafc89257fde3fd810e1a45" },
      { "name": "aquasecurity:trivy:RepoDigest", "value": "bitnami/mongodb@sha256:3b71d20f3f821e5a893207f4c25053ae1e7b5db6065bcbc2d9edf25f8856b42a" },
      { "name": "aquasecurity:trivy:RepoTag", "value": "bitnami/mongodb:latest" },
      { "name": "aquasecurity:trivy:SchemaVersion", "value": "2" }
    ]
  }
}
```
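A rough sketch of how that idea could be produced from an already merged metadata object is shown here. The function name `nest_original_components` and its arguments are hypothetical; it only assumes that each input SBOM is a parsed CycloneDX JSON document with an optional `metadata.component`.

```python
import copy


def nest_original_components(merged_metadata: dict, input_sboms: list[dict]) -> dict:
    """Attach each input's unmodified metadata component as an excluded sub-component."""
    result = copy.deepcopy(merged_metadata)
    component = result.setdefault("component", {})
    nested = component.setdefault("components", [])
    for sbom in input_sboms:
        original = sbom.get("metadata", {}).get("component")
        if original is None:
            continue  # this input had no top-level component to preserve
        preserved = copy.deepcopy(original)
        preserved["scope"] = "excluded"  # mark as informational, not part of the inventory
        nested.append(preserved)
    return result
```

The inputs are deep-copied so that neither the merged metadata nor the source SBOMs are mutated, which keeps the original `bom-ref` values available for anything that still needs to trace a package back to its source document.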

Another option might be to leverage the pedigree property.