anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0

Wrong metadata.component content in CycloneDX #2135

Open Pro opened 11 months ago

Pro commented 11 months ago

Picking up the question from https://github.com/anchore/syft/pull/2131#issuecomment-1719529908:

When syft is run on a Conan lockfile, the CycloneDX output contains this metadata:

"metadata": {
  "component": {
    "bom-ref": "7ea68ac679dc44fd",
    "type": "file",
    "name": "conan.lock",
    "version": "sha256:sha256:421e3aca902b6310bc89875e13c135843d574a0d01bc399c895fde46ecc16068"
  }
}

This was the comment from @kzantow:

This is the "source" you scanned: image, directory, file, etc. However, this does not really have dependencies to the other components. Creating a graph in CycloneDX is much more limited than in SPDX -- in SPDX we add CONTAINS relationships for this, but the only option in CycloneDX is dependencies, which isn't accurate and we try very hard not to misuse the formats. I'm curious: if you add additional dependency entries from the first entry to the other top-level components, does it appear to work properly in DependencyTrack?
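The experiment suggested in that comment can be tried on an existing BOM without changing syft. Below is a minimal sketch, assuming the BOM has been loaded as a plain Python dict and that "top-level" means "no other component depends on it". The function name `link_root_to_top_level` and the `pkg-a`/`pkg-b` references are hypothetical and only serve the illustration; this is not syft's behavior.

```python
import copy


def link_root_to_top_level(bom):
    """Return a copy of a CycloneDX-style BOM dict in which metadata.component
    depends on every component that no other component depends on.

    Assumption: every component carries a "bom-ref" key.
    """
    bom = copy.deepcopy(bom)
    root_ref = bom["metadata"]["component"]["bom-ref"]
    deps = bom.setdefault("dependencies", [])

    # Collect every bom-ref that already appears as a dependency of
    # something other than the root.
    depended_on = set()
    for entry in deps:
        if entry["ref"] != root_ref:
            depended_on.update(entry.get("dependsOn", []))

    # Top-level components are those nothing else depends on.
    top_level = [
        c["bom-ref"]
        for c in bom.get("components", [])
        if c["bom-ref"] not in depended_on and c["bom-ref"] != root_ref
    ]

    # Reuse the root's dependency entry if it exists, otherwise create one.
    root_entry = next((e for e in deps if e["ref"] == root_ref), None)
    if root_entry is None:
        root_entry = {"ref": root_ref}
        deps.insert(0, root_entry)
    depends_on = root_entry.setdefault("dependsOn", [])
    for ref in top_level:
        if ref not in depends_on:
            depends_on.append(ref)
    return bom


# Hypothetical BOM: pkg-a depends on pkg-b, so only pkg-a is top-level.
bom = {
    "metadata": {
        "component": {"bom-ref": "7ea68ac679dc44fd", "type": "file", "name": "conan.lock"}
    },
    "components": [
        {"bom-ref": "pkg-a", "name": "a"},
        {"bom-ref": "pkg-b", "name": "b"},
    ],
    "dependencies": [{"ref": "pkg-a", "dependsOn": ["pkg-b"]}],
}
linked = link_root_to_top_level(bom)
```

The resulting `linked` dict can be written out with `json.dump` and uploaded to DependencyTrack to see whether the graph renders as expected. Note that this still models the scanned file as "depending on" the packages, which is the semantic mismatch the comment warns about.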

Now I looked at the CycloneDX webpage: https://cyclonedx.org/specification/overview/#bom-metadata

BOM metadata includes the supplier, manufacturer, and target component for which the BOM describes. It also includes the tools used to create the BOM, and license information for the BOM document itself.

Based on this, my understanding is that metadata.component is not the source from which the BOM was generated, but the target that the BOM describes.

To me it would be more logical for metadata.component to be the specific component that the components array further down depends on. This is how https://github.com/CycloneDX/cyclonedx-conan uses it, and how it is shown in the examples here: https://github.com/CycloneDX/bom-examples/blob/master/SBOM/dropwizard-1.3.15/bom.json#L47

My question: could it be that syft is wrongly setting the metadata.component entry to the conanfile instead of the specific component?

wagoodman commented 10 months ago

I see the basic point: if I'm generating an SBOM for a single lib that has multiple dependencies from a directory scan, it's more semantically useful to have the root component be the single lib that was discovered, not the file it was discovered from (which has no semantic meaning). I think changing the default behavior makes sense. However, there will be times when more than one component is discovered, so choosing one package over another might not make sense (in which case the behavior we have today is alright).

Additionally, adding a configuration option that lets folks opt out of the "promote the root package" behavior described here also makes sense to me.

Lastly, I think this strategy should be considered for SPDX too, even if we decide to not implement it for SPDX.
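The selection rule described in this comment can be sketched as follows. This is only an illustration under stated assumptions: `choose_metadata_component`, the `promote_root` switch, and the example package name are all hypothetical and do not correspond to any existing syft function or configuration key.

```python
def choose_metadata_component(source_component, root_packages, promote_root=True):
    """Pick the component to report as metadata.component.

    source_component: the scanned file, directory, or image (today's behavior).
    root_packages:    packages identified as roots of the scan (hypothetical input).
    promote_root:     hypothetical opt-out switch for the promotion behavior.
    """
    # Promote only when the choice is unambiguous: exactly one root package.
    if promote_root and len(root_packages) == 1:
        return root_packages[0]
    # Zero or several candidates, or promotion disabled: keep the scanned source.
    return source_component


source = {"type": "file", "name": "conan.lock"}
single_root = [{"type": "library", "name": "my-lib"}]  # hypothetical package
two_roots = single_root + [{"type": "library", "name": "other-lib"}]

promoted = choose_metadata_component(source, single_root)
ambiguous = choose_metadata_component(source, two_roots)
opted_out = choose_metadata_component(source, single_root, promote_root=False)
```

Here `promoted` is the single discovered library, while `ambiguous` and `opted_out` both fall back to the `conan.lock` file entry, matching the current output.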

Pro commented 10 months ago

@wagoodman for a conan lock file or conanfile.txt, there is always exactly one root component for which the lockfile is generated. I cannot think of a good example where such a file would describe multiple root components and metadata.component would therefore have to be something different.

So IMHO there is no need for an additional "promote the root package" option; syft could simply always use the root package as metadata.component.