anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0

Clearly document the fact that CPE strings could be made up #2800

Open prabhu opened 5 months ago

prabhu commented 5 months ago

What would you like to be added: Upon reviewing the code, it is clear that the tool is simply making up CPEs as it goes when there is no hit on the cpe-index.json file.

A search on issues for "wrong cpe" or "incorrect package name" shows additional reports across ecosystems.

Why is this needed:

The information about CPE correctness is not clearly communicated to the end user. Many tools, including the official NTIA benchmark tools, are treating these values as correct without enough validation.

Where the confidence of a given identity is low, this must be represented in the resulting SBOM. For example, in CycloneDX, use component.evidence.identity with field set to cpe, together with an appropriate methods.technique and confidence. For SPDX, which lacks support for evidence, an alternative solution may need to be found.
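To make the CycloneDX suggestion concrete, here is a minimal sketch of a component that carries a low-confidence CPE together with identity evidence. It assumes the CycloneDX 1.5 JSON layout, where evidence.identity is a single object (1.6 also allows an array). The helper name, the example package, and the 0.2 score are hypothetical and are not taken from Syft.

```python
import json


def component_with_cpe_evidence(name, version, cpe, confidence, technique="other"):
    """Build a CycloneDX-style component dict whose CPE carries identity evidence.

    Hypothetical helper for illustration; not part of Syft or any CycloneDX library.
    `confidence` is a float between 0 and 1, as CycloneDX expects.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return {
        "type": "library",
        "name": name,
        "version": version,
        "cpe": cpe,
        "evidence": {
            "identity": {
                # Which identity field this evidence is about.
                "field": "cpe",
                "confidence": confidence,
                "methods": [
                    {
                        # "other" is used here because a guessed CPE does not
                        # map cleanly onto the more specific technique values.
                        "technique": technique,
                        "confidence": confidence,
                        "value": cpe,
                    }
                ],
            }
        },
    }


if __name__ == "__main__":
    # Example values are made up for illustration.
    guessed = component_with_cpe_evidence(
        "example-lib", "1.0.0", "cpe:2.3:a:example-lib:example-lib:1.0.0:*:*:*:*:*:*:*", 0.2
    )
    print(json.dumps(guessed, indent=2))
```

A consumer could then filter or flag components whose evidence.identity.confidence falls below a threshold instead of treating every CPE as authoritative.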

In the long term, consider adopting PURL, which has higher precision than CPE and is often easier to construct across multiple ecosystems.
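As a rough illustration of why a PURL is easy to construct, the sketch below assembles one from values a package manifest already provides. It follows the general pkg:type/namespace/name@version shape and deliberately skips qualifiers, subpaths, and the type-specific normalization rules of the purl specification. The function name make_purl is hypothetical.

```python
from urllib.parse import quote


def make_purl(pkg_type, name, version=None, namespace=None):
    """Assemble a basic package URL (hypothetical helper, simplified).

    Omits qualifiers, subpath, and per-type normalization (for example,
    PyPI name canonicalization), so treat it as a sketch only.
    """
    parts = ["pkg:", pkg_type.lower(), "/"]
    if namespace:
        # Percent-encode each namespace segment separately, keeping the "/" separators.
        parts.append("/".join(quote(seg, safe="") for seg in namespace.split("/")))
        parts.append("/")
    parts.append(quote(name, safe=""))
    if version:
        parts.append("@" + quote(version, safe=""))
    return "".join(parts)


if __name__ == "__main__":
    print(make_purl("pypi", "requests", "2.31.0"))
    print(make_purl("npm", "core", "17.0.0", namespace="@angular"))
```

Unlike a CPE, nothing here requires guessing a vendor or product string: the ecosystem, name, and version all come straight from the package metadata.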

Additional context:

tgerla commented 5 months ago

Hi @prabhu, thanks for the report. We are definitely familiar with the shortcomings of CPE generation and CPE matching and we're interested in including some kind of confidence score when we generate a CPE not found in the index. We have some work to do figuring out the method for scoring.

If you haven't already, please take a look at an SBOM generated in syft-json format. For each CPE, you will see its source (whether we generated it or it was declared), as well as the PURLs for the artifacts found by the Syft scan.
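For anyone who wants to inspect that information programmatically, here is a small sketch that lists each CPE with its recorded source from a parsed syft-json document. The field names (artifacts, cpes, cpe, source) and the sample source values are assumptions about the syft-json schema and may differ between Syft versions, so check them against your own output. The function name is hypothetical.

```python
import json


def cpes_by_source(sbom):
    """Return (artifact name, cpe, source) tuples from a parsed syft-json document.

    Assumes each artifact has a "cpes" list whose entries are either plain
    strings (no source recorded) or objects with "cpe" and "source" keys.
    """
    rows = []
    for artifact in sbom.get("artifacts", []):
        for entry in artifact.get("cpes", []):
            if isinstance(entry, str):
                cpe, source = entry, "unknown"
            else:
                cpe, source = entry.get("cpe"), entry.get("source", "unknown")
            rows.append((artifact.get("name"), cpe, source))
    return rows


if __name__ == "__main__":
    # Inline sample with made-up values; in practice, load a file produced by
    # `syft <image> -o syft-json` with json.load().
    sample = json.loads("""
    {"artifacts": [
        {"name": "example-lib",
         "purl": "pkg:pypi/example-lib@1.0.0",
         "cpes": [{"cpe": "cpe:2.3:a:example-lib:example-lib:1.0.0:*:*:*:*:*:*:*",
                   "source": "syft-generated"}]}
    ]}
    """)
    for name, cpe, source in cpes_by_source(sample):
        print(f"{name}\t{source}\t{cpe}")
```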

This sounds like a good topic for our community meeting if you are interested in discussing it with the team live -- feel free to join the next one if you like! https://github.com/anchore/syft/?tab=readme-ov-file#join-our-community-meetings

We'll move this issue into backlog but we definitely need to do some more design work before we can implement any solutions. Thanks again!