CycloneDX / specification

OWASP CycloneDX is a full-stack Bill of Materials (BOM) standard that provides advanced supply chain capabilities for cyber risk reduction. SBOM, SaaSBOM, HBOM, AI/ML-BOM, CBOM, OBOM, MBOM, VDR, and VEX
https://cyclonedx.org/
Apache License 2.0

[Discussion] Findings on Discrepancy Assessments within the SBOM Ecosystem. #433

Closed dw763j closed 2 months ago

dw763j commented 2 months ago

Assessment results on discrepancies in the SBOM ecosystem, with some suggestions

Background

As SBOMs can be widely used in software supply chain management, the capabilities and issues of the SBOM ecosystem can influence how users adopt them, so an accurate assessment of the current state of SBOMs is important. To this end, we have conducted a series of assessments of key characteristics in SBOM applications to reveal potential discrepancies that hinder usage.

Questions

We asked 3 questions:

  1. Compliance: Do SBOM tools generate outputs that adhere to user requirements and standards?
  2. Consistency: Do SBOM tools maintain consistency when transforming the SBOMs they produce?
  3. Accuracy: How accurately do the SBOMs produced by tools reflect the target software?

We assessed these questions on 9970 SBOM documents generated by 6 SBOM tools (sbom-tool, ort, syft, gh-sbom, cdxgen, and scancode) in both SPDX and CycloneDX formats from 1162 GitHub repositories.

Results

This table shows average results across all 6 tools; all results are at the package level. Note that the results for information about the software itself are quite poor: for instance, 89.59% of repositories contain licenses, while only a minority of them are identified.

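| Attr. | pkg_name | version | author | purl | license | copyright |
| --- | --- | --- | --- | --- | --- | --- |
| Compliance | 79.61% | 74.99% | 17.84% | 67.53% | 32.34% | 14.17% |
| Consistency | 18.44% | 22.24% | 0.11% | 24.99% | 2.12% | - |
| Accuracy | 25.81% | 10.66% | 4.94% | - | 10.66% | - |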

The findings indicate that while SBOM tools fully (100%) support the mandatory requirements of the standards, their support for user use cases is at 49.37%, and consistency within these supported use cases averages 17.63%. The accuracy assessments reveal significant discrepancies, with accuracy rates of 8.62%, 25.81%, and 12.3% across three defined layers, underscoring substantial room for improvement within the SBOM ecosystem.

Suggestions

  1. In component sections, some tools record the package name together with its information source (pip, maven, npm, etc.), while others do not. Tools also vary in how they record versions, for example whether they add a 'V' before the version string. This leads to problems when using SBOMs from different SBOM tools. We suggest requiring tools to declare the pattern they use to record information wherever the standard gives no explicit specification.
  2. The meaning of NOASSERTION, NONE, and None can be confusing in specific data fields. For instance, a version can legitimately be empty when the developers did not record it in the software; tools represent such empty values as an empty string or as one of the three forms, which can lead to inconsistency in later exchange. We suggest providing specific markers for these naturally empty data fields.
  3. For hashes, we found that in SPDX, different tools using the same hash algorithm on the same single file produce different checksums; we found no checksums that were consistent across all the software and packages. In CycloneDX, the hashes do not even specify the object the hash was computed on. We suggest requiring tools to explicitly document how they create checksums, e.g. any salt value or other preprocessing.
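To make the three suggestions concrete, here is a minimal Python sketch of the kind of normalization a consumer currently has to do before comparing SBOMs from different tools. Everything in it is an assumption for illustration: the `pip:requests` style prefix, the helper names, and the set of empty markers are hypothetical and not taken from any tool or from either standard. The last helper only shows that a byte-level preprocessing step (here, line-ending conversion) changes a digest; it does not claim this is the cause of the mismatches we observed.

```python
import hashlib
from typing import Optional

# Hypothetical ecosystem prefixes; real tools may encode the source differently.
ECOSYSTEM_PREFIXES = ("pip", "maven", "npm")
# Placeholder strings that this sketch treats as "no value recorded".
EMPTY_MARKERS = {"", "NOASSERTION", "NONE", "None"}


def normalize_empty(value: Optional[str]) -> Optional[str]:
    """Map every 'empty' spelling to None; otherwise return the stripped value."""
    if value is None:
        return None
    stripped = value.strip()
    return None if stripped in EMPTY_MARKERS else stripped


def normalize_name(name: Optional[str]) -> Optional[str]:
    """Drop an assumed 'ecosystem:' prefix such as 'pip:requests'."""
    cleaned = normalize_empty(name)
    if cleaned is None:
        return None
    prefix, sep, rest = cleaned.partition(":")
    if sep and rest and prefix.lower() in ECOSYSTEM_PREFIXES:
        return rest
    return cleaned


def normalize_version(version: Optional[str]) -> Optional[str]:
    """Drop a leading 'v' or 'V' when it is directly followed by a digit."""
    cleaned = normalize_empty(version)
    if cleaned is None:
        return None
    if len(cleaned) > 1 and cleaned[0] in "vV" and cleaned[1].isdigit():
        return cleaned[1:]
    return cleaned


def sha256_hex(data: bytes) -> str:
    """SHA-256 over the exact bytes given, with no salt or preprocessing."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    print(normalize_name("pip:requests"), normalize_name("requests"))
    print(normalize_version("v1.2.3"), normalize_version("1.2.3"))
    print(normalize_version("NOASSERTION"), normalize_version(""))
    # Same text, different line endings: the two digests differ.
    print(sha256_hex(b"hello\n") == sha256_hex(b"hello\r\n"))
```

If tools declared their recording pattern and their checksum input explicitly, consumers would not need to guess at heuristics like these.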

We hope our findings can help improve the SBOM ecosystem. Any questions or discussion are welcome.