Open mschusterbsi opened 3 weeks ago
The proposed description has some issues
Every BOM creator SHOULD use a unique serial number when describing a specific component, which MUST stay the same if the BOM is re-generated or the contents of this component have changed. If specified, the serial number MUST conform to RFC-4122. Use of serial numbers is RECOMMENDED.
Specifically with:
which MUST stay the same if the BOM is re-generated or the contents of this component have changed
This would require the BOM creator to maintain a database of all the components (first-party and third-party) and ensure they reuse the same serialNumber. This requirement could not be fulfilled by the majority of existing BOM generators, especially those integrated into CI/CD pipelines.
Is the goal of this change to make the serialNumber deterministic?
We understand that there might be technical hurdles to reusing a unique identifier. However, we are convinced that these can be overcome in most cases.
Our aim is to make sure that an SBOM creator uses the same serial number (or some other unique identifier for a specific SBOM) for the same primary component. The version field would be incremented for each newly created version. This allows the consumer to correlate different versions of an SBOM from the same SBOM creator and detect changes between them.
At the least, we could imagine the following wording as a weaker alternative to our original suggestion (though we are not really happy with it):
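To illustrate the consumer side of this scheme, here is a minimal Python sketch. It assumes the BOMs are already parsed CycloneDX JSON documents (plain dictionaries) carrying serialNumber and version fields; the function name group_bom_versions and the sample data are hypothetical and only serve the illustration.

```python
from collections import defaultdict


def group_bom_versions(boms):
    """Group BOM documents by serialNumber and sort each group by version."""
    groups = defaultdict(list)
    for bom in boms:
        serial = bom.get("serialNumber")
        if serial is None:
            # serialNumber is only RECOMMENDED, so it may be absent;
            # such BOMs cannot be correlated this way and are skipped.
            continue
        groups[serial].append(bom)
    # Assumption: a missing version is treated as 1.
    return {
        serial: sorted(docs, key=lambda doc: doc.get("version", 1))
        for serial, docs in groups.items()
    }


# Hypothetical sample data: two versions of one BOM and one unrelated BOM.
boms = [
    {"serialNumber": "urn:uuid:3e671687-395b-41f5-a30f-a58921a69b79", "version": 2},
    {"serialNumber": "urn:uuid:3e671687-395b-41f5-a30f-a58921a69b79", "version": 1},
    {"serialNumber": "urn:uuid:11111111-2222-4333-8444-555555555555", "version": 1},
]
history = group_bom_versions(boms)
```

With a stable serial number, each entry in history is the ordered list of BOM versions for one primary component, which a consumer could then diff pairwise.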
Every BOM creator SHOULD use a unique serial number when describing a specific component, which SHOULD stay the same if the BOM is re-generated or the contents of this component have changed. If specified, the serial number MUST conform to RFC-4122. Use of serial numbers is RECOMMENDED.
When reading this discussion thread, I think I perceive a few subtle misunderstandings:
Is the goal of this change to make the serialNumber deterministic?
I do not comprehend "deterministic" in this context; "uniquely identifying a specific BOM which describes a certain primary component (WRT this BOM)" and "is likely generated by the same BOM creator" appear to be the goal here. This is simply what UUIDs (Universally Unique IDentifiers) were invented for: to be able to re-identify the same object, even across non-substantial changes, as version is used to denote these changes.
This would require the BOM creator to maintain a database of all the components (first-party and third-party) and ensure they reuse the same serialNumber.
This statement made me consider the wording of the proposal as ambiguous (as denoted in the first bullet point of this message), because the point is primarily about BOMs, which always have a primary component that the BOM principally describes. For example, imagine a BOM for the RPM package postgres (the DBMS) packaged by RedHat: Defining serialNumber as proposed here would allow unambiguously identifying "postgres by RedHat". Would this require RedHat to maintain a database of all the components postgres uses (which they surely have)? IMO not, as all this information can be accessed at build time. This scheme only requires maintaining a database of the UUIDs used for each primary component a BOM creator generates BOMs for.
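A minimal sketch of such a "database of UUIDs per primary component" could be as small as a JSON file, assuming each primary component can be named by some stable key. The function name serial_number_for, the file layout, and the purl-like keys in the usage note are hypothetical; the urn:uuid: prefix follows the form used in CycloneDX examples.

```python
import json
import uuid
from pathlib import Path


def serial_number_for(primary_component, registry_path):
    """Return the stored serialNumber for a primary component, minting one if needed."""
    path = Path(registry_path)
    # The registry maps a stable key for the primary component to its serialNumber.
    registry = json.loads(path.read_text()) if path.exists() else {}
    if primary_component not in registry:
        # First BOM for this primary component: mint a random (version 4) UUID
        # and persist it so later BOM generations can reuse it.
        registry[primary_component] = f"urn:uuid:{uuid.uuid4()}"
        path.write_text(json.dumps(registry, indent=2))
    return registry[primary_component]
```

Calling serial_number_for("pkg:rpm/redhat/postgres", "registry.json") twice returns the same value, while a different key gets its own UUID. Nothing about the components that postgres itself uses has to be stored.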
This requirement could not be fulfilled by the majority of existing BOM generators, especially those integrated into CI/CD pipelines.
I do not dare to comment on "the majority of existing BOM generators", because I surely do not know them all, but I see no reason why this scheme would not work with "BOM generators […] integrated into CI/CD pipelines": A CI build recipe is fed with information (usually a git repo and a git tag to check out, plus some ancillary information, which could comprise a UUID / serialNumber to reuse, or the UUID / serialNumber is simply stored in the git repo of the primary component) and principally outputs the build artifact(s), but may also output additional information, which could comprise the UUID / serialNumber it used, e.g. as part of a generated SBOM.
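The CI flow described above can be sketched as follows, assuming the pipeline passes in the previously used serialNumber and BOM version as ancillary inputs. Where these values come from (a pipeline variable, a file in the git repo) is left open; the function name next_bom_identity and its parameters are hypothetical.

```python
import uuid


def next_bom_identity(stored_serial=None, previous_version=0):
    """Compute the serialNumber and version for the BOM a CI run is about to emit."""
    if stored_serial is None:
        # No identifier was handed to the build: this is the first BOM for the
        # primary component, so mint a new random UUID.
        serial = f"urn:uuid:{uuid.uuid4()}"
    else:
        # Reuse the identifier supplied by the pipeline or the repository.
        serial = stored_serial
    # The version counts the BOM revisions under this serialNumber.
    return {"serialNumber": serial, "version": previous_version + 1}


first = next_bom_identity()
second = next_bom_identity(first["serialNumber"], first["version"])
```

The build would write the returned values into the generated SBOM and hand them back to whatever stores them, so the next run can pass them in again.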
Current Behavior
serialNumber is defined as a UUID and RECOMMENDED:
version is defined as an integer > 0:
Proposed Behavior
In our opinion, UUIDs and hence the CycloneDX serialNumber must be static ("unequivocal in time and space" = "temporally and spatially unique") as long as an SBOM creator records the same software component, even if these software components are altered, e.g. new versions, files or sub-components are added or removed, etc.
Hence, we propose as the definition of serialNumber: Every BOM creator SHOULD use a unique serial number when describing a specific component, which MUST stay the same if the BOM is re-generated or the contents of this component have changed. If specified, the serial number MUST conform to RFC-4122. Use of serial numbers is RECOMMENDED.
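For the "MUST conform to RFC-4122" part, a rough conformance check can be sketched with Python's standard uuid module. Requiring the urn:uuid: prefix is an assumption based on the form used in CycloneDX examples, and the function name is_rfc4122_serial is hypothetical. Note that uuid.UUID is lenient about formatting (it also accepts braces or missing hyphens), so this is a sanity check rather than a strict validator.

```python
import uuid


def is_rfc4122_serial(serial):
    """Return True if serial looks like 'urn:uuid:<UUID>' with the RFC 4122 variant."""
    prefix = "urn:uuid:"
    if not isinstance(serial, str) or not serial.startswith(prefix):
        return False
    try:
        parsed = uuid.UUID(serial[len(prefix):])
    except ValueError:
        # Not parseable as a UUID at all.
        return False
    # uuid.RFC_4122 marks the variant bits defined by RFC 4122.
    return parsed.variant == uuid.RFC_4122
```

A BOM generator or consumer could run such a check before relying on the serial number for correlation.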