CycloneDX / specification

OWASP CycloneDX is a full-stack Bill of Materials (BOM) standard that provides advanced supply chain capabilities for cyber risk reduction. SBOM, SaaSBOM, HBOM, AI/ML-BOM, CBOM, OBOM, MBOM, VDR, and VEX
https://cyclonedx.org/
Apache License 2.0

Add way to indicate that a component is missing the cryptographic hash #262

Open chmeliik opened 1 year ago

chmeliik commented 1 year ago

Motivation

As a CycloneDX consumer, I would like the ability to validate whether all the components declared their expected cryptographic hash. In SLSA v0.1, for example, hashes are recommended for hermetic builds (all dependencies must be declared with immutable references).

CycloneDX components have the hashes attribute. However, an empty hashes array does not necessarily mean that the component is missing a hash - perhaps the component cannot have, and does not need, a hash.

For example:

A Go project can be split into multiple modules which can depend on each other. Such a dependency is expressed via local replacements: replace my.org/my-project/api => ./api. A locally replaced module will not have a hash in go.sum, nor should it need one (it is version-controlled along with the module which depends on it).

Many package managers allow the user to depend directly on a git repository. Such dependencies also do not have cryptographic hashes, they instead rely on the git commit hash for integrity.
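To make the Go case above concrete, here is a minimal sketch of how a BOM producer might spot locally replaced modules in a go.mod file. The function name `local_replacements` and the simplified parsing are assumptions for illustration, not part of any CycloneDX or Go tooling; it treats a replacement as local when the target has no version and starts with `./`, `../`, or `/`.

```python
import re

# One replace rule: "old [version] => new [version]"
_RULE = re.compile(r"^(\S+)(?:\s+\S+)?\s+=>\s+(\S+)(?:\s+(\S+))?$")


def local_replacements(go_mod_text):
    """Return {module_path: local_dir} for replace rules that target a filesystem path."""
    result = {}
    in_block = False
    for raw in go_mod_text.splitlines():
        line = raw.split("//", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        if in_block:
            if line == ")":
                in_block = False
                continue
            rule = line
        elif re.fullmatch(r"replace\s*\(", line):
            in_block = True
            continue
        elif line.startswith("replace "):
            rule = line[len("replace "):].strip()
        else:
            continue
        m = _RULE.match(rule)
        # A local replacement has no version on the right-hand side.
        if m and m.group(3) is None and m.group(2).startswith(("./", "../", "/")):
            result[m.group(1)] = m.group(2)
    return result
```

A producer could use the returned mapping to decide which components legitimately have no go.sum entry.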

Proposal

Add a Component attribute which would let CycloneDX producers indicate that the hash is, in fact, missing. Or, to look at it the other way, indicate that empty hashes are OK.

For example:

{
  "hashes": [],
  "missingHash": false
}

missingHash would be true if the component can declare the expected hash but doesn't. For example, someone forgot to update go.sum after adding a dependency.
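As a rough sketch of the consumer-side check this would enable: the snippet below assumes the proposed `missingHash` attribute exists (it is not part of CycloneDX today) and that an absent attribute is treated as `false`. The helper name `undeclared_hashes` is made up, and only top-level components are inspected.

```python
def undeclared_hashes(bom):
    """Return names of components that have no hashes and are flagged as missing one.

    Relies on the hypothetical `missingHash` attribute from this proposal.
    """
    flagged = []
    for comp in bom.get("components", []):
        # Empty or absent hashes are only a problem when the producer says so.
        if not comp.get("hashes") and comp.get("missingHash", False):
            flagged.append(comp.get("name"))
    return flagged
```

A policy check could then fail the build whenever the returned list is non-empty.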

stevespringett commented 1 year ago

The only components that I'm aware of that cannot have a hash would be hardware devices and the like. Remote components on version control can absolutely contain a hash.

> Many package managers allow the user to depend directly on a git repository. Such dependencies also do not have cryptographic hashes, they instead rely on the git commit hash for integrity.

In this case, you would end up with nested components, specifically a component (remote) that represents the repo/commit to fetch, and child components that represent all the individual files that were fetched from the repo. IMO, the SHA1 hash for the top-level component would be the commit hash.

BTW, the CycloneDX method of describing completeness is via compositions. Should we want to proceed with the proposal, extending compositions to handle fields within a component would be the way this could be achieved.

chmeliik commented 1 year ago

Given, for example, a git dependency in requirements.txt such as this one

osbs-client @ git+https://github.com/containerbuildsystem/osbs-client@2bd03f4e0e5edc474b6236c5c128620d988f79a3

Would this be an incorrect way to report it in CycloneDX?

{
  "type": "library",
  "name": "osbs-client",
  "purl": "pkg:pypi/osbs-client?vcs_url=git+https://github.com/containerbuildsystem/osbs-client@2bd03f4e0e5edc474b6236c5c128620d988f79a3"
}

If the above is acceptable, I suppose that such components should be reported as having a SHA-1 hash?

  "hashes": [
    {"alg": "SHA-1", "content": "2bd03f4e0e5edc474b6236c5c128620d988f79a3"}
  ]
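For illustration, a producer could derive such a component from the requirement line roughly as follows. This is only a sketch: the function name `component_from_git_requirement` is made up, the purl simply mirrors the unencoded form used above, and only a full 40-character commit ID is accepted, since a branch or tag name is not an immutable reference.

```python
import re

# "<name> @ git+<url>@<40-hex commit>"
_GIT_REQ = re.compile(
    r"^(?P<name>[A-Za-z0-9._-]+)\s*@\s*(?P<vcs>git\+\S+?)@(?P<rev>[0-9a-f]{40})$"
)


def component_from_git_requirement(line):
    """Build a component dict from a pinned git requirement, or return None."""
    m = _GIT_REQ.match(line.strip())
    if m is None:
        return None  # not a git requirement, or not pinned to a full commit ID
    name, vcs, rev = m.group("name"), m.group("vcs"), m.group("rev")
    return {
        "type": "library",
        "name": name,
        "purl": f"pkg:pypi/{name}?vcs_url={vcs}@{rev}",
        # Treating the git commit ID as a SHA-1 hash, as discussed in this thread.
        "hashes": [{"alg": "SHA-1", "content": rev}],
    }
```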

As another example, let's take go.etcd.io/etcd/v3, which depends on go.etcd.io/etcd/api/v3

replace (
    go.etcd.io/etcd/api/v3 => ./api
    // ...
)

Let's say I'm processing the source repository (https://github.com/etcd-io/etcd/tree/v3.5.9, at the v3.5.9 tag for simplicity) and generating CycloneDX for it. When reporting the local go.etcd.io/etcd/api/v3 component, a reasonable solution could be to take the commit hash I'm processing and report it as a SHA-1 hash?

{
  "name": "go.etcd.io/etcd/api/v3",
  "version": "v3.5.9",
  "purl": "pkg:golang/go.etcd.io/etcd/api/v3@v3.5.9",
  "hashes": [
    {"alg": "SHA-1", "content": "bdbbde998b7ed434b23676530d10dbd601c4a7c0"}
  ]
}

chmeliik commented 1 year ago

@stevespringett what did you have in mind with the compositions?

Would you be able to provide an example of what that would look like?

mmarseu commented 8 hours ago

> BTW, the CycloneDX method of describing completeness is via compositions. Should we want to proceed with the proposal, extending compositions to handle fields within a component would be the way this could be achieved.

@stevespringett Isn't there another solution? For dependencies, the spec declares that an empty list signifies the absence of dependencies. It goes on to recommend the additional use of compositions, I guess to be able to communicate the slightly different scenarios "there are no dependencies that we know about" and "there are no dependencies, period."

Such a change would be backwards-incompatible, but it could be something to keep in mind for 2.0, no? Or in more generalized terms: the meaning of empty lists could maybe be more consistent across CycloneDX.
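To illustrate the two dependency scenarios mentioned above, here is a minimal sketch of how a consumer might interpret an empty `dependsOn` list together with compositions. The helper name and the returned strings are made up; the sketch assumes a composition lists the relevant bom-ref under its `dependencies` array and reads only an `aggregate` of `complete` as "no dependencies, period".

```python
def describe_dependencies(bom, ref):
    """Classify the dependency information a BOM gives for one bom-ref."""
    for dep in bom.get("dependencies", []):
        if dep.get("ref") == ref:
            if dep.get("dependsOn"):
                return "has dependencies"
            break  # found, but the list is empty or absent
    else:
        return "not in dependency graph"

    for comp in bom.get("compositions", []):
        if ref in comp.get("dependencies", []):
            if comp.get("aggregate") == "complete":
                return "no dependencies"
            break
    # Without a "complete" composition, the empty list is only a claim about what is known.
    return "no known dependencies"
```

The same pattern would apply to hashes if empty lists were given a consistent meaning across the spec.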