Open jamietanna opened 1 year ago
did you experience the same issue when generating the SBOM via official tooling https://github.com/CycloneDX/cyclonedx-node-npm ?
@bdehamer see my earlier remarks on why complete deduplication in node_modules is intrinsically impossible: https://github.com/npm/rfcs/pull/714#issuecomment-1672927160
@jamietanna I'm digging into this issue and considering a couple different solutions. I'd be curious to hear which of these best meets the need of your SBOM use cases . . .
In certain circumstances, it is not possible for npm to completely deduplicate packages in the node_modules tree. A basic example would be something like this:
demo-package@0.0.1
├─┬ foo@0.0.1
│ └── tslib@1.14.1
├─┬ bar@0.0.1
│ └── tslib@1.14.1
└── tslib@2.6.2
My demo-package project has dependencies on foo, bar and tslib (version 2.6.2). Since foo and bar each have a dependency on an older version of tslib (version 1.14.1) that is in conflict with the version needed by the root project, tslib@1.14.1 cannot be hoisted to the top of node_modules and ends up being duplicated under both foo and bar.
Since version 1.14.1 of tslib literally appears on disk at two different locations in the tree, the somewhat naive SBOM generator ends up adding two identical entries to the CycloneDX components list.
This is why the resulting SBOM fails validation -- we end up with multiple entries which have identical bom-ref values.
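To make the failure mode concrete, here is a minimal sketch that scans a components list for repeated bom-ref values. The `Component` shape and the `findDuplicateBomRefs` helper are hypothetical names introduced for illustration, and the sample data simply mirrors what a naive walk of the example tree above would emit; none of it is taken from npm's actual implementation.

```typescript
// Hypothetical minimal shape of a CycloneDX component; real entries carry more fields.
interface Component {
  "bom-ref": string;
  name: string;
  version: string;
}

// Return every bom-ref value that occurs more than once in the components list.
function findDuplicateBomRefs(components: Component[]): string[] {
  const counts = new Map<string, number>();
  for (const c of components) {
    const ref = c["bom-ref"];
    counts.set(ref, (counts.get(ref) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, n]) => n > 1)
    .map(([ref]) => ref);
}

// Assumed output of a naive walk over the example node_modules tree:
// tslib@1.14.1 is visited once under foo and once under bar.
const naiveComponents: Component[] = [
  { "bom-ref": "foo@0.0.1", name: "foo", version: "0.0.1" },
  { "bom-ref": "tslib@1.14.1", name: "tslib", version: "1.14.1" },
  { "bom-ref": "bar@0.0.1", name: "bar", version: "0.0.1" },
  { "bom-ref": "tslib@1.14.1", name: "tslib", version: "1.14.1" },
  { "bom-ref": "tslib@2.6.2", name: "tslib", version: "2.6.2" },
];

console.log(findDuplicateBomRefs(naiveComponents)); // [ 'tslib@1.14.1' ]
```

A check like this could run before an SBOM is handed to a validator, although a schema validator remains the authoritative test.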
One way to address this would be to treat each package that appears in the tree as a distinct dependency -- even if it is technically identical to some other dependency already present in the tree.
Given the example above, this solution would result in tslib@1.14.1 being listed twice in the SBOM, albeit with a distinct bom-ref value each time. We might choose to do something like prefix the bom-ref with the parent package name, resulting in entries that look something like:
[
  {
    "bom-ref": "foo@0.0.1-tslib@1.14.1",
    "type": "library",
    "name": "tslib",
    "version": "1.14.1"
  },
  {
    "bom-ref": "bar@0.0.1-tslib@1.14.1",
    "type": "library",
    "name": "tslib",
    "version": "1.14.1"
  }
]
I believe that this is similar to how cyclonedx-node-npm solves this problem.
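A rough sketch of this first option follows. It walks a tree shaped like the node_modules layout and prefixes each nested package's bom-ref with its immediate parent. `TreeNode`, `Component` and `collectWithParentPrefix` are hypothetical names for illustration only; this is not npm's or cyclonedx-node-npm's actual code.

```typescript
// Hypothetical model of a package as it sits on disk, with its nested node_modules.
interface TreeNode {
  name: string;
  version: string;
  children: TreeNode[];
}

interface Component {
  "bom-ref": string;
  type: "library";
  name: string;
  version: string;
}

// Emit one component per on-disk package. Packages nested under a non-root
// parent get a bom-ref of the form "<parent>-<name>@<version>".
function collectWithParentPrefix(
  node: TreeNode,
  parentId: string | null,
  out: Component[],
): void {
  for (const child of node.children) {
    const id = `${child.name}@${child.version}`;
    const ref = parentId === null ? id : `${parentId}-${id}`;
    out.push({ "bom-ref": ref, type: "library", name: child.name, version: child.version });
    collectWithParentPrefix(child, id, out);
  }
}

const tslibOld = (): TreeNode => ({ name: "tslib", version: "1.14.1", children: [] });
const root: TreeNode = {
  name: "demo-package",
  version: "0.0.1",
  children: [
    { name: "foo", version: "0.0.1", children: [tslibOld()] },
    { name: "bar", version: "0.0.1", children: [tslibOld()] },
    { name: "tslib", version: "2.6.2", children: [] },
  ],
};

const components: Component[] = [];
collectWithParentPrefix(root, null, components); // root's direct children get no prefix
console.log(components.map((c) => c["bom-ref"]));
```

Note that prefixing only the immediate parent is enough for this example, but it could still collide in deeper trees where the same parent version itself appears at more than one location; a full path prefix would be one way around that.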
The other approach would be to deduplicate the packages before adding them to the SBOM. Instead of literally mirroring the layout of packages in the node_modules directory, this solution would detect the multiple instances of tslib@1.14.1 and fold them into a single entry in the SBOM components list:
[
  {
    "bom-ref": "tslib@1.14.1",
    "type": "library",
    "name": "tslib",
    "version": "1.14.1"
  }
]
In this case, we're not trying to represent the layout of the node_modules directory, but instead just enumerating the distinct dependencies that comprise the project. This is how both cdxgen and the snyk SBOM command handle the issue of duplicate packages.
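The folding step of this second option might look roughly like the sketch below. It assumes that name plus version is a sufficient identity for a package, which is a simplification: a real implementation may also want to compare something like the integrity hash or resolved URL. `Component` and `dedupeComponents` are hypothetical names, not part of any existing tool.

```typescript
interface Component {
  "bom-ref": string;
  type: "library";
  name: string;
  version: string;
}

// Keep the first occurrence of each name@version pair and use that pair as the bom-ref.
// Assumption: two packages with the same name and version are technically identical.
function dedupeComponents(components: Component[]): Component[] {
  const seen = new Map<string, Component>();
  for (const c of components) {
    const key = `${c.name}@${c.version}`;
    if (!seen.has(key)) {
      seen.set(key, { ...c, "bom-ref": key });
    }
  }
  return Array.from(seen.values());
}

// One entry per on-disk location, as a layout-mirroring walk would produce.
const perLocation: Component[] = [
  { "bom-ref": "foo@0.0.1", type: "library", name: "foo", version: "0.0.1" },
  { "bom-ref": "foo@0.0.1-tslib@1.14.1", type: "library", name: "tslib", version: "1.14.1" },
  { "bom-ref": "bar@0.0.1", type: "library", name: "bar", version: "0.0.1" },
  { "bom-ref": "bar@0.0.1-tslib@1.14.1", type: "library", name: "tslib", version: "1.14.1" },
  { "bom-ref": "tslib@2.6.2", type: "library", name: "tslib", version: "2.6.2" },
];

const folded = dedupeComponents(perLocation);
console.log(folded.map((c) => c["bom-ref"]));
```

With the example tree this leaves four components, with tslib@1.14.1 listed once and tslib@2.6.2 kept as a separate entry because its version differs.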
I think there are cases to be made for either of these solutions, but I'd like to know which of these best matches the output you'd expect to see in a valid SBOM?
It's been a while, but I'd strongly vote for option 2. If a technically identical dependency is included multiple times, it should appear only ONCE as a component in the SBOM. It can still be referenced multiple times in the dependencies section of the SBOM, though. Compare with what Maven does.
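As an illustration of that point, the sketch below builds a CycloneDX-style dependencies list (entries with `ref` and `dependsOn`) in which every package gets exactly one entry, while the shared tslib@1.14.1 is referenced from both foo and bar. `TreeNode` and `buildDependencies` are hypothetical, and the tree here is assumed to describe logical dependency edges rather than the on-disk layout.

```typescript
// Hypothetical logical dependency tree: children are the packages a node depends on.
interface TreeNode {
  name: string;
  version: string;
  children: TreeNode[];
}

interface DependencyEntry {
  ref: string;
  dependsOn: string[];
}

// Produce one entry per distinct name@version, merging edges from every occurrence.
function buildDependencies(root: TreeNode): DependencyEntry[] {
  const edges = new Map<string, Set<string>>();
  const visit = (node: TreeNode): void => {
    const ref = `${node.name}@${node.version}`;
    let deps = edges.get(ref);
    if (deps === undefined) {
      deps = new Set<string>();
      edges.set(ref, deps);
    }
    for (const child of node.children) {
      deps.add(`${child.name}@${child.version}`);
      visit(child);
    }
  };
  visit(root);
  return Array.from(edges.entries()).map(([ref, deps]) => ({
    ref,
    dependsOn: Array.from(deps).sort(),
  }));
}

const demoTree: TreeNode = {
  name: "demo-package",
  version: "0.0.1",
  children: [
    { name: "foo", version: "0.0.1", children: [{ name: "tslib", version: "1.14.1", children: [] }] },
    { name: "bar", version: "0.0.1", children: [{ name: "tslib", version: "1.14.1", children: [] }] },
    { name: "tslib", version: "2.6.2", children: [] },
  ],
};

const dependencyEntries = buildDependencies(demoTree);
console.log(JSON.stringify(dependencyEntries, null, 2));
```

Because entries are keyed by ref, the resulting array cannot contain two items for the same package, which is the uniqueness property the schema requires of the dependencies array.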
We use OWASP Dependency-Track and just updated it to v4.11.1. It now validates uploaded SBOMs and rejects those generated by npm:
[2024-09-16T04:37:17.479Z] [DependencyTrack] {"status":400,"title":"The uploaded BOM is invalid","detail":"Schema validation failed","errors":["$.components[320].externalReferences[2].url: does not match the iri-reference pattern must be a valid RFC 3987 IRI-reference","$.components[320].externalReferences[2].url: does not match the iri-reference pattern must be a valid RFC 3987 IRI-reference","$.components[320].externalReferences[2].url: does not match the regex pattern ^urn:cdx:[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/[1-9][0-9]*$","$.components[320].externalReferences[2].url: does not match the iri-reference pattern must be a valid RFC 3987 IRI-reference","$.components[320].externalReferences[2].url: does not match the regex pattern ^urn:cdx:[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/[1-9][0-9]*#.+$","$.dependencies: the items in the array must be unique"]}
It sounds like the problem is duplicate entries.
The official tooling does not exhibit this problem.
You may want to try CycloneDX's official SBOM generator for NPM - it is properly maintained and does not have those issues.
Is there an existing issue for this?
This issue exists in the latest npm version
Current Behavior
The generated CycloneDX SBOM may fail to parse in downstream tools because it contains duplicate dependencies.
Expected Behavior
A CycloneDX v1.5 SBOM generated from a repository can be parsed correctly.
Steps To Reproduce
npm sbom --sbom-format cyclonedx > cyclonedx.json
go run github.com/CycloneDX/sbom-utility@latest validate --input-file cyclonedx.json
renovate-graph.cyclonedx.json
Environment
//registry.npmjs.org/:_authToken = (protected)
; node bin location = /usr/bin/node
; node version = v18.17.1
; npm local prefix = /home/jamie/workspaces/renovate-graph
; npm version = 10.2.3
; cwd = /home/jamie/workspaces/renovate-graph
; HOME = /home/jamie
; Run npm config ls -l to show all defaults.