Closed DarthHater closed 2 years ago
I suppose another alternative is to provide a flag that just ignores licensing data (which is hyper specific to our case), but I don't personally like that.
The node-optimist library contains an invalid license in package.json. It is a requirement that the license defined in package.json be a valid SPDX license ID or a valid SPDX expression. This library is so old, that this requirement likely wasn't enforced when this component was last published.
https://spdx.org/licenses/MIT.html https://spdx.org/licenses/X11.html
It appears this library is dual licensed which should be declared in the package.json, but it's not.
Unresolved licenses like this will appear as a license 'name' in CycloneDX rather than a license ID. License ID's only resolve to valid SPDX license IDs and are not required by the spec. If Nexus IQ requires a license id, then it is violating the spec. This is an issue you'll need corrected by Sonatype.
It's actually not dual licensed when you read about MIT/X11
, it's just improperly declared. X11
is MIT with an extra sentence.
Thanks for clarifying. MIT/X11 is still an invalid SPDX license ID, therefore the license name is used instead of the id.
Took a look at the spec and you are right (didn't expect you to be wrong, by the way).
I'm trying to think of a middle ground, personally.
if (spdxLicenses.some(v => { return l === v; })) {
licenseContent.id = l;
} else {
licenseContent.id = 'UNKNOWN';
licenseContent.name = l;
}
Something like that seems to accomplish it, and indicate to someone they are using an Unknown license, which is I think accurate in this case.
Optimist is FAR from the only package where I saw this coming up, and being a lot of these projects are dead or predate some sort of standard, trying to finagle a middle ground.
The side benefit for someone using this to generate a SBOM is that they'd then know that whatever they are using has an old license and should probably be looked at with a bit more scrutiny?
Whatever value is defined in the license id field will be validated against https://cyclonedx.org/schema/spdx.xsd which is derived from https://github.com/spdx/license-list-data.
'UNKNOWN' is not a valid SPDX license ID. If you create a BOM with this value, it will fail to validate.
License strings which do not resolve to valid SPDX license id's, will always use the license name field. That's what the spec defines and that's what all official CycloneDX implementations produce.
So given license data is optional via the cyclonedx spec, can we provide a flag to just say "don't provide this data"? That seems like a good middle ground, and on spec.
Roughly half of all ecosystems do not enforce SPDX license ID's. Java (Maven) and .NET (Nuget) are prime examples of this. This is a very common scenario and certainly not unique to this library. The overwhelming majority of libraries in Maven Central and NuGet are not using SPDX license IDs.
SBOMs are a statement of fact. Removing facts goes against the primary objectives of the spec. The only exception that's been made is license text, which can greatly influence the size of the resulting BOM.
You're free to fork the project and create your own version which excludes facts. However, I still recommend solving the problem in Nexus IQ itself.
I work for Sonatype, if that wasn't clear, and I'm here because Nexus IQ requires an ID and that's a hard requirement. Forking a library is a pretty poor excuse for not working with the author to make it more flexible, and I don't see a reason why it shouldn't be, given the spec itself marks this entire section as not required. If someone wants to throw it away, they'd be making an intentional choice to do so.
Asking an independent open source author to make changes to every CycloneDX implementation to exclude details so that a commercial vendor can more easily do their job, is not a proper ask.
Dependency-Track attempts to resolve license names and does a decent job of it for certain licenses. If the license name cannot be resolved, then the license is discarded (as if it was never defined in the bom).
I am not asking you to make changes, I've already got multiple permutations of this fired up on my side. I'm asking you to accept changes I'm willing to make, because they seem to make sense, looking at the spec itself, and the quality of the data I can expect to get from the npm ecosystem. I've presented an array of potential solutions to the problem I'm encountering, and I feel like I'm thinking as much as I can about the spec, and reality that I'm running in to. I'm unwilling to fork this repo longterm because:
I get what you mean by spirit of the SBOM, and I'm sympathetic, but if something in the spec is marked as optional, giving someone the option to discard it seems fine, as it is done by intent, and doesn't violate the spec.
Also I have no clue what you mean by dropping the license text in your earlier comment, as I'm unaware of that option, and it doesn't appear to be documented. Please share that because it does indeed cause huge bloat (we are seeing 2-3mb SBOMs, and we don't need that data at all).
I'll drop this convo after this, but I feel like adding an option to discard the license data, by intention doesn't violate any part of the spec, and just adds additional flexibility to someone like me, the consumer.
So adding a flag to strip out non-required fields is out of the question? We can write the PR if you would accept it we just don't want to strip it after the fact in our module, but we will as this is currently breaking for us.
Excluding license text is a common ask and it's one that doesn't really take away from the facts, only the supporting evidence. It has been implemented in a few official implementations, but I don't think it's made its way to the node version yet. But it will.
Also keep in mind that there are numerous unofficial implementations as well, which include:
For example, a container security vendor that produces a BOM or ORT which produces a BOM are both capable of generating BOMs with npm modules. This is only a one implementation, but there isn't a guarantee that users will use this version. ORT for example, is extremely popular. I personally use it in some cases. That project comes specifically from a licensing background.
So even if this (or all official) implementation did have an option to exclude license, it doesn't solve the underlying issue. CycloneDX is a specification which just happens to have some implementations. There are many other ways to generate them, and many more currently in development, both publicly and internally at various organizations.
I'm aware of quite a few unofficial implementations, as I've crafted a few already (and been meaning to get in touch with you over if you want me to make them more generic and inherit them). I've got Golang and Conda specific ones at current time, but a bit bastardized just because I had to make things work in non traditional ways. The conda one is here if you wanted to take a gander:
https://github.com/sonatype-nexus-community/jake/tree/master/jake/cyclonedx/v1_1
Yeah best I can tell excluding license text hasn't made it here (I could add that in a heartbeat for you, if you'd like).
I'd really rather work with you to provide flexibility to this library rather than roll my own, and a few other reasons why:
Popping back in to summarize:
As well, I packaged it all up on my own fork of this project:
https://github.com/DarthHater/cyclonedx-node-module/tree/TypeScript
If you want I can send you a PR, and if you want I can strip out the contentious license data removal piece (although I obviously still think it's beneficial). The PR itself I think would be beneficial as it:
read-installed
, allowing someone to potentially use this project to create a bower sbom with minimal changes (an example object that can be passed is in the test)While I was at it I wired up an index.ts
that will create a cli app like the previous one in the bin, I couldn't figure out how that was created, so I've generated a cli app with similar flags, etc... or shown examples where I thought a flag wasn't needed.
Cheers, and thanks for providing this library, in general!
Thanks for the summary and the work you and the rest of the team has done with implementing it in your own ecosystems.
The project is always looking for contributions, and it sounds like you've made numerous improvements, both architecturally and in overall quality.
And although I cannot accept a PR which excludes license ID or name (but can accept exclusion of license text), I am certainly open to reviewing a PR. It would also likely be much easier to keep in sync if necessary.
There has been requests to support Bower (#35) so the architectural changes you made may make that possible in a future release.
reviewed the issue thwth the original MIT/X11
license of the given example.
cyclonedx-node-module currently detects the license and writes it as a non-SPDX-license correctly according to CycloneDX spec 1.3
behavior is as expected. Therefore this issue will be closed.
We are using this library in
auditjs
to generate an SBOM to send to Nexus IQ Server.We've run into an issue with a few libraries where the license is presented as something that isn't in your current list.
An example is:
https://github.com/substack/node-optimist/blob/master/package.json#L35
That license is declared as
MIT/X11
when in reality it should either beMIT X
(not in your list) orX11
When the sbom is created, a license section with a Name is created, just no ID, and this fails validation in Nexus IQ Server.
I'm curious if we should add these kinda odd license types to your list, or if a PR of some sort where if it can't find an ID it adds something that indicates the license is
Non-Conforming
or something akin?Thanks!