DependencyTrack / dependency-track

Dependency-Track is an intelligent Component Analysis platform that allows organizations to identify and reduce risk in the software supply chain.
https://dependencytrack.org/
Apache License 2.0

Add support for SPDX v3 #1746

Open stevespringett opened 2 years ago

stevespringett commented 2 years ago

Current Behavior:

Proposed Behavior:

goneall commented 2 years ago

Thanks @stevespringett for adding the issue.

We are just about done with our first release candidate for the 2.3 spec.

It has a new field primaryPackagePurpose which is intended to cleanly map to the component types of CycloneDX. The documentation for this can be found here.

For the identifier, there is a unique ID for the package metadata (which translates to a CDX component): the concatenation of the Document Namespace and the SPDX ID. We decided to use Package URLs for the ID of the artifact the SPDX package refers to. This has been added as an External Ref type in 2.3. It is currently optional for compatibility reasons, but we could generate a warning or make it required when ingesting into Dependency-Track.
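As a rough illustration of the concatenation described above, here is a minimal Python sketch. The function name is hypothetical, and the `#` separator is an assumption based on how SPDX 2.x RDF output forms element URIs; it is not something stated in this thread.

```python
def spdx_element_uri(document_namespace: str, spdx_id: str) -> str:
    """Join a document namespace and an SPDX ID into one identifier.

    Assumption: '#' separates the two parts, as in SPDX 2.x RDF output.
    """
    if not spdx_id.startswith("SPDXRef-"):
        raise ValueError(f"not an SPDX element ID: {spdx_id!r}")
    # Drop a trailing '#' so the separator is never doubled.
    return f"{document_namespace.rstrip('#')}#{spdx_id}"
```

Note that this identifies the package metadata within one document. It says nothing about whether two documents describe the same artifact, which is what the Package URL external reference is meant to cover.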

Please let me know if the above is not sufficient or if there is anything else needed. There's still time to influence the SPDX 2.3 release to help with any requirements for dependency track integration.

stevespringett commented 2 years ago

Thanks @goneall.

Unless I'm overlooking something, SPDX 2.3 still cannot state what a component is, rather, people have to use external reference identifiers which are "believed to be relevant to the Package" according to https://github.com/spdx/spdx-spec/blob/development/v2.3.1/chapters/how-to-use.md. There is no guidance on what to do when multiple identifiers of the same types are specified, or when conflicting identifiers or impossibilities are encountered.

The unique ID that was added supports only PURL, which will also be a problem: if a component only has a CPE, there's no way to represent it.

A future version of Dependency-Track will support evidence, including evidence of identity, along with where and how the evidence was obtained and the confidence of the evidence. This aligns to how many SCA products work, the OWASP SCVS BOM Maturity Model, and evidence support in CycloneDX v1.5. As of today, SPDX 2.3 falls somewhere in the middle. It cannot describe what something is, yet it also doesn't provide supporting evidence and confidence for its "beliefs". This makes supporting SPDX in Dependency-Track really challenging and was one of the technical reasons why SPDX support was previously removed.

For SPDX 2.3, I'm seeing ambiguity and overlapping definitions for the primaryPackagePurpose, some of which will make converting between SPDX and CDX more difficult. It will also make importing into Dependency-Track more difficult. I know the original intent was to aid in compatibility with CDX, however in practice, I don't think that's achieved. I'd be happy to discuss.

I also think there's some work that needs to be done with external reference categories, as they don't make a lot of sense IMO. For example, SWID is defined in the SECURITY category, yet I'm not aware of any use case that supports this today. SWID is supported by most CMDB discovery modules, such as ServiceNow, but these use cases have nothing to do with security. On the other hand, Package URL is defined in the "Package-Manager" category, and purl is used 20B times every month by Dependency-Track for security use cases. Does the category restrict what the identifier can be used for? What happens when the category is SECURITY and a purl is defined (which I've seen many times)? There's too much ambiguity here.
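To make the ambiguity concrete, here is a minimal Python sketch of what an importer might have to do with a package taken from an SPDX 2.3 JSON document. The `externalRefs`, `referenceCategory`, `referenceType` and `referenceLocator` keys come from the SPDX 2.3 JSON format. The function name and the policy (accept a purl under any category, refuse to guess when several distinct purls are present) are assumptions for illustration, not Dependency-Track behavior.

```python
from typing import List, Optional, Tuple


def pick_purl(package: dict) -> Tuple[Optional[str], List[str]]:
    """Return (purl, warnings) for one SPDX 2.3 JSON package object.

    Hypothetical policy: the category is ignored when looking for a purl,
    and conflicting purls yield no identity at all.
    """
    purls: List[str] = []
    warnings: List[str] = []
    for ref in package.get("externalRefs", []):
        if str(ref.get("referenceType", "")).lower() != "purl":
            continue
        locator = ref.get("referenceLocator")
        if not locator:
            continue
        # Accept both spellings of the category (hyphen or underscore).
        category = str(ref.get("referenceCategory", "")).upper().replace("_", "-")
        if category != "PACKAGE-MANAGER":
            warnings.append(
                f"purl {locator} filed under category {category or '(none)'}"
            )
        if locator not in purls:
            purls.append(locator)
    if len(purls) > 1:
        # The spec gives no tie-breaking rule, so do not pick one silently.
        warnings.append("conflicting purls: " + ", ".join(purls))
        return None, warnings
    return (purls[0] if purls else None), warnings
```

Every branch here is a decision the importer has to make on its own, which is the point: a different tool could reasonably choose differently and end up with a different component identity for the same document.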

Props to @goneall on the new SPDX Java Library. It is 1000x better than the original. The design and code quality are outstanding. Seriously. Great job! One thing that will prevent adoption by Dependency-Track is its support for XML without having a released XML schema. There currently is none - see https://github.com/spdx/spdx-spec/tree/development/v2.3.1/schemas. Dependency-Track needs to be able to independently validate BOMs prior to processing.

IMO, there are still too many outstanding issues with SPDX 2.3. I know SPDX v3 solves some of these issues, and hopefully this feedback can be used to correct some of the other issues which may not have been previously considered, but which are important for Dependency-Track adoption.

goneall commented 2 years ago

@stevespringett

> Unless I'm overlooking something, SPDX 2.3 still cannot state what a component is, rather, people have to use external reference identifiers which are "believed to be relevant to the Package" according to https://github.com/spdx/spdx-spec/blob/development/v2.3.1/chapters/how-to-use.md. There is no guidance on what to do when multiple identifiers of the same types are specified, or when conflicting identifiers or impossibilities are encountered.

As mentioned in the comment above, there is a unique ID for the package metadata (which translates to a CDX component): the concatenation of the Document Namespace and the SPDX ID. This is an absolutely unique identifier for the SPDX package (component in CDX terminology) if the SPDX spec is followed correctly. The intent of the external identifier is to allow correlation to other identification schemes, such as CPE, however perfect or imperfect they may be. Since the SPDX identifiers are unique, you can use those when translating. Let me know if I'm missing something, but I think the unique SPDX identifier satisfies the translation requirements.

> For SPDX 2.3, I'm seeing ambiguity and overlapping definitions for the primaryPackagePurpose, some of which will make converting between SPDX and CDX more difficult.

The primary package purpose was designed to map directly to the type property of the CDX component, which should make this easier. I would be happy to discuss. There is a bit of a nuance in that CDX calls a File a component type when it is really more of a property of what we call a package in SPDX. In the future, we should probably discuss these aspects during the SPDX spec update - I'll try to ping you for review in future updates so we're in sync before the release.
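For reference, a minimal Python sketch of the mapping under discussion. The purpose values are the ones enumerated for primaryPackagePurpose in SPDX 2.3, and the target values are the CycloneDX 1.4 component types. The function name and the `library` fallback are assumptions made for illustration. SOURCE, ARCHIVE, INSTALL and OTHER have no same-named CycloneDX 1.4 type, so the sketch reports them as inexact.

```python
from typing import Optional, Tuple

# SPDX 2.3 purposes that share a name with a CycloneDX 1.4 component type.
DIRECT = {
    "APPLICATION": "application",
    "FRAMEWORK": "framework",
    "LIBRARY": "library",
    "CONTAINER": "container",
    "OPERATING_SYSTEM": "operating-system",
    "DEVICE": "device",
    "FIRMWARE": "firmware",
    "FILE": "file",
}


def to_cdx_type(purpose: Optional[str], fallback: str = "library") -> Tuple[str, bool]:
    """Return (cdx_type, exact) for an SPDX primaryPackagePurpose value.

    'exact' is False when the purpose is missing or has no same-named
    CycloneDX type; the fallback value is an assumption, not a rule.
    """
    if purpose is None:
        return fallback, False
    # Tag-value files use OPERATING-SYSTEM, JSON uses OPERATING_SYSTEM.
    key = purpose.strip().upper().replace("-", "_")
    if key in DIRECT:
        return DIRECT[key], True
    return fallback, False
```

Returning the `exact` flag alongside the type lets a converter record where fidelity was lost instead of silently relabeling a SOURCE or ARCHIVE package.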

> I also think there's some work that needs to be done with external reference categories, as they don't make a lot of sense IMO.

I would suggest posting this to the SPDX spec - all input is welcome and we always appreciate improvements to the spec. In my opinion, the overlap between different external references reflects ambiguities that exist in the real world, and representing those in a data spec is accurate rather than an issue with the spec.

> Props to @goneall on the new SPDX Java Library.

Thanks @stevespringett! Appreciate the comments.

> One thing that will prevent adoption by Dependency-Track is its support for XML without having a released XML schema

Completely agree - there has been some progress on this front, but I'm not much of an XML schema expert, so I could use some help. There is an issue in the SPDX spec where we are making some progress. I'll make a point of doing some additional work on the XML schema over the next couple of months.

> IMO, there are still too many outstanding issues with SPDX 2.3.

Since the 2.3 spec changes were intentionally designed to ease the conversions between CDX and SPDX, I'm hoping I can change your mind on this. @coderpatros and I have written utilities to convert between the formats. Although there is some loss in fidelity, I believe the conversion is accurate enough that we can use the 2.3 spec.

mrutkows commented 9 months ago

The SPDX Tech. team has decided their SPDX 3.0 spec. will be published canonically as a JSON-LD document, currently using a "Terse RDF Triple Language" (TTL) file format which can be found here:

There are no plans to produce a JSON schema or XML schema version (as of this writing), per a direct query to the tech. team. That exercise, I was told, is left to RDF modelers such as the one a team member uses (i.e., Apache Jena). There do not appear to be any existing libraries (open or not) that provide such translation, along with resolution of complex namespaces and management of types (let alone validation).

goneall commented 9 months ago

There are some ongoing discussions in the SPDX tech community about possibly supporting a simpler JSON format of the spec complete with a JSON schema.

If there is interest in having a simpler JSON schema, I would suggest voicing your support in the serialization team meetings or in the mailing list once the RC2 version of the spec is released (hopefully soon).

If we do support the simpler JSON format, I plan to translate the canonical SHACL/OWL schema file into a JSON schema file using the SPDX Java tools.

Since JSON doesn't support the same level of semantic validation provided by SHACL/OWL we would still recommend using the canonical SHACL/OWL schema for full validation.

petit-perou commented 1 month ago

Hello,

I hope this message finds you well.

I am currently working with patent attorneys to provide and enhance legal security for software production and distribution. To achieve this, we are thinking of widely adopting (L)SBOMs (the L stands for legal). I am finding SPDX v3 easier to extend and more versatile than CycloneDX (which is actually easier to use from a dev standpoint, imho) and am thinking of using it in our processes.

I will try to attend the community meeting for the latest release later today (or will read the notes afterwards) to try and understand the roadmap of this project.

In any case, I am wondering how SPDX v3 currently fits in with the wider CycloneDX ecosystem (in which Dependency-Track is a major actor, from what I can gather).

In terms of contribution, I would be happy to produce code to translate between formats and technical tooling to bridge the gap.

Best regards,

Antoine Zimmermann

goneall commented 1 month ago

As an update - the SPDX Java library has been updated to the SPDX specification 3.0.1 and can be used to help implement SPDX support in Dependency-Track. The latest version of the library has had some major changes and has only recently been made available, so there may be some issues.

If the SPDX Java Library is used, we can easily support SPDX 2.X and SPDX 3.X versions of the spec.

From taking a very cursory look at the Dependency-Track code, it looks like we could create an SPDX ModelConverter similar to the CDX ModelConverter.java.

I personally won't have time to contribute the code for a few weeks, but I'd be more than happy to help support the effort.

petit-perou commented 1 month ago

Thanks for your reply, this is very encouraging. I will try and schedule some time to take a closer look at it this weekend or early next week.

Thanks for the pointers, this is very much appreciated.

If there are other relevant GitHub issues you think I might want to take a look at, feel free to point me towards them.

Have a great day!

Antoine Zimmermann