Open manifestori opened 1 year ago
Annotators in v2.3 must be used to synthesize person/organization actors or tool elements in v3. This requires some guesswork since the annotator strings sometimes have email addresses in parentheses and sometimes not - you could assume there is only one Jane Doe in the universe, or treat each one with distinct emails (jd@gmail.com, Jane@outlook.com) as different Person elements.
@manifestori Thanks for opening this issue.
The annotation model lacks "name" and "description" fields, but has "Annotator" with "Person: Jane Doe" or similar format. Is it safe to say that "Annotator" could be converted to "name"?
The annotator should be converted to the creator of the Annotation
in the creationInfo
field.
I just updated the SPDX migration analysis document with a more complete recommendation for Annotation.
Feel free to add comments and suggestions to the migration guide.
Think about a tool converting between formats, if CDX has "Properties" (similar to element-level annotation) that have key=value, should one use statement to have key=value as value? or should one use name for key, and statement as value.
We have an opportunity with SPDX 3.0 to make this translation cleaner.
Should we add an optional properties map to the Annotation class? This would make the translation very clean. We could even add an Annotation type to indicate it was translated from CDX.
If there is interest in this approach, I can create a pull request later this week for consideration.
Another option is to use the extension property of Element for this purpose. However, it lacks structure and the fact it is a property of the Element means you can not add an extension after the Element is initially created.
@davaya @goneall thank you for the clarifications, I can now be sure of how you intend to provide support for annotation.
So if a SPDX2.3 Annotation looks like this:
{
"annotationDate" : "2010-01-29T18:30:22Z",
"annotationType" : "OTHER",
"annotator" : "Person: Jane Doe ()",
"comment" : "My annotation"
}
It would translate into
{
"type": "Annotation",
"spdxId": "spdx-example:SPDXRef-Annotation",
"annotationType": "other",
"subject": "spdx-example:SPDXRef-MyElement",
"statement": "My annotation,
"creationInfo": {
...,
"created": "2010-01-29T18:30:22Z",
"createdBy": [
"spdx-example:SPDXRef-Actor-JaneDoe"
],
"createdUsing": [...]
"profile": [...],
"dataLicense": "https://spdx.org/licenses/CC0-1.0",
"comment": "..."
},
}
Adding a properties map to annotation, that's makes a lot of sense as many tools want to have a simple structured way of adding metadata to components/packages/libs/files. Abusing statement
is not a great solution so a map would be most welcome and straightforward.
Regarding a "translated from cdx" annotation type, I don't think it serves a real purpose as a conversion tool would probably mention the fact it was converted from CDX by Tool X. or use the statement for it?
Maybe something more general such as metadata
instead of other
would make more sense?
"components": [
{
"bom-ref": "5a6ee31f-af7c-43bf-8ecc-09bc7fa997ad",
"type": "application",
"name": "examples/tests/requirements.txt",
"properties": [
{
"name": "aquasecurity:trivy:Class",
"value": "lang-pkgs"
},
{
"name": "aquasecurity:trivy:Type",
"value": "pip"
}
]
},
Looking at this snippet from CDX 1.5 components list, to translate it to CDX3.0 you would create an Element and an Annotation element associated with this element. The annotation
{
"type": "Annotation",
"spdxId": "spdx-example:SPDXRef-Annotation",
"annotationType": "other",
"subject": "spdx-example:SPDXRef-MyElement",
"statement": "", # what do we do with the value of statement?
"properties": {
"aquasecurity:trivy:Class": "lang-pkgs",
"aquasecurity:trivy:Type": "pip"
}
...
}
ofc you can make properties match the [{ name: string, value: string }] schema but this would consume more space.
Thanks again for your answers and open discussion <3
@goneall I would also be thrilled to contribute to SPDX3.0 myself :) If you can, please point me toward the way to make an RFC for the annotation spec that would be great.
if submitting a PR updating the models and docs is sufficient, I would be happy to do this ASAP :)
@manifestori - We'd certainly welcome your contributions :) The only thing required is to submit a PR to the spec. The CONTRIBUTING.md file gives a few more details.
We also have a mailing list and weekly meetings to discuss issues which are not easily resolvable through pull requests.
BTW - I'll be out hiking the next 4 days, so I may be a bit slow responding to issues, but there are several others in the community who can review / provide feedback.
@manifestori - back from hiking. I like the idea of the properties. The example you shared above looks good to me.
One question: What type do we want to use for the property name and value? If both are simple strings, we can re-use the existing DictionaryEntry class. If we want to restrict the property to be a URI type, then we would need to create a new class.
My preference would be to use strings to make the map more flexible.
Let me know if you'd like me to create a PR - I would like to get this into the 3.0 release.
Simple strings would do. DictionaryEntry sounds like the way to go <3
I will try to submit a pull request in the next few days. If I am unable to do so, I will get in touch with you. Thanks!
Since this is a non-breaking change and we don't have a PR in time for 3.0, I'm moving this to the 3.1 milestone
As for v2.3 schema: https://github.com/spdx/spdx-spec/blob/development/v2.3.1/schemas/spdx-schema.json
The annotation model lacks "name" and "description" fields, but has "Annotator" with "Person: Jane Doe" or similar format. Is it safe to say that "Annotator" could be converted to "name"?
if so, why in https://github.com/spdx/spdx-3-model/blob/main/serialization/json_ld/examples/converted_from_spdx_2.json, annotations that were converted from spdx2 don't have a name?
Moreover, I'm trying to understand the purpose of name/description for
Annotation
, why is it a "complete object? can those fields be optional?Think about a tool converting between formats, if CDX has "Properties" (similar to element-level annotation) that have key=value, should one use
statement
to havekey=value
as value? or should one usename
for key, andstatement
as value.I figure that there isn't "Property"-like structure in SPDX so Annotation is the closest and most appropriate candidate.
Going back to CDX from SPDX annotation would be affected by the same decision.
People have already tried approaching this in the past, but I rather have your opinion. https://github.com/spdx/cdx2spdx/blob/08ec34f11b15dd747410d13c1fc4d11645a20a4b/src/main/java/org/spdx/cdx2spdx/CycloneSpdxConverter.java#L565C60-L565C60