spdx / tools-python

A Python library to parse, validate and create SPDX documents.
http://spdx.org
Apache License 2.0

url validation fails for gitsm #819

Closed billie-alsup closed 3 weeks ago

billie-alsup commented 1 month ago

The supported_download_repos pattern in validation/uri_validators.py is missing gitsm:

supported_download_repos: str = "(git|hg|svn|bzr)"

Our OpenEmbedded build produces three SPDX files using gitsm:

recipes/recipe-bcc.spdx.json:      "downloadLocation": "gitsm+https://github.com/iovisor/bcc@942227484d3207f6a42103674001ef01fb5335a0",
recipes/recipe-ovmf.spdx.json:      "downloadLocation": "gitsm+https://github.com/tianocore/edk2.git@06dc822d045c2bb42e497487935485302486e151",
recipes/recipe-ovmf-native.spdx.json:      "downloadLocation": "gitsm+https://github.com/tianocore/edk2.git@06dc822d045c2bb42e497487935485302486e151",

gitsm is the bitbake submodule fetcher.
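
The rejection can be reproduced with a simplified stand-in for the validator. This is only a sketch: the `supported_download_repos` value is the one quoted above, but `build_pattern`, `is_accepted`, and the URL part of the regex are hypothetical and much simpler than the real pattern in validation/uri_validators.py.

```python
import re

# Value quoted in the report above.
SUPPORTED_DOWNLOAD_REPOS = "(git|hg|svn|bzr)"


def build_pattern(repos: str) -> re.Pattern:
    # Assumed shape: an optional "<vcs>+" prefix followed by a plain URL.
    # The library's real URL pattern is stricter than this.
    return re.compile(r"^(" + repos + r"\+)?(https?|ssh|git)://\S+$")


def is_accepted(location: str, repos: str = SUPPORTED_DOWNLOAD_REPOS) -> bool:
    # True when the whole download location matches the pattern.
    return build_pattern(repos).match(location) is not None


location = "gitsm+https://github.com/iovisor/bcc@942227484d3207f6a42103674001ef01fb5335a0"

print(is_accepted(location))                               # gitsm+ prefix
print(is_accepted(location.replace("gitsm+", "git+", 1)))  # git+ prefix
print(is_accepted(location, "(git|gitsm|hg|svn|bzr)"))     # hypothetical extended list
```

With this simplified pattern, the gitsm+ location is rejected, while the same URL with a git+ prefix, or the gitsm+ URL checked against a list extended with gitsm, is accepted.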

billie-alsup commented 3 weeks ago

I have looked at the SPDX 2.3.0 specification.

In section 7.7.1 (Package download location field description), it simply mentions a URL, NONE, or NOASSERTION.

However, section 7.7.3 (Examples) explicitly lists the supported git schemes, and gitsm is not among them. So it seems I need to handle this on the application side, either in the SPDX generator or by pushing the problem back to the Yocto environment. A given git SHA1 is certainly sufficient to identify the submodules' SHA1s as well, but I think it would be better to list each submodule (recursively) as an independent package, with its own supplier, originator, license, and so on. The relationships between those packages can then be recorded in the relationships section.
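
One way the application-side handling could look is a post-processing step that rewrites the gitsm+ prefix to git+ before validation. This is a minimal sketch, not something the spec or tools-python prescribes: `normalize_download_location` and `normalize_document` are hypothetical helpers, and it assumes the generated files use the SPDX JSON layout with a top-level `packages` array. It relies on the point above that the superproject SHA1 already pins the submodule revisions; it does not split submodules into separate packages.

```python
GITSM_PREFIX = "gitsm+"


def normalize_download_location(location: str) -> str:
    # Rewrite bitbake's gitsm fetcher prefix to the plain git prefix.
    # NONE, NOASSERTION, and other schemes pass through unchanged.
    if location.startswith(GITSM_PREFIX):
        return "git+" + location[len(GITSM_PREFIX):]
    return location


def normalize_document(doc: dict) -> dict:
    # Apply the rewrite in place to every package of a parsed SPDX JSON document.
    for package in doc.get("packages", []):
        if "downloadLocation" in package:
            package["downloadLocation"] = normalize_download_location(
                package["downloadLocation"]
            )
    return doc
```

A document loaded with `json.load` could be passed through `normalize_document` and written back out before it is handed to the validator.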