spdx / ntia-conformance-checker

Check SPDX SBOM for NTIA minimum elements
Apache License 2.0
52 stars 19 forks source link

licenseId is not a case-sensitive as per spdx ,but this tool follows case-sensitve? #165

Closed Bollineni-nitish-kumar closed 4 months ago

Bollineni-nitish-kumar commented 10 months ago

Firstly,thanks for the tool,it greatly helps.

As per spdx documentation licenseId is not a case-sensitive. link

Actually ,in the "hasExtractedLicensingInfos" element we will add all the extracted LicenseId's . I have two license id's i.e (LicenseRef-New-BSD-license) and the other is (LicenseRef-New-BSD-License).Here with case-insensitive both are same licenseId's only.so added only one id (i.e LicenseRef-New-BSD-License) to the "hasExtractedLicensingInfos".

Then ,when i check with ntia tool,it throws error saying "Unrecognized license reference: LicenseRef-New-BSD-license. license_expression must only use IDs from the license list or extracted licensing info, but is: LicenseRef-New-BSD-license".

I validated sbom with spdx tool with having only one id,it shows as a valid document.I am adding screeshots below.

Screenshot from 2023-11-30 18-48-29

Screenshot from 2023-11-30 18-46-41

jspeed-meyers commented 10 months ago

Thank you for the bug report, @Bollineni-nitish-kumar! We'll investigate.

In the meantime, PRs welcome, from you or anyone.

cc @goneall

jspeed-meyers commented 10 months ago

@Bollineni-nitish-kumar: I realize this may not be practical, but are you willing to share the SBOM you were analyzing to help me reproduce and potentially fix this issue?

jspeed-meyers commented 10 months ago

Changing the case of the license does seem to matter. See PR #166. I don't think this has to do with the code in ntia-conformance-checker though. I believe the spdx-tools library underpinning ntia-conformance-checker is responsible for this logic.

@zvr: Any thoughts?

Also, I noticed this statement in the SPDX link related to licenses:

However, please be aware that it is often important to match with the case of the canonical identifier on the SPDX License List. This is because the canonical identifier's case is used in the URL of the license's or exception's entry on the List, and because the canonical identifier is translated to a URI in RDF documents.

So perhaps the case should matter here. I admit that I am confused.

Bollineni-nitish-kumar commented 10 months ago

@jspeed-meyers : Heyy,I may not give you the exact sbom as it belongs to organisation.I have created a sample to create this behaviour.

In the attached sbom sample,I have added two licenseId's i.e "LicenseRef-The-BSD-2-Clause-license" and "LicenseRef-The-BSD-2-Clause-License" in "hasExtractedLicensingInfos".In both licenseIds only difference is "l" is small case in first license and "L" is uppercase in second license in the last word "License".

With this sample ntia-checker doesn't throws any error.But with spdx tool throws exception saying license already exists.If you just remove one of the license from "hasExtractedLicensingInfos" ,then ntia-chcker tool will throws exception but spdx tool validates the document.

Sample spdx sbom : Samplesbomspdxntia.json

Screenshot from 2023-12-01 11-20-38 Screenshot from 2023-12-01 11-21-16

jspeed-meyers commented 10 months ago

Thank you, @Bollineni-nitish-kumar.

puneeth072003 commented 9 months ago

@goneall @jspeed-meyers I'd love to take up this issue but i might need some help.

jspeed-meyers commented 9 months ago

@puneeth072003: I'd be glad for some help. Thank you! 🙏

And asking questions is fine. I'll learn and debug along with you. Again, thank you!

puneeth072003 commented 9 months ago

@jspeed-meyers My thoughts, Changing this

license_ref_ids: List[str] = [license_ref.license_id for license_ref in document.extracted_licensing_info]

to

license_ref_ids: List[str] = [license_ref.license_id.lower() for license_ref in document.extracted_licensing_info]

in https://github.com/spdx/tools-python/blob/main/src/spdx_tools/spdx/validation/license_expression_validator.py (line no. 43) might fix this bug

testing this now

jspeed-meyers commented 9 months ago

@puneeth072003: Nice find. That looks promising.

@meretp: You and your fellow maintainers are the experts on spdx-tools. What do you think about @puneeth072003's thoughts above?

puneeth072003 commented 9 months ago

After making couple of changes in src/spdx_tools/spdx/validation/license_expression_validator.py, I was able to fix the errors and got the below output after validating the sample json file provided by @Bollineni-nitish-kumar

Is this SBOM NTIA minimum element conformant? False

Individual elements                            | Status
-------------------------------------------------------
All component names provided?                  | True
All component versions provided?               | True
All component identifiers provided?            | True
All component suppliers provided?              | False
SBOM author name provided?                     | True
SBOM creation timestamp provided?              | True
Dependency relationships provided?             | True

@jspeed-meyers What do you think about this output obtained?

jspeed-meyers commented 9 months ago

@puneeth072003, first, thank you for putting in a PR to tools-python. I'm impressed! I think we should wait for a response there. But assuming they merge that, we can then wait until tools-python cuts a new version and then ntia-conformance-checker can bump the version of tools-python it uses and then also cut a new release.

And this looks good to me. Again, nice work!

puneeth072003 commented 9 months ago

@jspeed-meyers You're welcome🤓

jspeed-meyers commented 4 months ago

I am going to close this issue since it entirely involves changes to upstream code.