CycloneDX / cyclonedx-python

CycloneDX Software Bill of Materials (SBOM) generator for Python projects and environments
https://cyclonedx.org
Apache License 2.0

Help with LicenseExpressionAlongWithOthersException #826

Open villaflaminio opened 2 hours ago

villaflaminio commented 2 hours ago

Hi, I am experiencing an ambiguous case related to this issue.

The LicenseExpressionAlongWithOthersException is thrown if we have a LicenseExpression from a Python package and use the --gather-license-texts option at the same time.

Steps to reproduce:

python3 -m venv venv_dependencies
python3 -m venv venv_cyclonedx

source venv_dependencies/bin/activate
pip install cryptography==43.0.1
deactivate

source venv_cyclonedx/bin/activate
pip install cyclonedx-bom==5.1.0
cyclonedx-py environment --PEP-639 --gather-license-texts -o cyclonedx-bom.json --sv 1.6 --of JSON venv_dependencies

Error:

CRITICAL | CDX > Found LicenseExpression along with others licenses in: <Component bom-ref=<BomRef 'cryptography==43.0.1' id=140025822015568>, group=None, name=cryptography, version=43.0.1, type=ComponentType.LIBRARY>

In the case of not including the --gather-license-texts option, it works perfectly!

Taking a closer look at what happens just before the exception is raised there, this is the element content:

[Image]

How do you recommend handling this situation? Thank you for your support!

jkowalleck commented 2 hours ago

Thanks for reporting this.

This appears to be unintended behaviour.

@villaflaminio, could you help me craft a reproducible environment for this? Could you publish a GitHub repository that could be used as a test subject?

jkowalleck commented 2 hours ago

A practical solution, I'd imagine: if the declared licence is an expression, then all declared licence files are used as licence evidence, instead of an expression.

This needs to be handled by the licence gathering in this very application.
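The idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: `ComponentLicenses` and `place_licenses` are hypothetical names, and plain dicts stand in for the real model classes.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentLicenses:
    """Hypothetical container: where each licence entry would end up."""
    declared: list = field(default_factory=list)  # would become component.licenses
    evidence: list = field(default_factory=list)  # would become component.evidence.licenses


def place_licenses(expression, gathered_texts):
    """Decide where gathered licence texts go.

    expression:     declared licence expression, or None if there is none
    gathered_texts: mapping of licence file name -> file content
    """
    named = [
        {"license": {"name": f"declared license file: {name}",
                     "text": {"content": content}}}
        for name, content in gathered_texts.items()
    ]
    if expression is not None:
        # An expression must stand alone, so the texts move to evidence.
        return ComponentLicenses(declared=[{"expression": expression}],
                                 evidence=named)
    # No expression: named licences may be declared side by side.
    return ComponentLicenses(declared=named)


result = place_licenses("Apache-2.0 OR BSD-3-Clause",
                        {"LICENSE.APACHE": "...", "LICENSE.BSD": "..."})
```

With an expression present, `result.declared` holds only the expression and the two texts land in `result.evidence`.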

jkowalleck commented 2 hours ago

@villaflaminio is this something you want to contribute a fix for? If so, please follow our contribution guidelines: https://github.com/CycloneDX/cyclonedx-python/blob/main/CONTRIBUTING.md Feel free to "draft" a pull request early, in case you need any help.

villaflaminio commented 2 hours ago

I don't know. For example, in this case I only have one license expression, but I have several text files to parse, and I could not insert multiple text fields for the various licenses in question.

The toml on which I encountered the problem is this one. The METADATA is something like:

Name: xyz
Version: 1.0.0
License-File: LICENSE
License-File: LICENSE.MIT
License-File: LICENSE.APACHE
License: Apache-2.0 OR BSD-3-Clause
Classifier: License :: OSI Approved :: Apache Software License
Classifier: License :: OSI Approved :: MIT License
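For illustration, a METADATA file like the one above can be read with Python's standard `email.parser`, since core metadata uses RFC 822 style headers. This is a minimal sketch over the sample values shown, not the way cyclonedx-py itself reads them.

```python
from email.parser import HeaderParser

# Sample METADATA content, copied from the example in this comment.
METADATA = """\
Name: xyz
Version: 1.0.0
License-File: LICENSE
License-File: LICENSE.MIT
License-File: LICENSE.APACHE
License: Apache-2.0 OR BSD-3-Clause
Classifier: License :: OSI Approved :: Apache Software License
Classifier: License :: OSI Approved :: MIT License
"""

msg = HeaderParser().parsestr(METADATA)

# A single declared licence string (here: an SPDX-style expression).
declared_license = msg.get("License")

# Repeated headers are collected with get_all().
license_files = msg.get_all("License-File", [])
license_classifiers = [
    c for c in msg.get_all("Classifier", []) if c.startswith("License ::")
]
```

This makes the conflict visible: one expression in `declared_license`, but three entries in `license_files` whose texts would each become a separate named licence.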

What kind of repository do you need? What do I need to include?

jkowalleck commented 2 hours ago

The thing is: the spec does not yet allow a mix of expression and named/id licenses. See https://github.com/CycloneDX/specification/issues/454
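To make that constraint concrete, here is a small sketch of the rule as I understand it for the JSON `licenses` array: either any number of `license` objects, or exactly one `expression` entry. The function name `is_valid_license_choice` is made up for this example; the authoritative rule is the CycloneDX schema itself.

```python
def is_valid_license_choice(licenses):
    """Return True if the list is all named/id licences or one lone expression."""
    expressions = [entry for entry in licenses if "expression" in entry]
    named = [entry for entry in licenses if "license" in entry]
    if expressions:
        # An expression may not be combined with anything else.
        return len(expressions) == 1 and not named
    return True


only_expression = [{"expression": "Apache-2.0 OR BSD-3-Clause"}]
only_named = [{"license": {"id": "Apache-2.0"}},
              {"license": {"id": "BSD-3-Clause"}}]
mixed = only_expression + [{"license": {"name": "LICENSE", "text": {"content": "..."}}}]

checks = [is_valid_license_choice(x) for x in (only_expression, only_named, mixed)]
```

The `mixed` case is the situation that triggers the exception reported in this issue.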

jkowalleck commented 2 hours ago

> What kind of repository do you need? What do I need to include?

@villaflaminio, no additional information needed.


For reproducing: run the analysis on cryptography==43.0.1. This is to be fixed in all implemented collectors: pyproject.toml and environment analysis.

villaflaminio commented 1 hour ago

Honestly, I can't think of "a clean way" to include this information without breaking the specification (especially in light of #454). I note that the specification does not even offer the ‘Properties’ field for a ‘Licence Expression’; otherwise we could have added the text of the license files there, to ‘support’ the expression. Do I like it? Absolutely not!

But first of all, it seems necessary to me to keep this from interrupting the entire extraction process. I would prefer that, when there is a license expression, the extraction of the license file contents is simply skipped.
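The fallback described here could look roughly like this. It is a sketch under assumptions: `licenses_without_conflict` and the `read_text` callback are hypothetical names, and plain dicts stand in for the real licence objects.

```python
import logging

logger = logging.getLogger("sketch")


def licenses_without_conflict(expression, license_files, read_text):
    """Build the licence list; skip text gathering when an expression exists.

    expression:    declared licence expression, or None
    license_files: names of declared licence files
    read_text:     callable returning the content of a licence file
    """
    if expression is not None:
        if license_files:
            # Warn instead of raising, so the SBOM run is not aborted.
            logger.warning("skipping %d licence file(s): expression present",
                           len(license_files))
        return [{"expression": expression}]
    return [
        {"license": {"name": name, "text": {"content": read_text(name)}}}
        for name in license_files
    ]


skipped = licenses_without_conflict(
    "Apache-2.0 OR BSD-3-Clause",
    ["LICENSE", "LICENSE.APACHE"],
    read_text=lambda name: f"<content of {name}>",
)
```

The trade-off is that the licence texts are lost for such components, but the rest of the extraction completes.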

jkowalleck commented 1 hour ago

> Honestly, I can't think of "a clean way" to include this information without breaking the specification (especially according with the https://github.com/CycloneDX/specification/issues/454).

here is a possible solution: https://github.com/CycloneDX/cyclonedx-python/issues/826#issuecomment-2465195711

PS: I will drop you a "fixed" solution, just give me some minutes time to showcase this :D