Open billie-alsup opened 1 month ago
This is just one example. Validating SPDX JSON files generated from the OpenEmbedded project yields numerous errors where aliases were used for licensing instead of the "official" node license text. In addition to GPL-2.0, the same happens with GPL-3.0 and LGPL-2.1+. Again, this results in incorrect validation errors.
>>> from license_expression import get_spdx_licensing
>>> get_spdx_licensing().parse('LGPL-2.1+')
LicenseSymbol('LGPL-2.1-or-later', aliases=('LGPL-2.1+',), is_exception=False)
>>> get_spdx_licensing().parse('GPL-3.0')
LicenseSymbol('GPL-3.0-only', aliases=('GPL-3.0', 'LicenseRef-gpl-3.0'), is_exception=False)
>>>
>>> from license_expression import Licensing
>>> Licensing().parse('LGPL-2.1+')
LicenseSymbol('LGPL-2.1+', is_exception=False)
>>> Licensing().parse('GPL-3.0')
LicenseSymbol('GPL-3.0', is_exception=False)
>>>
2024-06-20 11:33:51,315:WARNING:root: Unrecognized license reference: LGPL-2.1+. license_expression must only use IDs from the license list or extracted licensing info, but is: LGPL-2.1+
I'm wondering if this is expected behavior (and you do not wish to allow aliases), or if this is a bug.
I'm curious too. My own naive thought is that it would be nice if the validation processing accepted aliases :) That seems reasonable and consistent with the purpose of using aliases. But perhaps that creates bugs or corner cases that I am not aware of?
src/spdx_tools/spdx/parser/jsonlikedict/license_expression_parser.py uses Licensing().parse(expr) directly, rather than get_spdx_licensing().parse(expr) as used in parser/tagvalue/parser.py. The difference results in a different LicenseSymbol for GPL-2.0, e.g.
As you can see, GPL-2.0-only is the official name, and GPL-2.0 is an alias. However, when parsing directly with Licensing(), we get a GPL-2.0 node rather than a GPL-2.0-only node. This causes a problem later in validation, when GPL-2.0 comes back as an invalid symbol, e.g.
I'm wondering if this is expected behavior (and you do not wish to allow aliases), or if this is a bug. Should I filter my JSON file in advance to switch to GPL-2.0-only? Certainly GPL-2.0 should not be listed in the extracted_licensing_info section (as that would require changing it to LicenseRef-GPL-2.0 or similar), right?
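For anyone who needs a workaround in the meantime, here is a rough sketch of the "filter the JSON in advance" idea. It is only an illustration, not how spdx-tools handles this: the alias table contains just the three aliases mentioned in this issue, the helper names (`normalize_expression`, `normalize_document`) are made up for the example, and it assumes the expressions live in the `licenseConcluded` and `licenseDeclared` fields. Other license fields would need to be added to `LICENSE_FIELDS`.

```python
import re

# Alias -> canonical SPDX ID. Only the aliases mentioned in this issue;
# this is NOT the full alias list known to get_spdx_licensing().
ALIASES = {
    "GPL-2.0": "GPL-2.0-only",
    "GPL-3.0": "GPL-3.0-only",
    "LGPL-2.1+": "LGPL-2.1-or-later",
}

# One license token: letters, digits, '.', '+', ':' and '-'.
# Parentheses and whitespace are not matched, so they are left untouched.
TOKEN = re.compile(r"[A-Za-z0-9.+:-]+")

# Assumed set of JSON keys that hold license expressions (hypothetical choice).
LICENSE_FIELDS = {"licenseConcluded", "licenseDeclared"}


def normalize_expression(expr):
    """Replace whole tokens that are known aliases; leave everything else as-is."""
    return TOKEN.sub(lambda m: ALIASES.get(m.group(0), m.group(0)), expr)


def normalize_document(node):
    """Return a copy of a parsed JSON document with license fields normalized."""
    if isinstance(node, dict):
        return {
            key: normalize_expression(value)
            if key in LICENSE_FIELDS and isinstance(value, str)
            else normalize_document(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [normalize_document(item) for item in node]
    return node


doc = {"packages": [{"name": "demo", "licenseDeclared": "GPL-2.0 AND (LGPL-2.1+ OR MIT)"}]}
print(normalize_document(doc)["packages"][0]["licenseDeclared"])
```

Because the replacement works on whole tokens, an ID such as `GPL-2.0-only` or `LicenseRef-gpl-3.0` is left alone rather than being partially rewritten. Operators like AND/OR and values like NOASSERTION pass through unchanged because they are not in the alias table.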