Open pepper-jk opened 2 years ago
@pepper-jk Hey! Thanks for chiming in!
This is not a bug ... you have taken a truncated version of the GPL text from SPDX, which is known for modifying these texts.
You further truncated the GPL-2.0 by stripping this at the end "This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License."
This text is for the GPL 2.0 only alright.
And then you added some GPL 2.0 or later notice, this is GPL 2.0 or later correctly.
ScanCode detects the truncated GPL 2.0 text and your made-up GPL 2.0 or later notice alright.
In practice folks do not copy/paste license text from the SPDX web site as they are notoriously damaged by the markup that's added. And also in practice folks do not put a GPL 2.0 or later notice at the bottom of a GPL 2.0 text. If anything this may be seen done the other way around: notice, then (full) text.
@pepper-jk re:
I am currently working on copyleft license compatibility rules for ORT.
I am not sure what you are trying to achieve and it looks definitely interesting, (I would like to hear more, please elaborate) but trying to make fake license texts/notice is not going to help IMHO. You should also check out @hesa https://github.com/vinland-technology/flict that may be vaguely related.
You should also check out @hesa https://github.com/vinland-technology/flict that may be vaguely related.
Let's not boost his ego too much. It might cause negative effects on his surroundings.
IMHO the scoring is a bit misleading in this example. Seeing a 100.0 score for only parts of the actual license is confusing. I am aware that the test data might be flawed and the issue therefore a bit theoretical, but this result had me worried a bit.
Shouldn't it affect the score if only a part of the license is matched?
Sorry if you have already explained this a gazillion times.
... you have take a truncated version of the GPL text from SPDX that is known for modifying these texts. ... In practice folks do not copy/paste license text from the SPDX web site as they are notoriously damaged by the markup that's added. And also in practice folks do not put a GPL 2.0 or later notice at the bottom of a GPL 2.0 text. If anything this may be seen done the other way around: notice, then (full) text.
Thanks for letting me know. I was unaware that the website, providing the (open source) license identifiers used by a lot of license validation tools, was modifying the license texts. That seems counter productive, if you ask me. (I could confirm it though.)
If I was to add a license to a real project, I would have gone for the original source obviously. But since this was just a test, I hurried things along. I did not intend to treat your project unfairly.
However, I am not sure that this "is not a bug".
On Monday, I will take the license from gnu.org and test it again. As you know how people use the license in practice, maybe you could link to a public github project using a `GPL-2.0-or-later` license, I am happy to use the format and order of test they use to test scancode.
If I was to add a license to a real project, I would have gone for the original source obviously. But since this was just a test, I hurried things along. I did not intend to treat your project unfairly.
No worries, I never considered this report of yours to be a problem or unfair!
On Monday, I will take the license from gnu.org and test it again. As you know how people use the license in practice, maybe you could link to a public github project using a GPL-2.0-or-later license, I am happy to use the format and order of test they use to test scancode.
Yes, that's a good option, though there again, even the FSF -- that otherwise demanded a "verbatim" reproduction of the GPL text -- has published several different minor versions of the GPL-2.0 text over the years, eventually making the word verbatim somewhat less clear. See https://github.com/pombredanne/gpl-history/
As for using some sample project, any project would work out. https://github.com/pyinstaller/pyinstaller licensing is mildly interesting for its variety of licenses and use of exceptions.
After adjusting the `gpl-2.0-or-later` file as instructed, see commit above, I indeed get different score values:
- license: "GPL-2.0-only"
  location:
    path: "license-gpl-2.0-or-later.txt"
    start_line: 3
    end_line: 76
  score: 98.99
- license: "GPL-2.0-or-later"
  location:
    path: "license-gpl-2.0-or-later.txt"
    start_line: 3
    end_line: 3
  score: 100.0
So with ORT (old scancode) I get a `98.99` score for `gpl-2.0-only` and with the current scancode I get `99.0`, see json below.
I think this is still cutting it pretty close. The threshold to catch this problem would be `99.01`. With this threshold we would most likely miss a lot of more uncommon licenses.
The other solution from the user side would be to always check if `gpl-2.0-only` and `gpl-2.0-or-later` are in the same file and create a rule based on that, so that we drop `gpl-2.0-only` in that case.
I would propose such a rule be included in scancode, as this would be a pain to use either of the other ways.
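The user-side rule described above could be sketched roughly as follows. This is only an illustration, assuming the per-file `licenses` list with `key` entries as it appears in the ScanCode 31.2.1 JSON report in this issue; the function name `drop_only_when_or_later` and the `OR_LATER_OF` mapping are hypothetical and not part of ScanCode or ORT.

```python
# Hypothetical mapping: "only" license key -> its "or later" counterpart.
OR_LATER_OF = {"gpl-2.0": "gpl-2.0-plus"}


def drop_only_when_or_later(licenses):
    """Return the matches of one file, minus any "only" match whose
    "or later" counterpart was also detected in that same file."""
    keys = {match["key"] for match in licenses}
    return [
        match for match in licenses
        # Keys without a counterpart map to None, which is never in `keys`.
        if OR_LATER_OF.get(match["key"]) not in keys
    ]


matches = [
    {"key": "gpl-2.0", "score": 98.02},
    {"key": "gpl-2.0-plus", "score": 100.0},
]
print([m["key"] for m in drop_only_when_or_later(matches)])
```

Such a filter hides the `gpl-2.0` match regardless of its score, so it would also drop a genuine GPL-2.0-only text that merely sits in the same file as an unrelated or-later notice.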
@pepper-jk you may want to use this command for reviewing license scan details: scancode --license --license-text --license-text-diagnostics --yaml - <path to some file>
which is what I use to review matches in fine detail. The words not matched will be enclosed in square brackets in the matched text.
As for the report you just made, you need to provide the actual text you used as otherwise, there is nothing practical that can be done.
Note that adding new rule is welcomed... but please make sure these are for rules that exist in the wild and not for made up, synthetic texts.
As for the report you just made, you need to provide the actual text you used as otherwise, there is nothing practical that can be done.
I updated the test in the repository I provided, as can be seen above in this issue. But here is the commit: https://github.com/pepper-jk/compatibility_test/commit/efda93ad073628450e5e889160aa0cc970d13b13
I removed the gnu instructions after "END OF TERMS AND CONDITIONS" and moved the `or-later` to the top, as per your instructions. There should be no difference to the GPL now.
Note that adding new rule is welcomed... but please make sure these are for rules that exist in the wild and not for made up, synthetic texts.
I feel like we are having a misunderstanding. I took the instructions from you and the example you provided and created a plain `gpl-2.0-or-later` license. And there is no issue in detecting the license, but with the weight of the licenses in the results and the fact that both licenses are detected, even though it is just one.
Does scancode consider the `or-later` in `gpl-2.0-or-later` to be an `exception` to `gpl-2.0-only` (aka it needs to detect two licenses)?
If the license text was "made up" it would not score `99.0` or above. And if so, this should be considered a bug as well.
So from where I stand, this is not an issue of the formatting or wording of the license anymore, but an issue with the scoring or the license detection.
@pepper-jk you may want to use this command for reviewing license scan details: scancode --license --license-text --license-text-diagnostics --yaml -
which is what I use to review matches in fine detail. The words not matched will be enclosed in square brackets in the matched text.
I will try this tomorrow. But please start to consider, that this is indeed an issue.
The output of `scancode --license --license-text --license-text-diagnostics --yaml gpl-2.0-or-later_diagnostics.yml license-gpl-2.0-or-later.txt`:
I removed the gnu instructions after "END OF TERMS AND CONDITIONS" and moved the or-later to the top, as per your instructions. There should be no difference to the GPL now.
If you removed part of the text that's a difference alright, and the score that will be returned will be only partial unless there is a rule for this text variant. As a side note the GPL states that this is a violation of the GPL to truncate or remove parts of the GPL text ;) .
I had not noticed until now that pyinstaller's GPL text is truncated ( @Legorooj FYI) !
If the license text was "made up" it would not score 99.0 or above. And if so, this should be considered a bug as well.
As it happens it is matched to an otherwise truncated copy of the GPL that was seen in a Linux kernel contribution circa 6 years ago per this commit https://github.com/nexB/scancode-toolkit/blame/8b02ed69cd0293b5813715d90bf934f4d43ff66e/src/licensedcode/data/rules/gpl-2.0_247.RULE . We keep track of weird text seen in the wild and we have this variant. The detection and its score (which is BTW based on the number of aligned matching words) may not be 100% in all cases. The detection score could be treated as an indication of detection ambiguity. In this truncated example there is no ambiguity at all that the GPL applies.
Does scancode consider the or-later in gpl-2.0-or-later to be an exception to gpl-2.0-only (aka it needs to detect two licenses)?
ScanCode detects texts using diffs against its database of texts. There are no relationships between licenses: they are entirely independent things, just a bunch of texts.
Note that the `GPL-2.0-or-later` does not have a text that is different from the plain `GPL 2.0` text at the FSF: there is only one GPL 2.0 text. What is different is an additional notice, statement or similar provided by the authors that the GPL-2.0 or any later version applies; like the one you added on top of the truncated GPL text.
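To illustrate why a notice and a license text come out as two separate matches, here is a toy sketch using Python's standard `difflib`. It is not ScanCode's actual matching code: the two "rules" are tiny made-up stand-ins for database entries, and `coverage` is a hypothetical helper.

```python
from difflib import SequenceMatcher

# Tiny stand-ins for two independent entries in a rule database.
RULES = {
    "gpl-2.0": "gnu general public license version 2 terms and conditions",
    "gpl-2.0-plus": "either version 2 of the license or any later version",
}


def coverage(rule_text, scanned_text):
    """Percentage of the rule's words that align with the scanned text."""
    rule_words = rule_text.split()
    scanned_words = scanned_text.split()
    matcher = SequenceMatcher(None, rule_words, scanned_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * matched / len(rule_words)


# A file with an or-later notice on top, followed by the license text.
scanned = (
    "either version 2 of the license or any later version "
    "gnu general public license version 2 terms and conditions"
)
for key, rule_text in RULES.items():
    print(key, coverage(rule_text, scanned))
```

Each rule is compared on its own, so both are fully covered by the same file, and nothing in this comparison knows that the two texts belong together.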
As it happens it is matched to an otherwise truncated copy of the GPL that was seen in a Linux kernel contribution circa 6 years ago per this commit https://github.com/nexB/scancode-toolkit/blame/8b02ed69cd0293b5813715d90bf934f4d43ff66e/src/licensedcode/data/rules/gpl-2.0_247.RULE . We keep track of weird text seen in the wild and we have this variant. The detection and its score (which is BTW based on the number of aligned matching words) may not be 100% in all cases. The detection score could be treated as an indication of detection ambiguity. In this truncated example there is no ambiguity at all that the GPL applies.
Just to clarify that since this is not ambiguous we report a high score when matched. But since this is a truncated text, this is not considered as a proper GPL full text, but only tagged as a notice in https://github.com/nexB/scancode-toolkit/blob/ded56e9120f5fdfb9a1a0309130bb4305a66aacb/src/licensedcode/data/rules/gpl-2.0_247.yml#L2
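For reference, the numbers in the JSON report from this issue are consistent with the coverage being the share of matched rule words. A small sketch, assuming `match_coverage = matched_length / rule_length * 100`; weighting that by `rule_relevance` to get the score is an assumption here, and both helper names are hypothetical.

```python
def match_coverage(matched_length, rule_length):
    """Percentage of the rule's words that were matched."""
    return round(matched_length / rule_length * 100, 2)


def match_score(matched_length, rule_length, rule_relevance=100):
    """Coverage weighted by rule relevance (assumed formula)."""
    coverage = match_coverage(matched_length, rule_length)
    return round(coverage * rule_relevance / 100, 2)


# matched_length and rule_length values taken from the report in this issue.
print(match_score(2873, 2931))  # gpl-2.0.LICENSE
print(match_score(5610, 5612))  # gpl-3.0_466.RULE
print(match_score(113, 113))    # gpl-2.0-plus_420.RULE
```

With these inputs the sketch reproduces the reported 98.02, 99.96 and 100.0, which is why a short notice that matches completely outscores a long text that is missing a few words.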
ScanCode detects texts using diffs against its database of texts. There are no relationships between licenses: they are entirely independent things, just a bunch of texts.
Ok, so it just diffs against all (appropriate/possible) licenses? One by one?
Note that the `GPL-2.0-or-later` does not have a text that is different from the plain `GPL 2.0` text at the FSF: there is only one GPL 2.0 text.
I am aware.
What is different is an additional notice, statement or similar provided by the authors that the GPL-2.0 or any later version applies; like the one you added on top of the truncated GPL text.
So my question was/is: for `gpl-2.0-or-later` detection, does scancode diff only against the `or-later-notice` or against the notice plus the full GPL 2.0 text? I would expect the latter.
Further, I would expect it to diff the whole license text against `gpl-2.0-or-later` and then against `gpl-2.0-only`, as per your explanation above.
Atm it looks like scancode just decides, in the middle of the license text, that the `gpl-2.0-only` diff looked better than the `gpl-2.0-or-later` diff. Which makes 0 sense to me.
Please explain.
Just to clarify that since this is not ambiguous we report a high score when matched. But since this is a truncated text, this is not considered as a proper GPL full text, but only tagged as a notice in
Ok, in case of truncated texts, we can still have a high score, but the finding is marked as a `notice` not a `license` somewhere in the report? This is valuable information and should be utilized by ORT and other users.
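A consumer of the report could read this distinction from the `matched_rule` flags. A minimal sketch, assuming the `is_license_text` and `is_license_notice` fields as they appear in the ScanCode 31.2.1 JSON of this issue; `match_kind` is a hypothetical helper, not something ScanCode or ORT provides.

```python
def match_kind(match):
    """Classify a license match by the flags of the rule it matched."""
    rule = match.get("matched_rule", {})
    if rule.get("is_license_text"):
        return "text"
    if rule.get("is_license_notice"):
        return "notice"
    return "other"


# Flags copied from the gpl-2.0-plus match in the report of this issue.
or_later = {
    "key": "gpl-2.0-plus",
    "score": 100.0,
    "matched_rule": {
        "identifier": "gpl-2.0-plus_420.RULE",
        "is_license_text": False,
        "is_license_notice": True,
    },
}
print(or_later["key"], match_kind(or_later))
```

A downstream tool could then treat a high-scoring `notice` differently from a full `text` match instead of relying on the score alone.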
If you removed part of the text that's a difference alright, and the score that will be returned will be only partial unless there is a rule for this text variant. As a side note the GPL states that this is a violation of the GPL to truncate or remove parts of the GPL text ;) .
I had not noticed until now that pyinstaller's GPL text is truncated ( @Legorooj FYI) !
Ok, I readded the "how to apply" section from gnu.org: https://github.com/pepper-jk/compatibility_test/commit/3474dd876b1c4ce77f6b3044b1f37639f3922d61
And I indeed get different results. `GPL-2.0-only` has now a score of `86.62`. That is a huge difference and would make the threshold user option more usable. Of course, the best would be to detect `GPL-2.0-or-later` only, not both licenses.
However, now `gpl-1.0-plus` and `lgpl-2.0-plus` are detected as well, with a score of `100` each:
Ok, I readded the "how to apply" section from gnu.org: pepper-jk/compatibility_test@3474dd8
And I indeed get different results. GPL-2.0-only has now a score of 86.62. That is a huge difference and would make the threshold user option more usable. Of course, the best would be to detect GPL-2.0-or-later only, not both licenses. However, now gpl-1.0-plus and lgpl-2.0-plus are detected as well, with a score of 100 each:
In pepper-jk/compatibility_test@3474dd8 , you have now mixed the bottom of the LGPL 2.0 https://www.gnu.org/licenses/old-licenses/lgpl-2.0.txt from this line range https://gist.github.com/pombredanne/95ecbb5118e3bec8db343a0080947037#file-lgpl-2-0-txt-L439-file-lgpl-2-0-txt-L481 with the top of the GPL 2.0 from this line range https://gist.github.com/pombredanne/bd3ce201c53fa895f108ea4ee45dfb18#file-gpl-2-0-txt-L1-L280 which leads to interesting and creative detections for sure, none of which is fully conclusive as it should be. Again, made-up, synthetic texts are not something we care to detect exactly. We do detect them as weird because that's what they are. In earnest, I find the detections you reported with this franken license pretty accurate otherwise.
I really suggest that you try reusing as-is existing texts seen in actual open source projects and stop trying to make up frankenstein-like license notices.
Of course, the best would be to detect GPL-2.0-or-later only, not both licenses.
I welcome a PR with new rules combining both that correspond to actual seen-in-the-wild cases referenced by a URL. I will reject made up license texts and notices.
In pepper-jk/compatibility_test@3474dd8 , you now mixed parts of the LGPL 2.0
Oh. Thanks. I must have clicked on the wrong license, when retrieving this. I must admit, that I did this in haste today. Sorry about that mistake.
I have removed the LGPL license text. And to make sure the rest of the GPL is not "made up", I have replaced the whole license with the license found here: https://www.gnu.org/licenses/old-licenses/gpl-2.0.html https://github.com/pepper-jk/compatibility_test/commit/d65fec951de92e96b6b204c78a05115658e5555b
So for the final results:
`gpl-2.0-or-later` and `gpl-2.0-only` are both detected with a score of `100`. So the issue persists, even with a correct and complete `gpl-2.0-or-later` license.
I welcome a PR with new rules combining both that correspond to actual seen-in-the-wild cases referenced by a URL.
Before I could attempt writing a PR, I would need to know the answer to the questions I voiced above: https://github.com/nexB/scancode-toolkit/issues/3128#issuecomment-1293391487
I will reject made up license texts and notices.
I do not see the point in looking for a real world project. I have already shown (https://github.com/nexB/scancode-toolkit/issues/3128#issuecomment-1290628012) that the `pyinstaller` GPL produced the same result as this synthetic license. And I obviously can not use the whole license notice of `pyinstaller` for a test, as it contains several GPL `exceptions` and an `apache` license attached, which would falsify the results.
The only way to test this bug is with a "synthetic" GPL license, as it is supposed to be written. I have provided such a license and a scan now (https://github.com/nexB/scancode-toolkit/issues/3128#issuecomment-1294117170).
The only other way would be to look for a project that only uses this license and nothing else. I am not aware of a project that uses `GPL-2.0-or-later` and I will not waste my time looking for one, when I have a perfectly reproducible "synthetic" benchmark for this bug.
@pepper-jk @jens-erdmann if you are interested to learn how the license matching process works please see https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/README.rst There are also major reworks under way in https://github.com/nexB/scancode-toolkit/pull/2961 described in https://github.com/nexB/scancode-toolkit/issues/2878
@pepper-jk re:
I do not see the point in looking for a real world project. [...] The only way to test this bug is with a "synthetic" GPL license, as it is supposed to be written.
I am at a loss to help there. We do not do theoretical scanning as someone may suppose, but only practical scanning based on actually observed practices.
If you think you can find a way to conclude automatically that the notice at https://github.com/pyinstaller/pyinstaller/blob/93446fffd58ae33992031902607daa6a0ffc1996/COPYING.txt#L11-L13 and the license text fragment at https://github.com/pyinstaller/pyinstaller/blob/93446fffd58ae33992031902607daa6a0ffc1996/COPYING.txt#L110-L396 are in fact a single license statement, I will welcome a solution by all means.
We have new ways in https://github.com/nexB/scancode-toolkit/pull/2961 to combine multiple matches in derived larger detections and that could be a place to do this kind of combination, but then again it would have to be based on actual observed-in-the-wild, common enough cases to be worth the effort.
[...] I will not waste my time looking for one
Fair enough. I will not waste my time trying to help then.
I am at a loss to help there. We do not do theoretical scanning as someone may suppose, but only practical scanning based on actually observed practices.
I do not understand what you assume the "observed practice" is, aside from the practice you described as "putting the or-later notice in the front" and "not truncating the license" (one of which we already observed in the wild).
- `pyinstaller`, which you pointed me to, and is actually observed in the wild, but which had a truncated license as well. [from the wild]
- `or-later` notice in the beginning, like `pyinstaller` (contrary to the spdx version). [official source + "best practice"]
Over all three scans, we have seen they get very similar results, showing the same issue. The evidence is sufficient.
If you think you can find a way to conclude automatically that the notice at pyinstaller/pyinstaller@93446ff/COPYING.txt#L11-L13 and the license text fragment at pyinstaller/pyinstaller@93446ff/COPYING.txt#L110-L396 are in fact a single license statement, I will welcome a solution by all means.
We have new ways in #2961 to combine multiple matches in derived larger detections and that could be a place to do this kind of combination, but then again it would have to be based on actual observed-in-the-wild, common enough cases to be worth the effort.
I might look into this, if I find the time.
@pepper-jk @jens-erdmann if you are interested to learn how the license matching process works please see develop/src/licensedcode/README.rst There are also major reworks under way in #2961 described in #2878
Thanks for the information.
@pombredanne I'd like to apologize for my behavior last week. I grew frustrated while looking into the issue. But I should not have let that influence the discussion. I'll try to do better in the future.
Could you let me know if the evidence I provided above is sufficient to submit a pull request?
In case it is not, would the following tests be sufficient:
- `pyinstaller` license as is (including the truncated `GPL-2.0-or-later`)
- `pyinstaller` license after adding the missing `GPL` text
@pepper-jk Thanks! But no worries, I am cool! Your approach seems sensible. Please submit the PR and I can best comment on it there.
Background
I am currently working on copyleft license compatibility rules for ORT. For this I created a test repository containing `GPL-2.0-or-later` and `GPL-3.0` (later I added some CC licenses as well, but those are not relevant to the issue). Both license texts were copied from spdx.org.
ORT issue: https://github.com/oss-review-toolkit/ort/issues/5967
Description of Bug
When scanning said test repository with ORT, scancode detects both `GPL-2.0-only` and `GPL-2.0-or-later`, even though only `GPL-2.0-or-later` was added to the repo:
As you can see scancode gives `GPL-2.0-only` a `98.02` score, even though it ignores the last 4 lines of the license text. And `GPL-2.0-or-later` is detected only in the last four lines.
Later I ran a scan directly with scancode, getting the same results:
complete json
``` { "headers": [ { "tool_name": "scancode-toolkit", "tool_version": "31.2.1", "options": { "input": [ "." ], "--copyright": true, "--info": true, "--json-pp": "report-scancode.json", "--license": true, "--package": true, "--verbose": true }, "notice": "Generated with ScanCode and provided on an \"AS IS\" BASIS, WITHOUT WARRANTIES\nOR CONDITIONS OF ANY KIND, either express or implied. No content created from\nScanCode should be considered or used as legal advice. Consult an Attorney\nfor any legal advice.\nScanCode is a free software code scanning tool from nexB Inc. and others.\nVisit https://github.com/nexB/scancode-toolkit/ for support and download.", "start_timestamp": "2022-10-19T140338.813472", "end_timestamp": "2022-10-19T140353.061902", "output_format_version": "2.0.0", "duration": 14.248444318771362, "message": null, "errors": [], "warnings": [], "extra_data": { "system_environment": { "operating_system": "linux", "cpu_architecture": "64", "platform": "Linux-5.15.0-50-generic-x86_64-with-glibc2.35", "platform_version": "#56-Ubuntu SMP Tue Sep 20 13:23:26 UTC 2022", "python_version": "3.10.6 (main, Aug 10 2022, 11:40:04) [GCC 11.3.0]" }, "spdx_license_list_version": "3.17", "files_count": 5 } } ], "dependencies": [], "packages": [], "files": [ { "path": "compatibility_test", "type": "directory", "name": "compatibility_test", "base_name": "compatibility_test", "extension": "", "size": 0, "date": null, "sha1": null, "md5": null, "sha256": null, "mime_type": null, "file_type": null, "programming_language": null, "is_binary": false, "is_text": false, "is_archive": false, "is_media": false, "is_source": false, "is_script": false, "licenses": [], "license_expressions": [], "percentage_of_license_text": 0, "copyrights": [], "holders": [], "authors": [], "package_data": [], "for_packages": [], "files_count": 5, "dirs_count": 0, "size_count": 94469, "scan_errors": [] }, { "path": "compatibility_test/CC-BY-NC-SA-3.0.txt", "type": "file", "name": 
"CC-BY-NC-SA-3.0.txt", "base_name": "CC-BY-NC-SA-3.0", "extension": ".txt", "size": 21448, "date": "2022-10-18", "sha1": "7295cb93cd11ad9912bbc495a3ef6d7a91cdb44c", "md5": "666f6d1f58d456548a3156f86e7a3146", "sha256": "0abe2645856e5e739bf858bbedc527bb58d053f628880c9e9a26ef85d2f7c713", "mime_type": "text/plain", "file_type": "ASCII text, with very long lines", "programming_language": null, "is_binary": false, "is_text": true, "is_archive": false, "is_media": false, "is_source": false, "is_script": false, "licenses": [ { "key": "cc-by-nc-sa-3.0", "score": 100.0, "name": "Creative Commons Attribution Non-Commercial Share Alike License 3.0", "short_name": "CC-BY-NC-SA-3.0", "category": "Source-available", "is_exception": false, "is_unknown": false, "owner": "Creative Commons", "homepage_url": "http://creativecommons.org/licenses/by-nc-sa/3.0/", "text_url": "http://creativecommons.org/licenses/by-nc-sa/3.0/legalcode", "reference_url": "https://scancode-licensedb.aboutcode.org/cc-by-nc-sa-3.0", "scancode_text_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/cc-by-nc-sa-3.0.LICENSE", "scancode_data_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/cc-by-nc-sa-3.0.yml", "spdx_license_key": "CC-BY-NC-SA-3.0", "spdx_url": "https://spdx.org/licenses/CC-BY-NC-SA-3.0", "start_line": 3, "end_line": 63, "matched_rule": { "identifier": "cc-by-nc-sa-3.0_47.RULE", "license_expression": "cc-by-nc-sa-3.0", "licenses": [ "cc-by-nc-sa-3.0" ], "referenced_filenames": [], "is_license_text": true, "is_license_notice": false, "is_license_reference": false, "is_license_tag": false, "is_license_intro": false, "has_unknown": false, "matcher": "1-hash", "rule_length": 3360, "matched_length": 3360, "match_coverage": 100.0, "rule_relevance": 100 } } ], "license_expressions": [ "cc-by-nc-sa-3.0" ], "percentage_of_license_text": 100.0, "copyrights": [], "holders": [], "authors": [], "package_data": [], 
"for_packages": [], "files_count": 0, "dirs_count": 0, "size_count": 0, "scan_errors": [] }, { "path": "compatibility_test/CC-BY-NC-SA-4.0.txt", "type": "file", "name": "CC-BY-NC-SA-4.0.txt", "base_name": "CC-BY-NC-SA-4.0", "extension": ".txt", "size": 19066, "date": "2022-10-18", "sha1": "c7b58be452219bd8b74710dd9e6e7ed449517da0", "md5": "4a206eed80a3482b8fbf26350ead4538", "sha256": "f5992db54c7473dcda6f16a40679d305aa31440489fdcb1eaca84d7da6290ee4", "mime_type": "text/plain", "file_type": "UTF-8 Unicode text, with very long lines", "programming_language": null, "is_binary": false, "is_text": true, "is_archive": false, "is_media": false, "is_source": false, "is_script": false, "licenses": [ { "key": "cc-by-nc-sa-4.0", "score": 100.0, "name": "Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License", "short_name": "CC-BY-NC-SA-4.0", "category": "Source-available", "is_exception": false, "is_unknown": false, "owner": "Creative Commons", "homepage_url": "http://creativecommons.org/licenses/by-nc-sa/4.0/", "text_url": "http://creativecommons.org/licenses/by-nc-sa/4.0/legalcode", "reference_url": "https://scancode-licensedb.aboutcode.org/cc-by-nc-sa-4.0", "scancode_text_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/cc-by-nc-sa-4.0.LICENSE", "scancode_data_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/cc-by-nc-sa-4.0.yml", "spdx_license_key": "CC-BY-NC-SA-4.0", "spdx_url": "https://spdx.org/licenses/CC-BY-NC-SA-4.0", "start_line": 3, "end_line": 117, "matched_rule": { "identifier": "cc-by-nc-sa-4.0_23.RULE", "license_expression": "cc-by-nc-sa-4.0", "licenses": [ "cc-by-nc-sa-4.0" ], "referenced_filenames": [], "is_license_text": true, "is_license_notice": false, "is_license_reference": false, "is_license_tag": false, "is_license_intro": false, "has_unknown": false, "matcher": "1-hash", "rule_length": 2861, "matched_length": 2861, "match_coverage": 
100.0, "rule_relevance": 100 } } ], "license_expressions": [ "cc-by-nc-sa-4.0" ], "percentage_of_license_text": 100.0, "copyrights": [], "holders": [], "authors": [], "package_data": [], "for_packages": [], "files_count": 0, "dirs_count": 0, "size_count": 0, "scan_errors": [] }, { "path": "compatibility_test/license-gpl-2.0-or-later.txt", "type": "file", "name": "license-gpl-2.0-or-later.txt", "base_name": "license-gpl-2.0-or-later", "extension": ".txt", "size": 18229, "date": "2022-10-18", "sha1": "040cfa18ce31bbc2748537fe3f8aedfb25af6165", "md5": "d736fb04757076bd90e1677e9aa37230", "sha256": "1cc9cfdb6b5d3737e3c672d938c0e1ed7070ed1ed631b4e881076e1eabaafccf", "mime_type": "text/plain", "file_type": "ASCII text, with very long lines", "programming_language": null, "is_binary": false, "is_text": true, "is_archive": false, "is_media": false, "is_source": false, "is_script": false, "licenses": [ { "key": "gpl-2.0", "score": 98.02, "name": "GNU General Public License 2.0", "short_name": "GPL 2.0", "category": "Copyleft", "is_exception": false, "is_unknown": false, "owner": "Free Software Foundation (FSF)", "homepage_url": "http://www.gnu.org/licenses/gpl-2.0.html", "text_url": "http://www.gnu.org/licenses/gpl-2.0.txt", "reference_url": "https://scancode-licensedb.aboutcode.org/gpl-2.0", "scancode_text_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/gpl-2.0.LICENSE", "scancode_data_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/gpl-2.0.yml", "spdx_license_key": "GPL-2.0-only", "spdx_url": "https://spdx.org/licenses/GPL-2.0-only", "start_line": 3, "end_line": 110, "matched_rule": { "identifier": "gpl-2.0.LICENSE", "license_expression": "gpl-2.0", "licenses": [ "gpl-2.0" ], "referenced_filenames": [], "is_license_text": true, "is_license_notice": false, "is_license_reference": false, "is_license_tag": false, "is_license_intro": false, "has_unknown": false, "matcher": "3-seq", 
"rule_length": 2931, "matched_length": 2873, "match_coverage": 98.02, "rule_relevance": 100 } }, { "key": "gpl-2.0-plus", "score": 100.0, "name": "GNU General Public License 2.0 or later", "short_name": "GPL 2.0 or later", "category": "Copyleft", "is_exception": false, "is_unknown": false, "owner": "Free Software Foundation (FSF)", "homepage_url": "http://www.gnu.org/licenses/old-licenses/gpl-2.0-standalone.html", "text_url": "http://www.gnu.org/licenses/old-licenses/gpl-2.0-standalone.html", "reference_url": "https://scancode-licensedb.aboutcode.org/gpl-2.0-plus", "scancode_text_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/gpl-2.0-plus.LICENSE", "scancode_data_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/gpl-2.0-plus.yml", "spdx_license_key": "GPL-2.0-or-later", "spdx_url": "https://spdx.org/licenses/GPL-2.0-or-later", "start_line": 110, "end_line": 114, "matched_rule": { "identifier": "gpl-2.0-plus_420.RULE", "license_expression": "gpl-2.0-plus", "licenses": [ "gpl-2.0-plus" ], "referenced_filenames": [], "is_license_text": false, "is_license_notice": true, "is_license_reference": false, "is_license_tag": false, "is_license_intro": false, "has_unknown": false, "matcher": "2-aho", "rule_length": 113, "matched_length": 113, "match_coverage": 100.0, "rule_relevance": 100 } } ], "license_expressions": [ "gpl-2.0", "gpl-2.0-plus" ], "percentage_of_license_text": 98.94, "copyrights": [ { "copyright": "Copyright (c) 1989, 1991 Free Software Foundation, Inc.", "start_line": 6, "end_line": 6 }, { "copyright": "copyrighted by the Free Software Foundation", "start_line": 69, "end_line": 69 } ], "holders": [ { "holder": "Free Software Foundation, Inc.", "start_line": 6, "end_line": 6 }, { "holder": "the Free Software Foundation", "start_line": 69, "end_line": 69 } ], "authors": [], "package_data": [], "for_packages": [], "files_count": 0, "dirs_count": 0, "size_count": 0, 
"scan_errors": [] }, { "path": "compatibility_test/license-gpl-3.0.txt", "type": "file", "name": "license-gpl-3.0.txt", "base_name": "license-gpl-3.0", "extension": ".txt", "size": 35405, "date": "2022-10-18", "sha1": "579b08f7066f9491391a5eb2e9f238a71f4d4981", "md5": "2103bb15c50dc81f64b2315ab249ad3d", "sha256": "44d90a331f505bd19b626cef0e3ae59da08830cf8b9efbde5f71025faa3c463b", "mime_type": "text/plain", "file_type": "UTF-8 Unicode text, with very long lines", "programming_language": null, "is_binary": false, "is_text": true, "is_archive": false, "is_media": false, "is_source": false, "is_script": false, "licenses": [ { "key": "gpl-3.0", "score": 99.96, "name": "GNU General Public License 3.0", "short_name": "GPL 3.0", "category": "Copyleft", "is_exception": false, "is_unknown": false, "owner": "Free Software Foundation (FSF)", "homepage_url": "http://www.gnu.org/licenses/gpl-3.0.html", "text_url": "http://www.gnu.org/licenses/gpl-3.0-standalone.html", "reference_url": "https://scancode-licensedb.aboutcode.org/gpl-3.0", "scancode_text_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/gpl-3.0.LICENSE", "scancode_data_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/gpl-3.0.yml", "spdx_license_key": "GPL-3.0-only", "spdx_url": "https://spdx.org/licenses/GPL-3.0-only", "start_line": 3, "end_line": 211, "matched_rule": { "identifier": "gpl-3.0_466.RULE", "license_expression": "gpl-3.0", "licenses": [ "gpl-3.0" ], "referenced_filenames": [], "is_license_text": true, "is_license_notice": false, "is_license_reference": false, "is_license_tag": false, "is_license_intro": false, "has_unknown": false, "matcher": "3-seq", "rule_length": 5612, "matched_length": 5610, "match_coverage": 99.96, "rule_relevance": 100 } } ], "license_expressions": [ "gpl-3.0" ], "percentage_of_license_text": 99.95, "copyrights": [ { "copyright": "Copyright (c) 2007 Free Software Foundation, Inc.
```
How To Reproduce
System configuration
Also occurs with ORT, which uses `ScanCode 30.1.0` or older (the ort-image we use in the pipeline is older than the current ORT master).