Open qduanmu opened 4 years ago
@qduanmu Hello and thank you again! I hope everything is OK for you!
In each case the text matched is "matched_text": "ZPL-2.0\","
if you run the scan with --license --license-text --license-text-diagnostics
and there are two instances so there are two detections alright.
We could:
1. create a rule with "ZPL-2.0", new LicenseData(licenseID: "ZPL-2.0" but that would be weird as there are likely many more cases like that
2. create a false positive or a negative rule for most of the content of your file (I assume this is coming from this https://github.com/NuGet/NuGet.Client/blob/7bf0d060f3f1a680121ac17dbda01e6b15ef3b54/src/NuGet.Core/NuGet.Packaging/Licenses/NuGetLicenseData.cs ) but that would also be quite unwieldy
3. design something new to match these few cases of code that contain a lot of licenses that are NOT the licenses of the code, such as the one you have an issue with and many others such as https://github.com/jslicense/spdx-exceptions.json/blob/master/index.json or ... for instance scancode itself.
Both 1. and 2. would be quick fixes but would not be viable for the long term. I tend to think 3. is a better but harder approach. What do you think?
Jumping in a bit late. Stumbled on a file, gen.go, yesterday. There are two license texts in the file. Scancode (3.2.3 with -clipe) reports the following for this file:
.....
"license_expressions": [
"apache-2.0",
"apache-2.0"
],
.....
"copyrights": [
{
"value": "Copyright 2019 The Wuffs Authors",
"start_line": 1,
"end_line": 1
},
{
"value": "Copyright 2019 The Wuffs Authors",
"start_line": 58,
"end_line": 58
}
],
"holders": [
{
"value": "The Wuffs Authors",
"start_line": 1,
"end_line": 1
},
{
"value": "The Wuffs Authors",
"start_line": 58,
"end_line": 58
}
],
....
So, license_expressions, copyrights and holders are all stated (by authors) and reported (by Scancode) twice.
I am not sure multiple and verbatim copyright and/or license statements stated multiple times should be reported as one. OK, I admit the reason I looked at this issue was because I thought it was something spooky with Scancode and I did spend some time checking my scancode report analyser for errors.
Perhaps it simply should be up to the user (machine or human) to discard duplicate entries?
@hesa Hey! :wave: So I think this is a good case for effectively having a simplification here. There are two notices alright and scancode detects them all correctly, but a post-processing step would do nicely!
Unrelated: Maybe you should run the latest version? 3.2.3 is getting old!
I think I would prefer to do this post processing myself (i.e. let scancode report the two instances). So, for me, this issue can be closed.
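For anyone wanting to do the same, here is a minimal sketch of such a post-processing step. It assumes a per-file entry shaped like the excerpt above (`license_expressions` as a list of strings, `copyrights` and `holders` as lists of objects with `value`, `start_line` and `end_line`). The function name `dedupe_detections` and the merged `lines` field are made up for illustration and are not part of ScanCode.

```python
def dedupe_detections(file_entry):
    """Collapse verbatim repeats in one per-file scan entry.

    Hypothetical helper, not a ScanCode API. The first occurrence of each
    value is kept, and every line range where it was seen is recorded.
    """
    result = dict(file_entry)

    # license_expressions is a plain list of strings: keep first occurrences.
    unique_expressions = []
    for expression in file_entry.get("license_expressions", []):
        if expression not in unique_expressions:
            unique_expressions.append(expression)
    result["license_expressions"] = unique_expressions

    # copyrights and holders are lists of {value, start_line, end_line}.
    for key in ("copyrights", "holders"):
        merged = {}
        for item in file_entry.get(key, []):
            entry = merged.setdefault(
                item["value"], {"value": item["value"], "lines": []}
            )
            entry["lines"].append((item["start_line"], item["end_line"]))
        result[key] = list(merged.values())
    return result


# Sample data copied from the gen.go report quoted in this thread.
sample = {
    "license_expressions": ["apache-2.0", "apache-2.0"],
    "copyrights": [
        {"value": "Copyright 2019 The Wuffs Authors", "start_line": 1, "end_line": 1},
        {"value": "Copyright 2019 The Wuffs Authors", "start_line": 58, "end_line": 58},
    ],
    "holders": [
        {"value": "The Wuffs Authors", "start_line": 1, "end_line": 1},
        {"value": "The Wuffs Authors", "start_line": 58, "end_line": 58},
    ],
}

deduped = dedupe_detections(sample)
```

Keeping the line ranges means the information that the notice appears twice is not lost, only folded into one entry.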
Re unrelated :)
ScanCode version 21.3.31
Thank you for your quick response, @pombredanne , hope everything goes well with you! I haven't worked on this for quite a long time (I may be back in the near future), so I need to check the latest scancode first.
Both 1. and 2. would be quick fixes but would not be viable for the long term. I tend to think 3. is a better but harder approach. What do you think?
I second proposal 3: design something new (like a regex pattern/rule for the above files; yes, this is a hard approach for files like https://github.com/jslicense/spdx-exceptions.json/blob/master/index.json) to filter out their license matches as false positives, or even skip scanning the file. I will see if I can provide some more feedback after checking the latest update.
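To make the idea a bit more concrete, here is a rough sketch of what such a pre-filter could look like. It is purely an illustration and not how ScanCode works: `KNOWN_IDS` is a tiny hand-picked sample standing in for the full SPDX license list, and the function name and the `min_distinct` threshold are assumptions.

```python
import re

# Hypothetical, tiny sample of SPDX identifiers. A real check would load
# the complete SPDX license list instead.
KNOWN_IDS = {"ZPL-2.0", "ZPL-2.1", "MIT", "Apache-2.0", "Zlib"}

# Matches short quoted tokens such as "ZPL-2.0" or "MIT".
QUOTED_TOKEN = re.compile(r'"([A-Za-z0-9.+-]+)"')


def looks_like_license_listing(text, min_distinct=3):
    """Guess whether a file is a table of license identifiers.

    Returns True when at least `min_distinct` different known identifiers
    appear as quoted string literals. The threshold is an arbitrary
    assumption picked for this sketch.
    """
    found = {token for token in QUOTED_TOKEN.findall(text) if token in KNOWN_IDS}
    return len(found) >= min_distinct


# A few lines in the style of NuGetLicenseData.cs.
listing = """
{ "ZPL-2.0", new LicenseData(licenseID: "ZPL-2.0", isOsiApproved: true) },
{ "MIT", new LicenseData(licenseID: "MIT", isOsiApproved: true) },
{ "Zlib", new LicenseData(licenseID: "Zlib", isOsiApproved: true) },
"""
ordinary = 'var license = "MIT";'

is_listing = looks_like_license_listing(listing)
is_ordinary_listing = looks_like_license_listing(ordinary)
```

A file flagged this way could then be reported as license data rather than as code under all of these licenses. The threshold would need tuning against real files.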
@qduanmu Hey :wave: !
hope everything goes well with you!
Thank you and yes, A-OK here ... and I hope for you too. At the moment I think I went with 2. and several false positive rules were added, but that's not a satisfying solution for the long term. At least for https://raw.githubusercontent.com/jslicense/spdx-exceptions.json/master/index.json no licenses are detected anymore.
https://raw.githubusercontent.com/NuGet/NuGet.Client/7bf0d060f3f1a680121ac17dbda01e6b15ef3b54/src/NuGet.Core/NuGet.Packaging/Licenses/NuGetLicenseData.cs is still problematic though
@AyanSinhaMahapatra ^ FYI, you may have another idea for this issue?
The content of the scanned file is:
{ "ZPL-2.0", new LicenseData(licenseID: "ZPL-2.0", isOsiApproved: true, isDeprecatedLicenseId: false, isFsfLibre: true) }
The license detection result is below: