Open pombredanne opened 3 years ago
Philippe, yes: uppercase versus lowercase would be a good way to distinguish these, and so would whether the file is binary. Having these as OR conditions in the heuristics would be great, but I'll double-check all the small license rules to confirm that this holds, just to be sure.
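A minimal sketch of what such an OR-combined check could look like. This is not ScanCode's actual code: the function name `is_likely_false_positive`, its parameters, and the idea of comparing the matched word against the rule's own spelling are all hypothetical, chosen only to illustrate the heuristic.

```python
def is_likely_false_positive(matched_text, rule_word, in_binary,
                             rule_length=1, max_short_rule_length=1):
    """Flag a very short match as a probable false positive.

    Hypothetical helper, not part of ScanCode.
    A short match is flagged when it was found in a binary file OR when
    its letter case differs from the plausible spellings of the rule word.
    """
    # Only very short rules get this extra scrutiny.
    if rule_length > max_short_rule_length:
        return False

    # Drop surrounding punctuation, e.g. "lGPl~=" becomes "lGPl".
    word = "".join(c for c in matched_text if c.isalnum())

    # Assumption: the rule's own spelling, all-lowercase, and all-uppercase
    # are the only casings accepted as genuine.
    plausible = {rule_word, rule_word.lower(), rule_word.upper()}
    odd_case = word not in plausible

    return in_binary or odd_case
```

With these assumptions, `is_likely_false_positive("lGPl~=", "LGPL", in_binary=False)` is flagged on case alone, while a correctly cased `"ALv2@"` is flagged only when it comes from a binary.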
There are also things to consider beyond letter case, such as:
See also #797
@chinyeungli FYI
Another case: "LICENSE.gpl.\n\n3." should not be detected as gpl-3.0_rdesc_1.RULE because of the two empty lines.
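The blank-line case above could be caught with a small check on the matched text. This is a sketch only: `spans_blank_line` is a hypothetical helper, not a ScanCode function, and it assumes the raw matched text, with its original line breaks, is available to the caller.

```python
import re

# A blank line: a newline followed by optional spaces or tabs, then another newline.
_BLANK_LINE = re.compile(r"\n[ \t]*\r?\n")


def spans_blank_line(matched_text):
    """Return True if the matched text crosses at least one blank line.

    Hypothetical helper: a short rule whose match is split by a blank line
    is probably made of unrelated fragments that happen to sit near each other.
    """
    return _BLANK_LINE.search(matched_text) is not None
```

A short match such as `"LICENSE.gpl.\n\n3."` would then be rejected, while the same text with a single line break would be kept.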
The attached binary contains three false positive detections: false-positive-in-binaries.zip
headers:
  - tool_name: scancode-toolkit
    tool_version: 30.0.0
    options:
      input:
        - false-positive-in-binaries.zip
      --license: yes
      --license-text: yes
      --license-text-diagnostics: yes
      --yaml: '-'
    notice: |
      Generated with ScanCode and provided on an "AS IS" BASIS, WITHOUT WARRANTIES
      OR CONDITIONS OF ANY KIND, either express or implied. No content created from
      ScanCode should be considered or used as legal advice. Consult an Attorney
      for any legal advice.
      ScanCode is a free software code scanning tool from nexB Inc. and others.
      Visit https://github.com/nexB/scancode-toolkit/ for support and download.
    start_timestamp: '2021-10-06T212057.579137'
    end_timestamp: '2021-10-06T212059.366303'
    output_format_version: 1.0.0
    duration: '1.7871878147125244'
    message:
    errors: []
    extra_data:
      spdx_license_list_version: '3.14'
      files_count: 1
files:
  - path: false-positive-in-binaries.zip
    type: file
    licenses:
      - key: apache-2.0
        score: '95.0'
        name: Apache License 2.0
        short_name: Apache 2.0
        category: Permissive
        is_exception: no
        is_unknown: no
        owner: Apache Software Foundation
        homepage_url: http://www.apache.org/licenses/
        text_url: http://www.apache.org/licenses/LICENSE-2.0
        reference_url: https://scancode-licensedb.aboutcode.org/apache-2.0
        scancode_text_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/apache-2.0.LICENSE
        scancode_data_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/apache-2.0.yml
        spdx_license_key: Apache-2.0
        spdx_url: https://spdx.org/licenses/Apache-2.0
        start_line: 1
        end_line: 1
        matched_rule:
          identifier: apache-2.0_388.RULE
          license_expression: apache-2.0
          licenses:
            - apache-2.0
          referenced_filenames: []
          is_license_text: no
          is_license_notice: no
          is_license_reference: yes
          is_license_tag: no
          is_license_intro: no
          has_unknown: no
          matcher: 2-aho
          rule_length: 1
          matched_length: 1
          match_coverage: '100.0'
          rule_relevance: 95
          matched_text: ALv2@
      - key: lgpl-2.0-plus
        score: '75.0'
        name: GNU Library General Public License 2.0 or later
        short_name: LGPL 2.0 or later
        category: Copyleft Limited
        is_exception: no
        is_unknown: no
        owner: Free Software Foundation (FSF)
        homepage_url: http://www.gnu.org/licenses/old-licenses/lgpl-2.0.html
        text_url: http://www.gnu.org/licenses/old-licenses/lgpl-2.0-standalone.html
        reference_url: https://scancode-licensedb.aboutcode.org/lgpl-2.0-plus
        scancode_text_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/lgpl-2.0-plus.LICENSE
        scancode_data_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/lgpl-2.0-plus.yml
        spdx_license_key: LGPL-2.0-or-later
        spdx_url: https://spdx.org/licenses/LGPL-2.0-or-later
        start_line: 3
        end_line: 3
        matched_rule:
          identifier: lgpl_bare_single_word.RULE
          license_expression: lgpl-2.0-plus
          licenses:
            - lgpl-2.0-plus
          referenced_filenames: []
          is_license_text: no
          is_license_notice: no
          is_license_reference: yes
          is_license_tag: no
          is_license_intro: no
          has_unknown: no
          matcher: 2-aho
          rule_length: 1
          matched_length: 1
          match_coverage: '100.0'
          rule_relevance: 75
          matched_text: lGPl~=
      - key: gpl-2.0
        score: '50.0'
        name: GNU General Public License 2.0
        short_name: GPL 2.0
        category: Copyleft
        is_exception: no
        is_unknown: no
        owner: Free Software Foundation (FSF)
        homepage_url: http://www.gnu.org/licenses/gpl-2.0.html
        text_url: http://www.gnu.org/licenses/gpl-2.0.txt
        reference_url: https://scancode-licensedb.aboutcode.org/gpl-2.0
        scancode_text_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/gpl-2.0.LICENSE
        scancode_data_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/gpl-2.0.yml
        spdx_license_key: GPL-2.0-only
        spdx_url: https://spdx.org/licenses/GPL-2.0-only
        start_line: 4
        end_line: 4
        matched_rule:
          identifier: gpl2_bare_word_only.RULE
          license_expression: gpl-2.0
          licenses:
            - gpl-2.0
          referenced_filenames: []
          is_license_text: no
          is_license_notice: no
          is_license_reference: no
          is_license_tag: yes
          is_license_intro: no
          has_unknown: no
          matcher: 2-aho
          rule_length: 1
          matched_length: 1
          match_coverage: '100.0'
          rule_relevance: 50
          matched_text: GPL2\
    license_expressions:
      - apache-2.0
      - lgpl-2.0-plus
      - gpl-2.0
    percentage_of_license_text: '50.0'
    scan_errors: []
gPL and similar strings are a source of noisy false positives. @AyanSinhaMahapatra what's your take on this?