aboutcode-org / scancode-toolkit


Got CC-BY-NC-SA-3.0 for file mentioning elements under CC-BY-SA-3.0 #3703

Open jloehel opened 5 months ago

jloehel commented 5 months ago

Description

This code snippet:

 /*
  * Simple square root algorithm. This is from:
  * http://stackoverflow.com/questions/1623375/writing-your-own-square-root-function
  * Written by Chihung Yu
  * Creative Commons license
  * http://creativecommons.org/licenses/by-sa/3.0/legalcode
  * It has been modified to compile correctly, and for U-Boot style.
  */
static double tt_sqrt(double value)
{
    double lo = 1.0;
    double hi = value;

    while (hi - lo > 0.00001) {
        double mid = lo + (hi - lo) / 2;

        if (mid * mid - value > 0.00001)
            hi = mid;
        else
            lo = mid;
    }

    return lo;
}

from https://github.com/Xilinx/u-boot-xlnx/blob/3290b109bfa70d65ed4ce49ed84afed0ed4335e0/drivers/video/console_truetype.c#L39 gets detected as CC-BY-NC-SA-3.0 instead of CC-BY-SA-3.0

How To Reproduce

user@laptop:~$ scancode --license --copyright --json-pp uboot-test.json ./drivers/video/console_truetype.c

I have also tested it with the patches from https://github.com/nexB/scancode-toolkit/pull/3644, but they do not fix the issue.

Logs

Log Snippet with SCANCODE_DEBUG_LICENSE=True:

LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_23.RULE, sc=54.55, cov=54.55, len=6, hilen=3, rlen=11, qreg=(87, 92), ireg=(1, 8)
...
LicenseMatch: 'cc-by-nc-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-nc-sa-3.0_23.RULE, sc=54.55, cov=54.55, len=12, hilen=4, rlen=22, qreg=(85, 96), ireg=(3, 21)
...
LicenseMatch: 'cc-by-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-3.0_1.RULE, sc=100.0, cov=100.0, len=8, hilen=2, rlen=8, qreg=(88, 95), ireg=(0, 7)

...

matches before final merge: 2
matches before final merge MATCHED TEXTS
LicenseMatch: 'cc-by-nc-sa-3.0', lines=(43, 44), matcher='3-seq', rid=cc-by-nc-sa-3.0_23.RULE, sc=54.55, cov=54.55, len=12, hilen=4, rlen=22, qreg=(85, 96), ireg=(3, 21)
  MATCHED QUERY TEXT: Creative Commons license *
http://creativecommons.org/licenses/by-sa/3.0/legalcode
  MATCHED RULE TEXT: creative commons license <attribution> <noncommercial> <sharealike> <3> <0>
<unported> http creativecommons org licenses by <nc> sa 3 0 legalcode
LicenseMatch: 'gpl-2.0-plus', lines=(4, 4), matcher='1-spdx-id', rid=spdx-license-identifier-gpl_2_0_plus-6a7800b229dd3f061ec9f8deeaf5fc1cd1310d8d, sc=100.0, cov=100.0, len=6, hilen=3, rlen=6, qreg=(5, 10), ireg=(0, 5)
  MATCHED QUERY TEXT: SPDX-License-Identifier: GPL-2.0+
  MATCHED RULE TEXT: spdx license identifier gpl 2 0+
final matches: 2
final matches MATCHED TEXTS
LicenseMatch: 'gpl-2.0-plus', lines=(4, 4), matcher='1-spdx-id', rid=spdx-license-identifier-gpl_2_0_plus-6a7800b229dd3f061ec9f8deeaf5fc1cd1310d8d, sc=100.0, cov=100.0, len=6, hilen=3, rlen=6, qreg=(5, 10), ireg=(0, 5)
  MATCHED QUERY TEXT: SPDX-License-Identifier: GPL-2.0+
  MATCHED RULE TEXT: spdx license identifier gpl 2 0+
LicenseMatch: 'cc-by-nc-sa-3.0', lines=(43, 44), matcher='3-seq', rid=cc-by-nc-sa-3.0_23.RULE, sc=54.55, cov=54.55, len=12, hilen=4, rlen=22, qreg=(85, 96), ireg=(3, 21)
  MATCHED QUERY TEXT: Creative Commons license *
http://creativecommons.org/licenses/by-sa/3.0/legalcode
  MATCHED RULE TEXT: creative commons license <attribution> <noncommercial> <sharealike> <3> <0>
<unported> http creativecommons org licenses by <nc> sa 3 0 legalcode

How should I interpret the log output? Is the token <nc> of the URL optional here? Why is the match for src/licensedcode/data/rules/cc-by-3.0_1.RULE with 100% cov not considered?

System configuration

pombredanne commented 5 months ago

Thanks for the report!

How do I need to interpret the log output? Is the token <nc> of the URL optional here?

The angle brackets, as in <nc>, mark words from the rule that were NOT matched. All the license matching is word-based (each word is actually mapped to an integer internally, and we manipulate sequences of ints).
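To make the word-based view concrete, here is a minimal sketch of turning text into words and then into integer ids. The `tokenize` and `encode` helpers are hypothetical illustrations, not ScanCode's actual tokenizer or index, and the real word-splitting rules may differ (the log above shows `0+` kept as one token, for instance).

```python
import re

def tokenize(text):
    """Lowercase the text and split it into runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())

def encode(words, vocabulary):
    """Map each word to an integer id, assigning a new id on first sight."""
    return [vocabulary.setdefault(word, len(vocabulary)) for word in words]

vocabulary = {}
rule = "http://creativecommons.org/licenses/by-nc-sa/3.0/legalcode"
query = "http://creativecommons.org/licenses/by-sa/3.0/legalcode"

rule_words = tokenize(rule)
query_words = tokenize(query)

# Sequences of ints, one int per word.
rule_ids = encode(rule_words, vocabulary)
query_ids = encode(query_words, vocabulary)

# Rule words that never occur in the query. A debug trace would show
# such words in angle brackets, like <nc>.
unmatched = [w for w in rule_words if w not in set(query_words)]
```

With these two URLs, the only rule word missing from the query is `nc`, which is why it shows up in angle brackets in the matched rule text.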

Why is the match for src/licensedcode/data/rules/cc-by-3.0_1.RULE with 100% cov not considered?

It is shorter and contained in another match. Let me explain what happens.

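You get these three matches (the LicenseMatch lines from your log snippet):

  1. Match 1: 'cc-by-3.0', rid=cc-by-3.0_23.RULE
  2. Match 2: 'cc-by-nc-sa-3.0', rid=cc-by-nc-sa-3.0_23.RULE
  3. Match 3: 'cc-by-sa-3.0', rid=cc-by-sa-3.0_1.RULE

The Match 1 attributes are: sc=54.55, cov=54.55, len=6, hilen=3, rlen=11, qreg=(87, 92), ireg=(1, 8)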

Here is what happens:

  1. Match 3 is entirely contained (word-wise) in the longer Match 2 and is therefore discarded.
  2. Matches 1 and 2 have the same score and overlap significantly, yet Match 2 is longer by one word and has one more legalese word, and hence wins.
  3. You end up with Match 2.
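The steps above can be roughly sketched with the three LicenseMatch lines from the log snippet. This is a simplified illustration, not ScanCode's real filtering code: `parse`, `region_contains` and the tie-break key are hypothetical, and containment is checked on the `qreg` span only rather than on the exact matched word positions.

```python
import re

LOG_LINES = [
    "LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_23.RULE, "
    "sc=54.55, cov=54.55, len=6, hilen=3, rlen=11, qreg=(87, 92), ireg=(1, 8)",
    "LicenseMatch: 'cc-by-nc-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-nc-sa-3.0_23.RULE, "
    "sc=54.55, cov=54.55, len=12, hilen=4, rlen=22, qreg=(85, 96), ireg=(3, 21)",
    "LicenseMatch: 'cc-by-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-3.0_1.RULE, "
    "sc=100.0, cov=100.0, len=8, hilen=2, rlen=8, qreg=(88, 95), ireg=(0, 7)",
]

PATTERN = re.compile(
    r"rid=(?P<rid>[^,]+), sc=(?P<sc>[\d.]+), cov=(?P<cov>[\d.]+), "
    r"len=(?P<len>\d+), hilen=(?P<hilen>\d+), rlen=(?P<rlen>\d+), "
    r"qreg=\((?P<qstart>\d+), (?P<qend>\d+)\)"
)

def parse(line):
    """Extract the numeric attributes of one LicenseMatch debug line."""
    fields = PATTERN.search(line).groupdict()
    rid = fields.pop("rid")
    parsed = {k: float(v) if "." in v else int(v) for k, v in fields.items()}
    parsed["rid"] = rid
    return parsed

def region_contains(outer, inner):
    """True if inner's query region lies within outer's query region."""
    return outer["qstart"] <= inner["qstart"] and inner["qend"] <= outer["qend"]

def coverage(match):
    """Share of the rule's words that were matched, as a percentage."""
    return round(100 * match["len"] / match["rlen"], 2)

match1, match2, match3 = (parse(line) for line in LOG_LINES)

# Step 1: Match 3 sits inside the longer Match 2, so it is dropped.
match3_is_contained = region_contains(match2, match3)

# Step 2: Match 1 and Match 2 tie on score; prefer more legalese words
# (hilen), then more matched words (len). Assumed ordering, for illustration.
winner = max((match1, match2), key=lambda m: (m["hilen"], m["len"]))
```

In these log lines, `cov` equals `len / rlen` as a percentage (6 of 11 words and 12 of 22 words both give 54.55), which is how two quite different rules end up with the same score here.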

FWIW, this is something @AyanSinhaMahapatra is working on fixing at a larger scale, likely with https://github.com/nexB/scancode-toolkit/pull/3254

In the meantime, the fix is going to be like the one in https://github.com/nexB/scancode-toolkit/pull/3644. The faulty rule is https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/cc-by-nc-sa-3.0_23.RULE and the cure will be to add required phrases and maybe a new rule.

  1. add curly braces in https://github.com/nexB/scancode-toolkit/blob/develop/src/licensedcode/data/rules/cc-by-nc-sa-3.0_23.RULE

    licensed under the Creative Commons License,
    {{Attribution-NonCommercial-ShareAlike 3.0 }} unported.
    {{http://creativecommons.org/licenses/by-nc-sa/3.0/legalcode}}
  2. create a new rule for good measure for cc-by-sa-3.0 with this text.

    Creative Commons license
    {{ http://creativecommons.org/licenses/by-sa/3.0/legalcode }}
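
As a rough illustration of what the curly braces are meant to achieve: a rule with required phrases should only count as matched when every `{{...}}` phrase appears in the scanned text. The sketch below is a hypothetical stand-alone check using the two rule texts above; it is not how ScanCode implements required phrases internally.

```python
import re

REQUIRED = re.compile(r"\{\{(.*?)\}\}", re.DOTALL)

def words(text):
    """Lowercase the text and split it into runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())

def required_phrases(rule_text):
    """Return each {{...}} phrase of a rule as a list of words."""
    return [words(phrase) for phrase in REQUIRED.findall(rule_text)]

def contains_phrase(query_words, phrase):
    """True if phrase occurs as a contiguous run of words in the query."""
    n = len(phrase)
    return any(query_words[i:i + n] == phrase
               for i in range(len(query_words) - n + 1))

def satisfies(rule_text, query_text):
    """True if every required phrase of the rule is present in the query."""
    query_words = words(query_text)
    return all(contains_phrase(query_words, p) for p in required_phrases(rule_text))

RULE_NC_SA = """licensed under the Creative Commons License,
{{Attribution-NonCommercial-ShareAlike 3.0 }} unported.
{{http://creativecommons.org/licenses/by-nc-sa/3.0/legalcode}}"""

RULE_SA = """Creative Commons license
{{ http://creativecommons.org/licenses/by-sa/3.0/legalcode }}"""

QUERY = """Creative Commons license
http://creativecommons.org/licenses/by-sa/3.0/legalcode"""

nc_sa_ok = satisfies(RULE_NC_SA, QUERY)
sa_ok = satisfies(RULE_SA, QUERY)
```

Under this check the by-nc-sa rule is rejected for the U-Boot comment, because neither of its required phrases is present, while the new by-sa rule is accepted.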

jloehel commented 5 months ago

Hi @pombredanne,

Thank you very much for the detailed explanation. It helps a lot to understand what's going on. I have tried the suggested quick fix:

From c5cac48ec306551531c06afc1c5ecc5227319ca7 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?J=C3=BCrgen=20L=C3=B6hel?= <juergen.loehel@xxx>
Date: Thu, 21 Mar 2024 11:04:18 -0600
Subject: [PATCH] Quick fix for #3703
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Jürgen Löhel <juergen.loehel@xxx>
---
 src/licensedcode/data/rules/cc-by-nc-sa-3.0_23.RULE |  4 ++--
 src/licensedcode/data/rules/cc-by-sa-3.0_104.RULE   | 10 ++++++++++
 2 files changed, 12 insertions(+), 2 deletions(-)
 create mode 100644 src/licensedcode/data/rules/cc-by-sa-3.0_104.RULE

diff --git a/src/licensedcode/data/rules/cc-by-nc-sa-3.0_23.RULE b/src/licensedcode/data/rules/cc-by-nc-sa-3.0_23.RULE
index 61ddf947d7..5e41d08f9d 100644
--- a/src/licensedcode/data/rules/cc-by-nc-sa-3.0_23.RULE
+++ b/src/licensedcode/data/rules/cc-by-nc-sa-3.0_23.RULE
@@ -7,5 +7,5 @@ ignorable_urls:
 ---

 licensed under the Creative Commons License,
-Attribution-NonCommercial-ShareAlike 3.0 unported.
-(http://creativecommons.org/licenses/by-nc-sa/3.0/legalcode)
\ No newline at end of file
+{{Attribution-NonCommercial-ShareAlike 3.0 }} unported.
+{{http://creativecommons.org/licenses/by-nc-sa/3.0/legalcode}}
diff --git a/src/licensedcode/data/rules/cc-by-sa-3.0_104.RULE b/src/licensedcode/data/rules/cc-by-sa-3.0_104.RULE
new file mode 100644
index 0000000000..a1eb4d3cc6
--- /dev/null
+++ b/src/licensedcode/data/rules/cc-by-sa-3.0_104.RULE
@@ -0,0 +1,10 @@
+---
+license_expression: cc-by-sa-3.0
+is_license_notice: yes
+relevance: 100
+ignorable_urls:
+    - http://creativecommons.org/licenses/by-sa/3.0/legalcode
+---
+
+Creative Commons license
+{{ http://creativecommons.org/licenses/by-sa/3.0/legalcode }}
-- 
2.35.3

But I still get this output (❯ cat uboot-test.json | jq '.files'):

[
  {
    "path": "console_truetype.c",
    "type": "file",
    "detected_license_expression": "gpl-2.0-plus",
    "detected_license_expression_spdx": "GPL-2.0-or-later",
    "license_detections": [
      {
        "license_expression": "gpl-2.0-plus",
        "license_expression_spdx": "GPL-2.0-or-later",
        "matches": [
          {
            "license_expression": "gpl-2.0-plus",
            "spdx_license_expression": "GPL-2.0-or-later",
            "from_file": "console_truetype.c",
            "start_line": 4,
            "end_line": 4,
            "matcher": "1-spdx-id",
            "score": 100,
            "matched_length": 6,
            "match_coverage": 100,
            "rule_relevance": 100,
            "rule_identifier": "spdx-license-identifier-gpl_2_0_plus-6a7800b229dd3f061ec9f8deeaf5fc1cd1310d8d",
            "rule_url": null
          }
        ],
        "identifier": "gpl_2_0_plus-1dbf5e2e-83b4-624b-6a02-72ea175258bd"
      }
    ],
    "license_clues": [
      {
        "license_expression": "cc-by-nc-sa-3.0",
        "spdx_license_expression": "CC-BY-NC-SA-3.0",
        "from_file": "console_truetype.c",
        "start_line": 43,
        "end_line": 44,
        "matcher": "3-seq",
        "score": 54.55,
        "matched_length": 12,
        "match_coverage": 54.55,
        "rule_relevance": 100,
        "rule_identifier": "cc-by-nc-sa-3.0_23.RULE",
        "rule_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/cc-by-nc-sa-3.0_23.RULE"
      }
    ],
    "percentage_of_license_text": 0.83,
    "copyrights": [
      {
        "copyright": "Copyright (c) 2016 Google, Inc",
        "start_line": 2,
        "end_line": 2
      }
    ],
    "holders": [
      {
        "holder": "Google, Inc",
        "start_line": 2,
        "end_line": 2
      }
    ],
    "authors": [
      {
        "author": "Chihung Yu Creative",
        "start_line": 42,
        "end_line": 43
      }
    ],
    "scan_errors": []
  }
]
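
For comparing runs before and after a rule change, a small script can pull the relevant fields out of this JSON. The sketch below works on a trimmed copy of the output above (only fields shown there are used); the `summarize` helper is a hypothetical name, and in practice the data would be loaded from uboot-test.json instead of an inline string.

```python
import json

# Trimmed copy of the '.files' array shown above.
SCAN_FILES = json.loads("""
[
  {
    "path": "console_truetype.c",
    "detected_license_expression": "gpl-2.0-plus",
    "license_detections": [
      {
        "license_expression": "gpl-2.0-plus",
        "matches": [
          {"matcher": "1-spdx-id", "score": 100, "start_line": 4, "end_line": 4}
        ]
      }
    ],
    "license_clues": [
      {
        "license_expression": "cc-by-nc-sa-3.0",
        "rule_identifier": "cc-by-nc-sa-3.0_23.RULE",
        "matcher": "3-seq",
        "score": 54.55,
        "start_line": 43,
        "end_line": 44
      }
    ]
  }
]
""")

def summarize(entry):
    """Return (detected expressions, clue tuples) for one scanned file."""
    detections = [d["license_expression"]
                  for d in entry.get("license_detections", [])]
    clues = [(c["license_expression"], c["rule_identifier"], c["score"])
             for c in entry.get("license_clues", [])]
    return detections, clues

for entry in SCAN_FILES:
    detections, clues = summarize(entry)
    print(entry["path"], "detections:", detections, "clues:", clues)
```

Note that in this output the cc-by-nc-sa-3.0 match is listed under license_clues, not license_detections, and it still comes from cc-by-nc-sa-3.0_23.RULE with the same 54.55 score as before the patch.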

I will add the full log here:

 Setup plugin: scan:licenses...
 Setup plugin: scan:copyrights...
 Setup plugin: output:json-pp...
Collect file inventory...
Scan files for: licenses, copyrights with 1 process(es)...
Index.match: for: ./drivers/video/console_truetype.c query: <licensedcode.query.Query object at 0x7f4936238340>

match_query: matching with matcher: aho
matched with: aho: 2
LicenseMatch: 'gpl-2.0-plus', lines=(4, 4), matcher='2-aho', rid=spdx_license_id_gpl-2.0+_for_gpl-2.0-plus.RULE, sc=50.0, cov=100.0, len=3, hilen=1, rlen=3, qreg=(8, 10), ireg=(0, 2)
LicenseMatch: 'cc-by-sa-3.0', lines=(44, 44), matcher='2-aho', rid=cc-by-sa-3.0_7.RULE, sc=100.0, cov=100.0, len=9, hilen=3, rlen=9, qreg=(88, 96), ireg=(0, 8)

match_query: matching with matcher: spdx_lid
matched with: spdx_lid: 1
LicenseMatch: 'gpl-2.0-plus', lines=(0, 0), matcher='1-spdx-id', rid=spdx-license-identifier-gpl_2_0_plus-6a7800b229dd3f061ec9f8deeaf5fc1cd1310d8d, sc=100.0, cov=100.0, len=6, hilen=3, rlen=6, qreg=(5, 10), ireg=(0, 5)

match_query: matching with matcher: seq
matched with: seq: 87
LicenseMatch: 'apache-2.0', lines=(0, 0), matcher='3-seq', rid=apache-2.0_92.RULE, sc=9.09, cov=9.09, len=1, hilen=1, rlen=11, qreg=(87, 87), ireg=(0, 0)
LicenseMatch: 'apache-2.0', lines=(0, 0), matcher='3-seq', rid=apache-2.0_92.RULE, sc=9.09, cov=9.09, len=1, hilen=1, rlen=11, qreg=(91, 91), ireg=(7, 7)
LicenseMatch: 'blueoak-1.0.0', lines=(0, 0), matcher='3-seq', rid=blueoak-1.0.0_10.RULE, sc=7.69, cov=7.69, len=1, hilen=1, rlen=13, qreg=(87, 87), ireg=(0, 0)
LicenseMatch: 'blueoak-1.0.0', lines=(0, 0), matcher='3-seq', rid=blueoak-1.0.0_10.RULE, sc=15.38, cov=15.38, len=2, hilen=1, rlen=13, qreg=(90, 91), ireg=(7, 8)
LicenseMatch: 'cc-by-1.0', lines=(0, 0), matcher='3-seq', rid=cc-by-1.0.RULE, sc=87.5, cov=87.5, len=7, hilen=3, rlen=8, qreg=(88, 96), ireg=(0, 7)
LicenseMatch: 'cc-by-1.0', lines=(0, 0), matcher='3-seq', rid=cc-by-1.0_1.RULE, sc=71.43, cov=71.43, len=5, hilen=2, rlen=7, qreg=(88, 92), ireg=(0, 4)
LicenseMatch: 'cc-by-1.0', lines=(0, 0), matcher='3-seq', rid=cc-by-1.0_20.RULE, sc=7.69, cov=7.69, len=1, hilen=1, rlen=13, qreg=(87, 87), ireg=(0, 0)
LicenseMatch: 'cc-by-1.0', lines=(0, 0), matcher='3-seq', rid=cc-by-1.0_20.RULE, sc=15.38, cov=15.38, len=2, hilen=1, rlen=13, qreg=(90, 91), ireg=(7, 8)
LicenseMatch: 'cc-by-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-2.0.RULE, sc=71.43, cov=71.43, len=5, hilen=2, rlen=7, qreg=(88, 92), ireg=(0, 4)
LicenseMatch: 'cc-by-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-2.0_2.RULE, sc=50.0, cov=50.0, len=6, hilen=3, rlen=12, qreg=(87, 92), ireg=(0, 9)
LicenseMatch: 'cc-by-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-2.0_26.RULE, sc=7.69, cov=7.69, len=1, hilen=1, rlen=13, qreg=(87, 87), ireg=(0, 0)
LicenseMatch: 'cc-by-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-2.0_26.RULE, sc=15.38, cov=15.38, len=2, hilen=1, rlen=13, qreg=(90, 91), ireg=(7, 8)
LicenseMatch: 'cc-by-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-2.0_29.RULE, sc=62.5, cov=62.5, len=5, hilen=2, rlen=8, qreg=(88, 92), ireg=(0, 4)
LicenseMatch: 'cc-by-2.5', lines=(0, 0), matcher='3-seq', rid=cc-by-2.5_10.RULE, sc=60.0, cov=60.0, len=6, hilen=3, rlen=10, qreg=(88, 96), ireg=(2, 9)
LicenseMatch: 'cc-by-3.0-us', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0-us_8.RULE, sc=35.71, cov=35.71, len=5, hilen=3, rlen=14, qreg=(87, 92), ireg=(5, 10)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0.RULE, sc=71.43, cov=71.43, len=5, hilen=2, rlen=7, qreg=(88, 92), ireg=(0, 4)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_1.RULE, sc=88.89, cov=100.0, len=8, hilen=3, rlen=8, qreg=(88, 96), ireg=(0, 7)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_121.RULE, sc=7.69, cov=7.69, len=1, hilen=1, rlen=13, qreg=(87, 87), ireg=(0, 0)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_121.RULE, sc=15.38, cov=15.38, len=2, hilen=1, rlen=13, qreg=(90, 91), ireg=(7, 8)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_126.RULE, sc=33.33, cov=33.33, len=4, hilen=2, rlen=12, qreg=(89, 92), ireg=(6, 9)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_128.RULE, sc=40.0, cov=40.0, len=4, hilen=2, rlen=10, qreg=(89, 92), ireg=(4, 7)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_19.RULE, sc=30.0, cov=30.0, len=6, hilen=3, rlen=20, qreg=(87, 92), ireg=(9, 17)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_23.RULE, sc=54.55, cov=54.55, len=6, hilen=3, rlen=11, qreg=(87, 92), ireg=(1, 8)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_25.RULE, sc=38.46, cov=38.46, len=5, hilen=2, rlen=13, qreg=(88, 92), ireg=(2, 6)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_43.RULE, sc=33.33, cov=33.33, len=5, hilen=3, rlen=15, qreg=(87, 92), ireg=(7, 12)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_48.RULE, sc=40.0, cov=40.0, len=6, hilen=3, rlen=15, qreg=(87, 92), ireg=(7, 12)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_59.RULE, sc=50.0, cov=50.0, len=6, hilen=3, rlen=12, qreg=(87, 92), ireg=(0, 5)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_7.RULE, sc=5.88, cov=5.88, len=1, hilen=1, rlen=17, qreg=(87, 87), ireg=(0, 0)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_7.RULE, sc=47.06, cov=47.06, len=8, hilen=2, rlen=17, qreg=(88, 95), ireg=(9, 16)
LicenseMatch: 'cc-by-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-3.0_98.RULE, sc=20.0, cov=20.0, len=5, hilen=3, rlen=25, qreg=(87, 92), ireg=(9, 19)
LicenseMatch: 'cc-by-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-4.0_14.RULE, sc=4.76, cov=4.76, len=1, hilen=1, rlen=21, qreg=(87, 87), ireg=(0, 0)
LicenseMatch: 'cc-by-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-4.0_14.RULE, sc=23.81, cov=23.81, len=5, hilen=2, rlen=21, qreg=(88, 92), ireg=(11, 15)
LicenseMatch: 'cc-by-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-4.0_25.RULE, sc=6.67, cov=6.67, len=1, hilen=1, rlen=15, qreg=(87, 87), ireg=(0, 0)
LicenseMatch: 'cc-by-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-4.0_25.RULE, sc=33.33, cov=33.33, len=5, hilen=2, rlen=15, qreg=(88, 92), ireg=(8, 12)
LicenseMatch: 'cc-by-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-4.0_26.RULE, sc=6.67, cov=6.67, len=1, hilen=1, rlen=15, qreg=(87, 87), ireg=(0, 0)
LicenseMatch: 'cc-by-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-4.0_26.RULE, sc=26.67, cov=26.67, len=4, hilen=2, rlen=15, qreg=(89, 92), ireg=(9, 12)
LicenseMatch: 'cc-by-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-4.0_40.RULE, sc=41.67, cov=41.67, len=5, hilen=3, rlen=12, qreg=(87, 92), ireg=(4, 9)
LicenseMatch: 'cc-by-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-4.0_70.RULE, sc=7.69, cov=7.69, len=1, hilen=1, rlen=13, qreg=(87, 87), ireg=(11, 11)
LicenseMatch: 'cc-by-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-4.0_70.RULE, sc=46.15, cov=46.15, len=6, hilen=3, rlen=13, qreg=(89, 96), ireg=(2, 8)
LicenseMatch: 'cc-by-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-4.0_93.RULE, sc=25.0, cov=25.0, len=5, hilen=3, rlen=20, qreg=(87, 92), ireg=(12, 17)
LicenseMatch: 'cc-by-nc-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-nc-3.0_3.RULE, sc=50.0, cov=50.0, len=6, hilen=3, rlen=12, qreg=(87, 92), ireg=(1, 8)
LicenseMatch: 'cc-by-nc-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-nc-3.0_4.RULE, sc=33.33, cov=33.33, len=5, hilen=2, rlen=15, qreg=(88, 92), ireg=(6, 10)
LicenseMatch: 'cc-by-nc-sa-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-nc-sa-2.0_3.RULE, sc=5.0, cov=5.0, len=1, hilen=1, rlen=20, qreg=(87, 87), ireg=(19, 19)
LicenseMatch: 'cc-by-nc-sa-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-nc-sa-2.0_3.RULE, sc=20.0, cov=20.0, len=4, hilen=2, rlen=20, qreg=(89, 92), ireg=(11, 14)
LicenseMatch: 'cc-by-nc-sa-3.0-igo', lines=(0, 0), matcher='3-seq', rid=cc-by-nc-sa-3.0-igo_9.RULE, sc=10.0, cov=10.0, len=1, hilen=1, rlen=10, qreg=(91, 91), ireg=(0, 0)
LicenseMatch: 'cc-by-nc-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-nc-sa-3.0_23.RULE, sc=54.55, cov=54.55, len=12, hilen=4, rlen=22, qreg=(85, 96), ireg=(3, 21)
LicenseMatch: 'cc-by-nc-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-nc-sa-3.0_3.RULE, sc=29.41, cov=29.41, len=5, hilen=2, rlen=17, qreg=(88, 92), ireg=(6, 10)
LicenseMatch: 'cc-by-nc-sa-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-nc-sa-4.0_4.RULE, sc=28.57, cov=28.57, len=6, hilen=3, rlen=21, qreg=(87, 92), ireg=(7, 16)
LicenseMatch: 'cc-by-sa-1.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-1.0.RULE, sc=75.0, cov=75.0, len=6, hilen=2, rlen=8, qreg=(88, 93), ireg=(0, 5)
LicenseMatch: 'cc-by-sa-1.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-1.0_1.RULE, sc=88.89, cov=88.89, len=8, hilen=3, rlen=9, qreg=(88, 96), ireg=(0, 8)
LicenseMatch: 'cc-by-sa-1.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-1.0_11.RULE, sc=51.43, cov=57.14, len=4, hilen=2, rlen=7, qreg=(88, 91), ireg=(0, 3)
LicenseMatch: 'cc-by-sa-1.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-1.0_27.RULE, sc=6.67, cov=6.67, len=1, hilen=1, rlen=15, qreg=(87, 87), ireg=(0, 0)
LicenseMatch: 'cc-by-sa-1.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-1.0_27.RULE, sc=13.33, cov=13.33, len=2, hilen=1, rlen=15, qreg=(90, 91), ireg=(8, 9)
LicenseMatch: 'cc-by-sa-1.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-1.0_28.RULE, sc=14.29, cov=14.29, len=2, hilen=1, rlen=14, qreg=(90, 91), ireg=(7, 8)
LicenseMatch: 'cc-by-sa-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-2.0.RULE, sc=75.0, cov=75.0, len=6, hilen=2, rlen=8, qreg=(88, 93), ireg=(0, 5)
LicenseMatch: 'cc-by-sa-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-2.0_10.RULE, sc=41.67, cov=41.67, len=5, hilen=2, rlen=12, qreg=(89, 93), ireg=(5, 9)
LicenseMatch: 'cc-by-sa-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-2.0_11.RULE, sc=34.94, cov=35.29, len=6, hilen=2, rlen=17, qreg=(88, 93), ireg=(6, 11)
LicenseMatch: 'cc-by-sa-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-2.0_2.RULE, sc=45.45, cov=45.45, len=5, hilen=2, rlen=11, qreg=(89, 93), ireg=(4, 8)
LicenseMatch: 'cc-by-sa-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-2.0_25.RULE, sc=70.0, cov=77.78, len=7, hilen=3, rlen=9, qreg=(88, 96), ireg=(0, 8)
LicenseMatch: 'cc-by-sa-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-2.0_26.RULE, sc=51.43, cov=57.14, len=4, hilen=2, rlen=7, qreg=(88, 91), ireg=(0, 3)
LicenseMatch: 'cc-by-sa-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-2.0_4.RULE, sc=38.89, cov=38.89, len=7, hilen=3, rlen=18, qreg=(87, 93), ireg=(0, 14)
LicenseMatch: 'cc-by-sa-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-2.0_47.RULE, sc=6.67, cov=6.67, len=1, hilen=1, rlen=15, qreg=(87, 87), ireg=(0, 0)
LicenseMatch: 'cc-by-sa-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-2.0_47.RULE, sc=13.33, cov=13.33, len=2, hilen=1, rlen=15, qreg=(90, 91), ireg=(8, 9)
LicenseMatch: 'cc-by-sa-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-2.0_48.RULE, sc=14.29, cov=14.29, len=2, hilen=1, rlen=14, qreg=(90, 91), ireg=(7, 8)
LicenseMatch: 'cc-by-sa-2.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-2.0_7.RULE, sc=50.0, cov=50.0, len=6, hilen=2, rlen=12, qreg=(88, 93), ireg=(4, 9)
LicenseMatch: 'cc-by-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-3.0_1.RULE, sc=100.0, cov=100.0, len=8, hilen=2, rlen=8, qreg=(88, 95), ireg=(0, 7)
LicenseMatch: 'cc-by-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-3.0_101.RULE, sc=14.29, cov=14.29, len=2, hilen=1, rlen=14, qreg=(90, 91), ireg=(7, 8)
LicenseMatch: 'cc-by-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-3.0_102.RULE, sc=6.67, cov=6.67, len=1, hilen=1, rlen=15, qreg=(87, 87), ireg=(0, 0)
LicenseMatch: 'cc-by-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-3.0_102.RULE, sc=13.33, cov=13.33, len=2, hilen=1, rlen=15, qreg=(90, 91), ireg=(8, 9)
LicenseMatch: 'cc-by-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-3.0_11.RULE, sc=40.91, cov=40.91, len=9, hilen=3, rlen=22, qreg=(87, 95), ireg=(10, 21)
LicenseMatch: 'cc-by-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-3.0_18.RULE, sc=56.25, cov=56.25, len=9, hilen=3, rlen=16, qreg=(87, 95), ireg=(0, 15)
LicenseMatch: 'cc-by-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-3.0_37.RULE, sc=38.1, cov=38.1, len=8, hilen=3, rlen=21, qreg=(87, 95), ireg=(12, 20)
LicenseMatch: 'cc-by-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-3.0_46.RULE, sc=69.23, cov=69.23, len=9, hilen=3, rlen=13, qreg=(87, 95), ireg=(3, 11)
LicenseMatch: 'cc-by-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-3.0_7.RULE, sc=100.0, cov=100.0, len=9, hilen=3, rlen=9, qreg=(88, 96), ireg=(0, 8)
LicenseMatch: 'cc-by-sa-3.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-3.0_9.RULE, sc=45.0, cov=45.0, len=9, hilen=3, rlen=20, qreg=(87, 95), ireg=(9, 19)
LicenseMatch: 'cc-by-sa-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-4.0_104.RULE, sc=34.94, cov=35.29, len=6, hilen=3, rlen=17, qreg=(87, 93), ireg=(4, 14)
LicenseMatch: 'cc-by-sa-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-4.0_28.RULE, sc=57.14, cov=57.14, len=8, hilen=4, rlen=14, qreg=(87, 96), ireg=(2, 13)
LicenseMatch: 'cc-by-sa-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-4.0_48.RULE, sc=4.76, cov=4.76, len=1, hilen=1, rlen=21, qreg=(87, 87), ireg=(20, 20)
LicenseMatch: 'cc-by-sa-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-4.0_48.RULE, sc=28.57, cov=28.57, len=6, hilen=2, rlen=21, qreg=(88, 93), ireg=(12, 17)
LicenseMatch: 'cc-by-sa-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-4.0_50.RULE, sc=4.76, cov=4.76, len=1, hilen=1, rlen=21, qreg=(87, 87), ireg=(20, 20)
LicenseMatch: 'cc-by-sa-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-4.0_50.RULE, sc=23.81, cov=23.81, len=5, hilen=2, rlen=21, qreg=(89, 93), ireg=(13, 17)
LicenseMatch: 'cc-by-sa-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-4.0_58.RULE, sc=46.15, cov=46.15, len=6, hilen=2, rlen=13, qreg=(88, 93), ireg=(5, 10)
LicenseMatch: 'cc-by-sa-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-4.0_59.RULE, sc=38.46, cov=38.46, len=5, hilen=2, rlen=13, qreg=(89, 93), ireg=(6, 10)
LicenseMatch: 'cc-by-sa-4.0', lines=(0, 0), matcher='3-seq', rid=cc-by-sa-4.0_and_url.RULE, sc=50.0, cov=50.0, len=7, hilen=3, rlen=14, qreg=(89, 96), ireg=(6, 13)
LicenseMatch: 'cc-by-sa-2.5', lines=(0, 0), matcher='3-seq', rid=maven_pom_32.RULE, sc=50.0, cov=50.0, len=7, hilen=3, rlen=14, qreg=(87, 93), ireg=(0, 11)
LicenseMatch: 'odc-by-1.0', lines=(0, 0), matcher='3-seq', rid=odc-by-1.0_13.RULE, sc=7.69, cov=7.69, len=1, hilen=1, rlen=13, qreg=(87, 87), ireg=(0, 0)
LicenseMatch: 'odc-by-1.0', lines=(0, 0), matcher='3-seq', rid=odc-by-1.0_13.RULE, sc=15.38, cov=15.38, len=2, hilen=1, rlen=13, qreg=(90, 91), ireg=(7, 8)

matches before final merge: 2
matches before final merge MATCHED TEXTS
LicenseMatch: 'cc-by-nc-sa-3.0', lines=(43, 44), matcher='3-seq', rid=cc-by-nc-sa-3.0_23.RULE, sc=54.55, cov=54.55, len=12, hilen=4, rlen=22, qreg=(85, 96), ireg=(3, 21)
  MATCHED QUERY TEXT: Creative Commons license *
http://creativecommons.org/licenses/by-sa/3.0/legalcode
  MATCHED RULE TEXT: creative commons license <attribution> <noncommercial> <sharealike> <3> <0>
<unported> http creativecommons org licenses by <nc> sa 3 0 legalcode
LicenseMatch: 'gpl-2.0-plus', lines=(4, 4), matcher='1-spdx-id', rid=spdx-license-identifier-gpl_2_0_plus-6a7800b229dd3f061ec9f8deeaf5fc1cd1310d8d, sc=100.0, cov=100.0, len=6, hilen=3, rlen=6, qreg=(5, 10), ireg=(0, 5)
  MATCHED QUERY TEXT: SPDX-License-Identifier: GPL-2.0+
  MATCHED RULE TEXT: spdx license identifier gpl 2 0+
final matches: 2
final matches MATCHED TEXTS
LicenseMatch: 'gpl-2.0-plus', lines=(4, 4), matcher='1-spdx-id', rid=spdx-license-identifier-gpl_2_0_plus-6a7800b229dd3f061ec9f8deeaf5fc1cd1310d8d, sc=100.0, cov=100.0, len=6, hilen=3, rlen=6, qreg=(5, 10), ireg=(0, 5)
  MATCHED QUERY TEXT: SPDX-License-Identifier: GPL-2.0+
  MATCHED RULE TEXT: spdx license identifier gpl 2 0+
LicenseMatch: 'cc-by-nc-sa-3.0', lines=(43, 44), matcher='3-seq', rid=cc-by-nc-sa-3.0_23.RULE, sc=54.55, cov=54.55, len=12, hilen=4, rlen=22, qreg=(85, 96), ireg=(3, 21)
  MATCHED QUERY TEXT: Creative Commons license *
http://creativecommons.org/licenses/by-sa/3.0/legalcode
  MATCHED RULE TEXT: creative commons license <attribution> <noncommercial> <sharealike> <3> <0>
<unported> http creativecommons org licenses by <nc> sa 3 0 legalcode
Scanned: ./drivers/video/console_truetype.c
Scanned: ./drivers/video/console_truetype.c
Filter scans...
 Filter scan: licenses...
 Filter scan: copyrights...
Save scan results...
 Save scan results as: json-pp...
Scanning done.
Summary:        licenses, copyrights with 1 process(es)
Errors count:   0
Scan Speed:     0.70 files/sec. 
Initial counts: 1 resource(s): 1 file(s) and 0 directorie(s) 
Final counts:   1 resource(s): 1 file(s) and 0 directorie(s) 
Timings:
  scan_start: 2024-03-21T171012.685738
  scan_end:   2024-03-21T171017.270382
  setup_scan:licenses: 3.16s
  setup: 3.16s
  scan: 1.42s
  total: 4.59s
Removing temporary files...done.

Is it necessary to clear some cache?

pombredanne commented 5 months ago

Is it necessary to clear some cache?

Yes, you need to regenerate the index with scancode-reindex-licenses. It used to be automated, but manual control is much simpler, because cache invalidation is not a simple thing to get right all the time.