aboutcode-org / scancode-toolkit


`elastic-license-2018` is detected instead of `MIT` #2275

Open fviernau opened 4 years ago

fviernau commented 4 years ago

Description

ScanCode detects elastic-license-2018 in these lines: https://github.com/felixge/node-delayed-stream/blame/v0.0.6/Readme.md#L139-L141.

How To Reproduce

scancode Readme.md -l --json-pp out.txt

System configuration

pombredanne commented 4 years ago

@fviernau thank you for the report! For reference, the file is at https://raw.githubusercontent.com/felixge/node-delayed-stream/07a9dc99fb8f1a488160026b9ad77493f766fb84/Readme.md and a scan with scancode -l --license-text --license-text-diagnostics --json-pp - yields this:

{
  "headers": [
    {
      "tool_name": "scancode-toolkit",
      "tool_version": "3.2.1rc2.post51.f3997e7.dirty.20201009134331",
      "options": {
        "input": [
          "Readme.md"
        ],
        "--json-pp": "-",
        "--license": true,
        "--license-text": true,
        "--license-text-diagnostics": true
      },
      "notice": "Generated with ScanCode and provided on an \"AS IS\" BASIS, WITHOUT WARRANTIES\nOR CONDITIONS OF ANY KIND, either express or implied. No content created from\nScanCode should be considered or used as legal advice. Consult an Attorney\nfor any legal advice.\nScanCode is a free software code scanning tool from nexB Inc. and others.\nVisit https://github.com/nexB/scancode-toolkit/ for support and download.",
      "start_timestamp": "2020-10-13T171548.834606",
      "end_timestamp": "2020-10-13T171554.585346",
      "duration": 5.750752925872803,
      "message": null,
      "errors": [],
      "extra_data": {
        "files_count": 1
      }
    }
  ],
  "files": [
    {
      "path": "Readme.md",
      "type": "file",
      "licenses": [
        {
          "key": "elastic-license-2018",
          "score": 16.67,
          "name": "Elastic License Agreement 2018",
          "short_name": "Elastic License 2018",
          "category": "Source-available",
          "is_exception": false,
          "owner": "Elastic",
          "homepage_url": "https://github.com/elastic/elasticsearch/blob/0d8aa7527e242fbda9d84867ab8bc955758eebce/licenses/ELASTIC-LICENSE.txt",
          "text_url": "https://github.com/elastic/elasticsearch/blob/0d8aa7527e242fbda9d84867ab8bc955758eebce/licenses/ELASTIC-LICENSE.txt",
          "reference_url": "https://enterprise.dejacode.com/urn/urn:dje:license:elastic-license-2018",
          "spdx_license_key": null,
          "spdx_url": "",
          "start_line": 139,
          "end_line": 141,
          "matched_rule": {
            "identifier": "elastic-license-2018_1.RULE",
            "license_expression": "elastic-license-2018",
            "licenses": [
              "elastic-license-2018"
            ],
            "is_license_text": false,
            "is_license_notice": true,
            "is_license_reference": false,
            "is_license_tag": false,
            "matcher": "3-seq",
            "rule_length": 30,
            "matched_length": 5,
            "match_coverage": 16.67,
            "rule_relevance": 100.0
          },
          "matched_text": "License\n\n[delayed]-[stream] [is] licensed under the [MIT] license."
        },
        {
          "key": "mit",
          "score": 100.0,
          "name": "MIT License",
          "short_name": "MIT License",
          "category": "Permissive",
          "is_exception": false,
          "owner": "MIT",
          "homepage_url": "http://opensource.org/licenses/mit-license.php",
          "text_url": "http://opensource.org/licenses/mit-license.php",
          "reference_url": "https://enterprise.dejacode.com/urn/urn:dje:license:mit",
          "spdx_license_key": "MIT",
          "spdx_url": "https://spdx.org/licenses/MIT",
          "start_line": 141,
          "end_line": 141,
          "matched_rule": {
            "identifier": "mit_680.RULE",
            "license_expression": "mit",
            "licenses": [
              "mit"
            ],
            "is_license_text": false,
            "is_license_notice": true,
            "is_license_reference": false,
            "is_license_tag": false,
            "matcher": "2-aho",
            "rule_length": 6,
            "matched_length": 6,
            "match_coverage": 100.0,
            "rule_relevance": 100.0
          },
          "matched_text": "is licensed under the MIT license."
        }
      ],
      "license_expressions": [
        "elastic-license-2018",
        "mit"
      ],
      "percentage_of_license_text": 1.29,
      "scan_errors": []
    }
  ]
}
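To compare the two matches at a glance, something like the following could be used. It is only a rough sketch and not part of ScanCode: `summarize_matches` is a hypothetical helper, and `SCAN_JSON` is a trimmed copy of the output above that keeps only the fields the helper reads.

```python
import json

# Trimmed copy of the scan output above; only the fields used below are kept.
SCAN_JSON = """
{
  "files": [
    {
      "path": "Readme.md",
      "licenses": [
        {
          "key": "elastic-license-2018",
          "score": 16.67,
          "start_line": 139,
          "end_line": 141,
          "matched_rule": {
            "identifier": "elastic-license-2018_1.RULE",
            "matcher": "3-seq",
            "match_coverage": 16.67
          }
        },
        {
          "key": "mit",
          "score": 100.0,
          "start_line": 141,
          "end_line": 141,
          "matched_rule": {
            "identifier": "mit_680.RULE",
            "matcher": "2-aho",
            "match_coverage": 100.0
          }
        }
      ]
    }
  ]
}
"""


def summarize_matches(scan):
    """Hypothetical helper: one (path, key, rule, matcher, coverage, lines) tuple per match."""
    rows = []
    for scanned_file in scan.get("files", []):
        for match in scanned_file.get("licenses", []):
            rule = match["matched_rule"]
            rows.append((
                scanned_file["path"],
                match["key"],
                rule["identifier"],
                rule["matcher"],
                rule["match_coverage"],
                (match["start_line"], match["end_line"]),
            ))
    return rows


rows = summarize_matches(json.loads(SCAN_JSON))
for row in rows:
    print(row)
```

The `3-seq` match at 16.67% coverage overlaps line 141, where the exact `2-aho` MIT match sits at 100% coverage.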

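For this, I would set minimum_coverage: 50 or 60 in the elastic-license-2018_1.yml data file. The elastic-license-2018 match covers only 16.67% of the rule (matched_length 5 out of rule_length 30), so either threshold would discard it.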
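The arithmetic behind that suggestion can be checked with a small sketch. It assumes, based on the numbers in the scan output, that match_coverage is matched_length divided by rule_length as a percentage. `passes_minimum_coverage` is a hypothetical stand-in for the threshold check, not ScanCode's actual code.

```python
def match_coverage(matched_length, rule_length):
    """Percentage of the rule that was matched, rounded to 2 decimals."""
    return round(matched_length / rule_length * 100, 2)


def passes_minimum_coverage(coverage, minimum_coverage):
    """Hypothetical filter: keep a match only if its coverage reaches the threshold."""
    return coverage >= minimum_coverage


# Values taken from the two matched_rule entries in the scan output.
elastic = match_coverage(matched_length=5, rule_length=30)
mit = match_coverage(matched_length=6, rule_length=6)

for threshold in (50, 60):
    print(threshold, passes_minimum_coverage(elastic, threshold),
          passes_minimum_coverage(mit, threshold))
```

With either threshold the elastic-license-2018 match would be dropped while the mit match is kept.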

I would also add a new rule with is_license_notice: yes and relevance: 100 for this text: License ... is licensed under the MIT license.

@AyanSinhaMahapatra another case where your new tool may help?