aboutcode-org / scancode-toolkit

:mag: ScanCode detects licenses, copyrights, dependencies by "scanning code" ... to discover and inventory open source and third-party packages used in your code. Sponsored by NLnet project https://nlnet.nl/project/vulnerabilitydatabase, the Google Summer of Code, Azure credits, nexB and others generous sponsors!

Improve SPDX license identifiers handling #2007

Closed: pombredanne closed this 2 years ago

pombredanne commented 4 years ago

There are some non-standard uses of "Licence" in SPDX license identifier lines (in British English, "licence" is the noun and "license" is the verb). For example: https://github.com/BramvanPelt/PPDecryption/blob/6de599af84cb4dc706df6e448c552bd098ea47d7/PP-Decrypt/src/nl/logius/resource/pp/key/PseudonymClosingKey.java#L3
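One way to tolerate both spellings is a slightly looser pattern for the identifier tag. This is only a minimal sketch: the pattern and the `find_spdx_expression` helper are hypothetical names for illustration, not ScanCode's actual code, and the sample strings are made up.

```python
import re

# Hypothetical pattern (an assumption, not ScanCode's own): accept "License"
# or "Licence", and a dash or whitespace between the words, case-insensitively.
SPDX_ID_LINE = re.compile(
    r"SPDX[-\s]+Licen[cs]e[-\s]+Identifier\s*:\s*(?P<expression>.+)",
    re.IGNORECASE,
)


def find_spdx_expression(line):
    """Return the text that follows the identifier tag, or None if no tag is found."""
    m = SPDX_ID_LINE.search(line)
    return m.group("expression").strip() if m else None


# Illustrative inputs only.
print(find_spdx_expression("// SPDX-Licence-Identifier: EUPL-1.2"))
print(find_spdx_expression("SPDX license identifier: BSD-3-Clause"))
```

Note that the captured text runs to the end of the line, so a real implementation would still need to trim trailing comment markers and parse the expression.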

There are also some corner cases. When a rule has a text such as "SPDX license identifier: BSD-3-Clause" and the scanned text contains "SPDX license identifier: BSD-3-Clause-No-Nuclear-Warranty", we end up with two matches instead of one:

  1. an exact match to the regular rule SPDX license identifier: BSD-3-Clause
  2. an SPDX license id match to BSD-3-Clause-No-Nuclear-Warranty

We should get only one match: the second one, the SPDX license id match.
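The expected behavior can be sketched as a containment filter: a match whose span lies entirely inside another kept match is dropped. This is an illustrative sketch only. The `Match` class, its fields, and `drop_contained_matches` are hypothetical stand-ins and not ScanCode's actual data structures or filtering logic.

```python
from dataclasses import dataclass


# Hypothetical, simplified stand-in for a license match (an assumption for
# illustration; ScanCode's real match objects are different).
@dataclass(frozen=True)
class Match:
    start: int        # start offset in the scanned text
    end: int          # end offset, exclusive
    license_id: str   # detected license identifier
    matcher: str      # which strategy produced the match


def drop_contained_matches(matches):
    """Drop any match whose span is contained in an already kept match."""
    kept = []
    # Visit longest spans first so a container is seen before what it contains.
    for m in sorted(matches, key=lambda m: m.end - m.start, reverse=True):
        if not any(k.start <= m.start and m.end <= k.end for k in kept):
            kept.append(m)
    return sorted(kept, key=lambda m: m.start)


scanned = "SPDX license identifier: BSD-3-Clause-No-Nuclear-Warranty"
rule_text = "SPDX license identifier: BSD-3-Clause"

# The two matches described in the issue.
exact = Match(0, len(rule_text), "BSD-3-Clause", "exact-rule")
spdx = Match(0, len(scanned), "BSD-3-Clause-No-Nuclear-Warranty", "spdx-id")

result = drop_contained_matches([exact, spdx])
print([m.license_id for m in result])
```

With the two matches from the example, only the longer SPDX license id match survives, because the exact rule match covers a prefix of the same span.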

pombredanne commented 2 years ago

This has been completed.