doubleopen-project / policy-configuration

Double Open license classification for OSS Review Toolkit (ORT) and other uses.
Creative Commons Zero v1.0 Universal

Some license IDs seem wrong. How do they come into existence? #37

Open hoijui opened 1 year ago

hoijui commented 1 year ago

For example, I saw this (not a valid SPDX expression):
GPL-3.0-or-later-with-Autoconf-macro-exception
It should probably be this instead (a valid SPDX expression):
GPL-3.0-or-later WITH Autoconf-exception-macro

Is it because some project uses it incorrectly, so that you have to index it in the same wrong form? Would it be possible to have an auto-conversion list instead (I imagine a CSV with two columns: before and after) that could be applied to IDs when scanning, so that multiple entries (-> duplicated data) do not have to be maintained in the license list for such cases? Such a CSV would also be useful because it would highlight common errors, and it could even form the data ... base used to auto-create pull requests that fix people's SPDX license expressions.
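A minimal sketch of how such a two-column conversion list might be applied to IDs at scan time. The CSV contents, the column names, and the function names (load_conversions, normalize_license_id) are assumptions made for illustration; they are not part of this repository or of ORT.

```python
import csv
import io

# Hypothetical conversion list with two columns, "before" and "after".
# In practice this would live in its own CSV file; it is inlined here
# so the sketch is self-contained.
CONVERSIONS_CSV = """before,after
GPL-3.0-or-later-with-Autoconf-macro-exception,GPL-3.0-or-later WITH Autoconf-exception-macro
"""


def load_conversions(csv_text):
    """Parse the CSV text into a dict mapping wrong IDs to corrected expressions."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["before"].strip(): row["after"].strip() for row in reader}


def normalize_license_id(license_id, conversions):
    """Return the corrected expression if one is listed, else the ID unchanged."""
    return conversions.get(license_id, license_id)


conversions = load_conversions(CONVERSIONS_CSV)
fixed = normalize_license_id(
    "GPL-3.0-or-later-with-Autoconf-macro-exception", conversions
)
print(fixed)
```

With a lookup like this, the classification would only need one entry per valid SPDX expression, and the CSV itself would double as a record of commonly seen wrong IDs.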

willebra commented 1 year ago

I believe this is due to Fossology producing license hits with these names. Maintaining duplicate licenses in this classification has been a quick fix to still handle them in some cases where Fossology is needed as a scanner in an automated pipeline. We are reducing our use of Fossology and expect to remove these entries once Fossology is no longer needed in that role.

In principle, the CSV list would be a more robust way of managing this, but since we plan to get rid of these entries entirely, we are focusing on other things.