jpeddicord / askalono

A tool & library to detect open source licenses from texts
Apache License 2.0

'GPL-3.0-only' has 'GPL-3.0-or-later' as an alias #45

Closed: phrohdoh closed 4 years ago

phrohdoh commented 5 years ago

I ran askalono identify --optimize path/to/file.rs and got the following.

License: Unknown
Score: 0.221
Containing:
  License: GPL-3.0-only (license header)
  Score: 0.946
  Lines: 4 - 15
  Aliases: GPL-3.0-or-later

GPL-3.0-only is different from GPL-3.0-or-later, so why are they listed as aliases of each other?

Is this something askalono controls/determines or are aliases driven by SPDX data?

jpeddicord commented 5 years ago

These aliases are indeed driven by SPDX data.

Both of these licenses have the same text, so askalono dedupes them into one entry. Previously it would return somewhat confusing results (you might sometimes get one or the other), so I chose to merge them and return both.
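To illustrate the idea of merging same-text licenses, here is a minimal Python sketch. It is not askalono's actual implementation (askalono is written in Rust and its internals are not shown in this thread). The function names `normalize` and `dedupe_by_text`, the normalization rule, and the placeholder license texts are all assumptions made for the example.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivially different copies compare equal.

    Hypothetical normalization, chosen only for this sketch.
    """
    return " ".join(text.lower().split())


def dedupe_by_text(licenses):
    """Group license names that share the same normalized text.

    Returns a dict mapping one canonical name (the alphabetically first)
    to the list of other names with identical text, i.e. its aliases.
    """
    groups = defaultdict(list)
    for name, text in licenses.items():
        groups[normalize(text)].append(name)

    merged = {}
    for names in groups.values():
        names.sort()
        merged[names[0]] = names[1:]
    return merged


# Placeholder texts, not the real license bodies.
licenses = {
    "GPL-3.0-only": "GNU GENERAL PUBLIC LICENSE\nVersion 3, 29 June 2007 ...",
    "GPL-3.0-or-later": "GNU GENERAL PUBLIC LICENSE\nVersion 3, 29 June 2007 ...",
    "MIT": "Permission is hereby granted, free of charge ...",
}

merged = dedupe_by_text(licenses)
for name, aliases in merged.items():
    print(name, "aliases:", aliases)
```

Because the two GPL entries carry identical text in this sketch, a text comparison alone cannot tell them apart, which is why they end up reported together as one name plus an alias.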

With headers in the mix, things can get kind of complex. The stored header data in the SPDX repository sometimes contains SPDX templates and sometimes does not. Either way, askalono doesn't (yet?) know how to parse those templates. Open to suggestions.

jpeddicord commented 4 years ago

Closing this issue since it is specific to how SPDX data is represented, but I'm totally open to solutions (clever or not) that would help make this clearer.