fsfe / reuse-tool

reuse is a tool for compliance with the REUSE recommendations.
https://reuse.software

Add licenses from scancode license list #526

Open Blackclaws opened 2 years ago

Blackclaws commented 2 years ago

There are a lot of additional licenses out there that aren't part of the standard SPDX license list. Third parties such as https://scancode-licensedb.aboutcode.org/index.html (or on GitHub: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses) provide a host of additional licenses that are used in various (usually not quite so open) projects.

Might it make sense to extend reuse download so that it can also download licenses from that list?

mxmehl commented 2 years ago

Pulling in @pombredanne. Philippe, would that be something that you would also like to see?

It reminds me a bit of the license namespace discussion in SPDX, which may be a motivation for REUSE to search for license text files in other, well-defined places. On the other hand, it would complicate the REUSE tool internals, as the data formats are different and we'd have to deal with licenses that are present in both SPDX and ScanCode.
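One way to picture the overlap problem is a lookup that always prefers SPDX and only falls back to ScanCode. This is a minimal sketch, not how the REUSE tool works: the function name, the two sample collections, and the ScanCode key `example-vendor-eula` are all made up for illustration.

```python
from typing import Optional

# Hypothetical sample data; a real tool would load both lists from their sources.
SPDX_IDS = {"MIT", "GPL-3.0-or-later", "CC-BY-NC-4.0"}

# Hypothetical mapping of ScanCode key -> the SPDX-style id ScanCode publishes for it.
# "example-vendor-eula" is an invented key used only for this sketch.
SCANCODE_SPDX_IDS = {
    "mit": "MIT",  # present in both lists
    "example-vendor-eula": "LicenseRef-scancode-example-vendor-eula",
}


def text_source(identifier: str) -> Optional[str]:
    """Decide which list should supply the license text, preferring SPDX."""
    if identifier in SPDX_IDS:
        return "spdx"
    if identifier in SCANCODE_SPDX_IDS.values():
        return "scancode"
    return None
```

With this precedence a license present in both lists is only ever fetched from SPDX, so the ScanCode data is consulted just for identifiers SPDX does not know.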

pombredanne commented 2 years ago

@mxmehl @Blackclaws actually the license-expression library that you use in REUSE has bundled in not only all the license ids from SPDX but also those from ScanCode ... See https://github.com/nexB/license-expression/tree/main/src/license_expression/data

It also contains other well-known LicenseRefs such as https://scancode-licensedb.aboutcode.org/kde-accepted-lgpl.html or https://scancode-licensedb.aboutcode.org/kde-accepted-gpl.html that are commonly seen in KDE (and likely also in Qt now, based on https://www.qt.io/blog/switching-to-spdx :+1: :heart: ;) )

This data was added to license-expression by @JonoYang

With this you can effectively fetch full texts from https://scancode-licensedb.aboutcode.org/ or just validate expressions and keys without fetching anything.
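As a rough illustration of the two modes (checking a key locally versus locating its text), here is a minimal standard-library sketch. It does not use the license-expression API. The `.html` URL pattern is the one visible in the links above; the helper names, the `suffix` parameter, and the small `KNOWN_KEYS` snapshot are assumptions made for this example.

```python
from urllib.parse import quote

# Base URL as given in this thread.
LICENSEDB_BASE = "https://scancode-licensedb.aboutcode.org/"

# Hypothetical local snapshot of known keys, so validation needs no network access.
KNOWN_KEYS = frozenset({"kde-accepted-gpl", "kde-accepted-lgpl"})


def is_known_key(key: str, known_keys=KNOWN_KEYS) -> bool:
    """Validate a key against the local snapshot without fetching anything."""
    return key in known_keys


def licensedb_url(key: str, suffix: str = ".html") -> str:
    """Build the LicenseDB page URL for a key, following the pattern seen above."""
    if not key or "/" in key:
        raise ValueError(f"not a plausible license key: {key!r}")
    return f"{LICENSEDB_BASE}{quote(key)}{suffix}"
```

Only `licensedb_url` produces something that would later require a network request; `is_known_key` works entirely offline against whatever key list is bundled.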

pombredanne commented 2 years ago

Related ... https://scancode-licensedb.aboutcode.org/ is guaranteed to be a stable URL, but we will eventually migrate to https://licensedb.org, which was kindly donated by @warpr :heart_eyes: and @Blackclaws thank you for the kind words ... this means a lot! :bow:

floriansnow commented 2 years ago

To me this is also a question of whether REUSE is meant to be a tool to mainly get Free Software licensing right or a tool to get any kind of licensing right. The name and the main author imply a focus on Free Software, so I'm not sure if we need support for non-free license expressions. Or did I misunderstand the suggestion?

Blackclaws commented 2 years ago

I think reuse is a tool meant to get any type of software licensing right. Restricting it to Free Software is an unnecessary limitation. A lot of the time there is also simply the need to combine proprietary with free licenses and to properly keep track of the license obligations for code that is used in multiple projects.

The end goal of internal projects can also be an eventual release under a Free License. For that to be possible, however, you also need to properly track which proprietary sources the code might have come from.

mxmehl commented 2 years ago

The ScanCode license corpus is much larger than SPDX's, and probably contains Free Software licenses that SPDX's does not. Therefore, it could indeed make sense to support these identifiers, as otherwise you'd have to give a license that exists only in ScanCode's list a LicenseRef-xyz identifier instead of its "native" identifier.

That said, SPDX also contains proprietary licenses, e.g. CC-BY-NC.

seabass-labrax commented 2 years ago

We're having a special joint SPDX Legal/Tech team meeting tomorrow about the license namespace proposal - it's a high priority to reach a conclusion this time so please do join :grinning:

Blackclaws commented 1 year ago

Pinging everyone: Has any conclusion been reached on this topic?

pombredanne commented 1 year ago

@Blackclaws on my side I have no objections whatsoever, quite the opposite, and I would like to help.

Note a few things:

  1. The discussion on a license namespace at SPDX has stalled. We are the primary users for a license namespace anyway, but there are always folks, typically less involved or concerned, who come up with various objections. As far as we are concerned, we provide strong guarantees of stability and a stable curation process for ScanCode licenses anyway, as we have for a very long time. The ScanCode licenses that come with an SPDX id carrying the LicenseRef-scancode-xxx prefix are the ones that live in our namespace so far. There are roughly 1400 licenses that do not exist in the SPDX list, all of them seen in the wild and of practical value.

  2. There is an upcoming PR by @AyanSinhaMahapatra in ScanCode that makes it easy to get the structured data for all the licenses the same way it is published in https://scancode-licensedb.aboutcode.org/ ... See https://github.com/nexB/scancode-toolkit/pull/3100
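To make the LicenseRef-scancode-xxx convention from point 1 concrete, a tool could tell ScanCode-namespaced identifiers apart from other ones with a simple prefix check. This is only a sketch: the function name is hypothetical, and it assumes that the part after the prefix is the ScanCode key, which this thread does not spell out.

```python
from typing import Optional

# Prefix described in point 1 above for licenses living in the ScanCode namespace.
SCANCODE_PREFIX = "LicenseRef-scancode-"


def scancode_key_from_licenseref(identifier: str) -> Optional[str]:
    """Return the part after the ScanCode prefix, or None for other identifiers.

    Assumption: the remainder after the prefix is the ScanCode license key.
    """
    if identifier.startswith(SCANCODE_PREFIX) and len(identifier) > len(SCANCODE_PREFIX):
        return identifier[len(SCANCODE_PREFIX):]
    return None
```

An identifier such as `MIT` or a project-local `LicenseRef-xyz` yields `None`, so those would continue to be handled exactly as they are today.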