aboutcode-org / scancode-toolkit

ScanCode detects licenses, copyrights, dependencies by "scanning code" ... to discover and inventory open source and third-party packages used in your code. Sponsored by NLnet project https://nlnet.nl/project/vulnerabilitydatabase, the Google Summer of Code, Azure credits, nexB and other generous sponsors!
https://aboutcode.org/scancode/

We should assign a category to a license-expression to clarify `license WITH exception` cases #2897

Open pombredanne opened 2 years ago

pombredanne commented 2 years ago

For instance, `GPL-2.0 WITH Classpath-exception-2.0` should be treated as Copyleft-limited and not as Copyleft.

This was reported by @DennisClark, and it could be that any exception applied to a Copyleft license turns the license expression into a Copyleft-limited one.

DennisClark commented 2 years ago

this needs more detail to explain the issue and what ought to be done about it.

DennisClark commented 2 years ago

basically we are talking about applying a category to a license_expression, so that the category of the Exception influences the category of the expression, thus simplifying and improving the review process.

DennisClark commented 1 year ago

consider as a general principle that the category of the exception overrides the category of the license to which it is applied

DennisClark commented 1 year ago

however, that assumes that the categories of the exceptions are correct; they should be reviewed. we also need to consider that a License Expression has its own Category; in other words, what is the prevailing category for an expression? for example, mit or gpl is permissive.

let's call it expression-category: the license category that is actually effective when a software object has a complex license expression.
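A minimal sketch of the override principle, to make the idea concrete. The category table and the `expression_category` helper are hypothetical names made up for this illustration; they are not part of ScanCode, and the sketch only handles a single `license WITH exception` pair.

```python
# Illustrative category table. The keys and values are assumptions for this
# sketch, not data read from ScanCode.
CATEGORIES = {
    "GPL-2.0": "Copyleft",
    "Classpath-exception-2.0": "Copyleft-limited",
}


def expression_category(expression):
    """Return the category of a simple `license WITH exception` expression.

    Hypothetical helper: when the exception has a known category, it
    overrides the category of the license it is applied to.
    """
    license_key, sep, exception_key = expression.partition(" WITH ")
    exception_key = exception_key.strip()
    if sep and exception_key in CATEGORIES:
        return CATEGORIES[exception_key]
    # No exception, or an exception without a known category:
    # fall back to the category of the license itself.
    return CATEGORIES.get(license_key.strip())


print(expression_category("GPL-2.0 WITH Classpath-exception-2.0"))
```

Falling back to the license category when the exception is unknown is a design assumption of the sketch; a real implementation might prefer to flag such cases for review.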

DennisClark commented 1 year ago

perhaps this new field belongs on the package model

DennisClark commented 1 year ago

better yet, let's call it license-expression-category

sschuberth commented 9 months ago

> For instance, `GPL-2.0 WITH Classpath-exception-2.0` should be treated as Copyleft-limited and not as Copyleft

Ping FYI, @willebra.

willebra commented 9 months ago

Some initial thoughts. The automatic treatment of license exceptions should be improved, and categorizations are needed for that.

As background, we apply a more granular license categorization, with categories such as permissive, copyleft-file-level, copyleft-module-level, copyleft-lgpl and copyleft-strong. See https://github.com/doubleopen-project/policy-configuration/blob/main/license-classifications.yml for the details. In addition, we have properties (one property being Autoconf-or-other-exception).

Exceptions typically have two moving parts that determine the effect of the exception. By "effect" I mean the part of the effect that we could treat automatically. These moving parts would then need to be recorded in a license classification in some manner. In our case we would aim to use categories and properties.

These moving parts are:

1) The criteria for when the exception applies. For example, Autoconf exception 2.0 applies to the "configure scripts that are the output of Autoconf", with some limitations to these criteria regarding modified versions. The criteria vary: Bison exception 2.2 applies to the Bison project and requires that the distributed software is not a parser generator, whereas Classpath exception 2.0 applies without criteria whenever it is attached to a license.

2) The effect of the exception. Autoconf exception 2.0 should likely be interpreted as converting the license into a permissive license with no additional requirements. Bison exception 2.2 also renders the license permissive. Classpath exception 2.0, on the other hand, would turn an otherwise copyleft-strong license into a copyleft-module-level license.

So we need a way to record these types of differences in the license categorization. Perhaps we could start with a simpler categorization and later develop it into a more granular one.

Currently we are only using a generic exception property and all of such hits are manually verified.

An improvement could be to record whether an exception has criteria that should be manually checked (Bison) or whether it always applies (Classpath). In our example this could be "property:exception-with-criteria" and "property:exception-without-criteria", and then in rules we would always give alerts on those with criteria. Looks a bit dirty, but my idea is communicated.

And then we could give exceptions a category that reflects the resulting category of the license, assuming that the exception criteria are fulfilled or that there are no exception criteria. So whenever you run into a license with an exception that has "property:exception-without-criteria", you can just apply the category of the exception.

On the other hand, when you run into an exception with criteria, you need to manually verify whether the criteria are met. The next step would be to collect the ~10 most used criteria, choose those you can reasonably automate, and then make properties out of those.
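The idea above could be sketched roughly as follows. This is only an illustration: the `ExceptionClassification` type, the `resolve` helper and the table are hypothetical, and the choice to keep the license's own category until a manual check has happened is an assumption of the sketch. The categories and the with/without-criteria split follow the Classpath, Bison and Autoconf examples given earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExceptionClassification:
    category: str       # category that applies once the criteria are met
    has_criteria: bool  # True: applicability must be verified manually


# Hypothetical classification table, keyed by SPDX exception identifier.
EXCEPTIONS = {
    "Classpath-exception-2.0": ExceptionClassification("copyleft-module-level", False),
    "Bison-exception-2.2": ExceptionClassification("permissive", True),
    "Autoconf-exception-2.0": ExceptionClassification("permissive", True),
}


def resolve(license_category, exception_key):
    """Return (effective_category, needs_manual_review)."""
    info = EXCEPTIONS.get(exception_key)
    if info is None:
        # Unclassified exception: keep the license category and raise an alert.
        return license_category, True
    if info.has_criteria:
        # Assumption: stay with the license category until the criteria
        # have been checked by a person.
        return license_category, True
    # No criteria: the category of the exception can be applied directly.
    return info.category, False


print(resolve("copyleft-strong", "Classpath-exception-2.0"))
print(resolve("copyleft-strong", "Bison-exception-2.2"))
```

Criteria that turn out to be automatable could later replace the single `has_criteria` flag with more specific properties.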

Happy to hear thoughts

DennisClark commented 9 months ago

I think it's important to clarify that the scope of this improvement is limited to "license WITH exception" cases, not more complex license expressions that combine multiple licenses with the "AND" operator. That is, the "(license WITH exception)", ideally surrounded by parentheses, can be thought of as its own unit (a molecule?), and we can apply a category to that unit. Since in the most common cases the category of the exception prevails over the category of the target license, we can make that the default behavior. Ultimately, though, this should be controlled by detection rules to handle odd cases where that is not what actually happens, for example "exceptions" that simply tell you what you are allowed to do but don't really modify the target license terms.
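As a rough sketch of that scoping: each "(license WITH exception)" unit gets its own category, the exception prevails by default, and an override set stands in for the detection rules that handle odd cases. Everything here is hypothetical (the helper names, the table, and the made-up `Example-permission-note` exception), and the parsing assumes a flat expression whose units are joined only by " AND ".

```python
# Illustrative categories; not data read from ScanCode.
CATEGORIES = {
    "GPL-2.0": "Copyleft",
    "MIT": "Permissive",
    "Classpath-exception-2.0": "Copyleft-limited",
    "Example-permission-note": "Permissive",  # made-up exception key
}

# Hypothetical stand-in for rule-level control: "exceptions" that do not
# really modify the target license, so the license category is kept.
KEEP_LICENSE_CATEGORY = {"Example-permission-note"}


def unit_category(unit):
    """Categorize one unit: a license, or a `license WITH exception` pair."""
    unit = unit.strip().strip("()").strip()
    license_key, sep, exception_key = unit.partition(" WITH ")
    if sep and exception_key not in KEEP_LICENSE_CATEGORY:
        # Default behavior: the category of the exception prevails.
        return CATEGORIES.get(exception_key, CATEGORIES.get(license_key))
    return CATEGORIES.get(license_key)


def unit_categories(expression):
    """Categorize each unit separately; the AND expression as a whole is
    deliberately not given a combined category."""
    return [(u.strip(), unit_category(u)) for u in expression.split(" AND ")]


print(unit_categories("(GPL-2.0 WITH Classpath-exception-2.0) AND MIT"))
```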