oss-review-toolkit / ort

A suite of tools to automate software compliance checks.
https://oss-review-toolkit.org
Apache License 2.0

Add the concept of "trusted" vendors / frameworks / SDKs #5105

Open sschuberth opened 2 years ago

sschuberth commented 2 years ago

When scanning code for e.g. Android or iOS apps, there are some recurring fundamental SDK dependencies which usually contain a lot of source code. It makes little sense to scan these dependencies over and over again for (small) version changes.

To address the issue, I believe it's fair to assume that if a company decided to develop e.g. Android / iOS apps, the (business) decision to accept the use of the required SDKs has effectively already been made. With that in mind, it would be nice if ORT had a concept of "trusted vendors" where a copyright holder and license need to be specified for a whole SDK / framework, and then scanning would be skipped for all packages that belong to / match it.

This is conceptually similar to the existing mechanism of curating authors and concludedLicense in conjunction with enabling skipConcluded. A reason why this existing mechanism is not sufficient is that curations can only use ranges / wildcards as part of the version. However, in order to trust e.g. the Amazon AWS SDK as a whole, which includes a component like Maven:com.amazonaws:aws-java-sdk-models:1.12.162, we'd also need to allow wildcards for the name (at least), so that authors and concludedLicense could be curated for something like Maven:com.amazonaws:aws-*:*.

So a proposal to support the concept of "trusted" vendors / frameworks / SDKs would be to extend the curation logic to also accept wildcards / regexes for the name.
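To illustrate the proposal, here is a minimal sketch of what wildcard matching on all id components could look like. It is not ORT's implementation: the `matches` helper is hypothetical, it uses shell-style globs via Python's `fnmatch`, and it ignores the version ranges that curations support today.

```python
from fnmatch import fnmatchcase


def matches(pattern_id: str, package_id: str) -> bool:
    """Return True if each component of package_id matches the glob in pattern_id.

    Hypothetical helper: ids are assumed to have the form
    type:namespace:name:version, with no colons inside a component.
    """
    pattern_parts = pattern_id.split(":")
    package_parts = package_id.split(":")
    if len(pattern_parts) != 4 or len(package_parts) != 4:
        raise ValueError("expected ids of the form type:namespace:name:version")
    # Compare component by component, case-sensitively.
    return all(
        fnmatchcase(value, glob)
        for glob, value in zip(pattern_parts, package_parts)
    )


trusted = "Maven:com.amazonaws:aws-*:*"
print(matches(trusted, "Maven:com.amazonaws:aws-java-sdk-models:1.12.162"))
# A name without the "aws-" prefix (made up for illustration) would not match.
print(matches(trusted, "Maven:com.amazonaws:other-artifact:1.0.0"))
```

With a pattern like this, a single curation could set authors and concludedLicense for every `aws-*` artifact in the namespace.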

sschuberth commented 2 years ago

Another proposal just came to my mind: "Trusted" in this context should mean that we trust the declared license, and thus also do not need to scan to get the detected licenses, even if no explicit concluded license is present.

sschuberth commented 2 years ago

It has slipped my mind that we allow the name of a package curation to be empty, so we can conclude the license (and set authors) for whole namespaces like Maven:com.amazonaws or Maven:com.android.tools, and then leverage skipConcluded to express "trust".

So, is that mechanism maybe already good enough?

Note that this would still require separate curations for e.g. Maven:com.google.android.datatransport and Maven:com.google.android.gms etc., and we could not say Maven:com.google.android.* (or similar).
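As a rough sketch of the behavior described above (an empty or omitted component matches anything, everything else must match exactly), consider the following. The `curation_applies` helper is hypothetical and simplified, not ORT's actual code, and the package ids are examples chosen for illustration.

```python
def curation_applies(curation_id: str, package_id: str) -> bool:
    """Simplified model: empty curation components act as "match any"."""
    curation_parts = curation_id.split(":")
    # Allow shortened ids like "Maven:com.amazonaws" by padding with empty parts.
    curation_parts += [""] * (4 - len(curation_parts))
    package_parts = package_id.split(":")
    return all(
        wanted == "" or wanted == actual
        for wanted, actual in zip(curation_parts, package_parts)
    )


# A namespace-level curation covers every artifact in that exact namespace ...
print(curation_applies("Maven:com.amazonaws",
                       "Maven:com.amazonaws:aws-java-sdk-models:1.12.162"))
# ... but not sub-namespaces, because the namespace is compared as a whole.
print(curation_applies("Maven:com.google.android",
                       "Maven:com.google.android.gms:play-services-base:18.0.1"))
```

The second call returns False, which is why each of the `com.google.android.*` namespaces would need its own curation.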

sschuberth commented 1 year ago

So, is that mechanism maybe already good enough?

I believe yes. Also see https://github.com/oss-review-toolkit/ort/pull/5971.

sschuberth commented 1 year ago

So, is that mechanism maybe already good enough?

I believe yes. Also see #5971.

Turns out that curations with empty name / version are not enough to fully support this use-case, so I'm reopening this.

A problem arises e.g. with NuGet packages which generally don't have a namespace as part of their id (same for Go, Cargo, Conan, and more). See for example this list of ids:

The actual goal would be to conclude the license and set the author for all "System" packages. Having individual curations, even with empty versions, is not feasible as there are a lot of these packages. Also, leaving both the name and version empty is not an option, as that would result in NuGet::: which would match all NuGet ids, including non-"System" ones.
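The over-matching can be shown with a small sketch. It assumes the simplified rule that an empty curation component matches anything; `curation_applies` is a hypothetical helper, not ORT's code, and the ids below are illustrative rather than taken from the list referenced above.

```python
def curation_applies(curation_id: str, package_id: str) -> bool:
    """Simplified model: empty curation components act as "match any"."""
    return all(
        wanted == "" or wanted == actual
        for wanted, actual in zip(curation_id.split(":"), package_id.split(":"))
    )


# Illustrative NuGet ids; the namespace component is empty for all of them.
ids = [
    "NuGet::System.Text.Json:6.0.0",
    "NuGet::System.Memory:4.5.4",
    "NuGet::Newtonsoft.Json:13.0.1",
]

# With only the type set, nothing distinguishes "System" packages from others.
matched = [i for i in ids if curation_applies("NuGet:::", i)]
print(matched == ids)
```

Every id is matched, including the non-"System" one, so the only remaining way to narrow things down is one curation per package name.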

So we would either need to