oss-review-toolkit / ort

A suite of tools to automate software compliance checks.
https://oss-review-toolkit.org
Apache License 2.0
1.59k stars 309 forks source link

Path based license conclusions for projects #1726

Closed fviernau closed 4 years ago

fviernau commented 5 years ago

Rule violations triggered by incorrect license detections can currently be addressed as follows

For projects:

  1. create a rule violation resolution

For packages (project dependencies):

  1. create a rule violation resolution
  2. create a concluded license

Problems

Proposed feature

Below example shows a curation for all license findings of all .cpp and .h files found underneath src in lines 5-18 where CDDL-1.1 was detected. It corrects all the matched findings with CDDL-1.0

curations:
   paths:
    pattern: "/src/**{.cpp|.h}"
    start_line: 5
    line_count: 13
    detected_license: "CDDL-1.1"
    reason: MATCH_INCORRECT
        comment: "the license version actually is 1.0"
        concluded_license: "CDDL-1.0

Spec:

Advantages

Scope While this would be useful for dependencies as well limit the scope to local configurations aka ort.yml as this has highest priority currently.

Related see https://github.com/heremaps/oss-review-toolkit/issues/1130

fviernau commented 5 years ago

how about using those ort.yml properties:

curations:
   license_findings:

..and call it license finding curation

moreover, how about using list as type for start lines?

fviernau commented 5 years ago

My attempts to implemented this resulted in one bigger and one small potentiall refactoring topic:

  1. The association between license findings and copyright statements currently made in scanCode and then lost as not represented in the model (ScanSummary.kt)
  2. Whether the application of the curations results in a new LicenseView e.g. which would satisfy a refactoring to generalize the implementation (which imo would already make sense anyway due to redundancy, as indicator just see the almost identical test cases which are heavily redundant).

Association between license findings and copyrights

  1. The probably simplest approach (and imo only approach which keeps this ticket reasonably small) would be to recompute the association between license findings and copyright, e.g. in a function similar to OrtResult.collectLicenseFindings
  2. Do a minimal invasive change to the model so that the relation between license findings and copyright findings is represented.
  3. Don't make an association of the findings in the Scanner / ScanResult at all and do this on the fly instead via adding a function to OrtResult. On the go eliminate grouped finding classes from the model replacing it with classes representing a single finding.
    • current representation:
      "license_findings" : [ {
            "license" : "ISC",
            "locations" : [ {
              "path" : "LICENSE",
              "start_line" : 1,
              "end_line" : 15
            } ],
            "copyrights" : [ {
              "statement" : "Copyright (c) Isaac Z. Schlueter and Contributors",
              "locations" : [ {
                "path" : "LICENSE",
                "start_line" : 3,
                "end_line" : 3
              } ]
            } ]
          } ]
    • proposal (separate licenses and copyrights + singular instead of plural location(s))
      "licenses" : [
      {
            "license" : "ISC",
            "location" : [ {
              "path" : "LICENSE",
              "start_line" : 1,
              "end_line" : 15
      } ],
      "copyrights" : [ 
      {
              "statement" : "Copyright (c) Isaac Z. Schlueter and Contributors",
              "location" : [ {
                "path" : "LICENSE",
                "start_line" : 3,
                "end_line" : 3
      }
      ]

Rationale:

New LicenseView

generalization:

LicenseView(getAllConcludedLicenses(), getAllDetectedLicenses())
LicenseView(getAllConcludedLicenses(), getAllDetectedLicenses() + getAllDeclaredLicenses())
fviernau commented 5 years ago

The very basic feature is implemented and of course already usable. This MVP for now applies the curations at a single place and uses curated detected licenses instead of raw detected licenses all over the place, e.g. Reporters/UI and rules/evaluator . What is planned still is

Here is how the basic feature was implemented:

https://github.com/heremaps/oss-review-toolkit/pull/1856 https://github.com/heremaps/oss-review-toolkit/pull/1846 https://github.com/heremaps/oss-review-toolkit/pull/1844 https://github.com/heremaps/oss-review-toolkit/pull/1841 https://github.com/heremaps/oss-review-toolkit/pull/1813 https://github.com/heremaps/oss-review-toolkit/pull/1812 https://github.com/heremaps/oss-review-toolkit/pull/1795 https://github.com/heremaps/oss-review-toolkit/pull/1792 https://github.com/heremaps/oss-review-toolkit/pull/1763 https://github.com/heremaps/oss-review-toolkit/pull/1762 https://github.com/heremaps/oss-review-toolkit/pull/1759 https://github.com/heremaps/oss-review-toolkit/pull/1758 https://github.com/heremaps/oss-review-toolkit/pull/1756 https://github.com/heremaps/oss-review-toolkit/pull/1750 https://github.com/heremaps/oss-review-toolkit/pull/1744 https://github.com/heremaps/oss-review-toolkit/pull/1739 https://github.com/heremaps/oss-review-toolkit/pull/1736

fviernau commented 4 years ago

Closing this ticket as the basic feature has been merged to master. Identified next steps can be: