clearlydefined / service

The service side of clearlydefined.io
MIT License

Failure calling curation patch API with certain license expressions #1181

Open qtomlinson opened 3 weeks ago

qtomlinson commented 3 weeks ago

Expected: It should be possible to call patch /curate with a payload that includes valid license expressions.

Observed: Error message: {"errors": [[{"message": "Invalid license in curation","error": "2.25.1 licensed.declared with value \"GPL-2.0-Only AND GPL-3.0-Or-Later AND LGPL-2.1-Only AND MPL-1.1\" is not SPDX compliant"}]],"patchesInError": [{"coordinates": {"type": "maven","provider": "mavencentral","namespace": "org.glassfish.jersey.core","name": "jersey-client"},"revisions": {"2.25.1": {"licensed": {"declared": "GPL-2.0-Only AND GPL-3.0-Or-Later AND LGPL-2.1-Only AND MPL-1.1"}}}}]}

Steps to reproduce: curl -X PATCH "https://dev-api.clearlydefined.io/curations" -H "accept: application/json" -H "Content-Type: application/json" -d "{\"contributionInfo\":{\"summary\":\"test\",\"details\":\"test details\",\"resolution\":\"test resolution\",\"type\":\"Missing\"},\"patches\":[{\"coordinates\":{\"type\":\"maven\",\"provider\":\"mavencentral\",\"namespace\":\"org.glassfish.jersey.core\",\"name\":\"jersey-client\"},\"revisions\":{\"2.25.1\":{\"licensed\":{\"declared\":\"GPL-2.0-Only AND GPL-3.0-Or-Later AND LGPL-2.1-Only AND MPL-1.1\"}}}}]}"

yashkohli88 commented 3 weeks ago

I debugged the curations API and found that each license name must match the identifier in the SPDX list exactly, because the service uses the SPDX parser to identify licenses.

Another thing to watch for when curating complex license expressions is that some of them need brackets. One such case is described below.
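To illustrate the exact-match requirement, here is a minimal sketch that flags identifiers whose casing differs from the SPDX list. The `KNOWN_IDS` subset and the `findMiscasedIds` helper are hypothetical and written only for illustration; the service itself relies on the SPDX parser, not on this code.

```javascript
// Hypothetical subset of the SPDX license list, for illustration only.
const KNOWN_IDS = ['GPL-2.0-only', 'GPL-3.0-or-later', 'LGPL-2.1-only', 'MPL-1.1'];

// Map lower-cased identifiers to their canonical spelling.
const CANONICAL = new Map(KNOWN_IDS.map(id => [id.toLowerCase(), id]));

const OPERATORS = new Set(['AND', 'OR', 'WITH']);

// Return { found, expected } for every identifier that matches a known
// identifier only when case is ignored.
function findMiscasedIds(expression) {
  return expression
    .replace(/[()]/g, ' ')
    .split(/\s+/)
    .filter(token => token && !OPERATORS.has(token))
    .filter(token => !KNOWN_IDS.includes(token) && CANONICAL.has(token.toLowerCase()))
    .map(token => ({ found: token, expected: CANONICAL.get(token.toLowerCase()) }));
}

// The expression from the failing request in this issue.
const issues = findMiscasedIds(
  'GPL-2.0-Only AND GPL-3.0-Or-Later AND LGPL-2.1-Only AND MPL-1.1'
);
console.log(issues);
```

Run against the expression from the original report, this flags the three identifiers ending in `-Only` or `-Or-Later` and leaves `MPL-1.1` alone.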

Below is the license expression used to curate these three licenses successfully. Here is the PR for the curation, which I tried on my local system and personal repo: https://github.com/yashkohli88/curated-data-dev/pull/4

License expression: GPL-2.0-only OR (LGPL-2.1-only OR MPL-1.1)

The difference between the parsed license expression and the input license expression arises at Line 79.

The image below, with a breakpoint on the same line, confirms this.

image

The normalize() function is imported from normalize()

The dependency responsible for this difference is the parse function in spdx-expression-parse.
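The round-trip mismatch can be illustrated with a toy model. The sketch below is not the code of spdx-expression-parse or of the service. `parseOr` and `stringify` are hypothetical helpers that assume the parser groups a chain of OR operators to the right and that stringification wraps a nested right-hand side in brackets, which is consistent with the bracketed expression that curated successfully above.

```javascript
// Toy model only: handles chains of OR, nothing else.
// Parentheses in the input are discarded, so only flat or right-grouped
// chains are modeled. Assumption: OR chains are grouped to the right.
function parseOr(expression) {
  const ids = expression
    .replace(/[()]/g, ' ')
    .split(/\s+/)
    .filter(token => token && token !== 'OR');
  // Fold from the right: A OR (B OR C).
  return ids
    .slice(0, -1)
    .reduceRight(
      (right, id) => ({ left: { license: id }, conjunction: 'or', right }),
      { license: ids[ids.length - 1] }
    );
}

// Assumption: a nested right-hand side is wrapped in brackets when printed.
function stringify(node) {
  if (node.license) return node.license;
  const right = node.right.license ? node.right.license : `(${stringify(node.right)})`;
  return `${stringify(node.left)} OR ${right}`;
}

const flat = 'GPL-2.0-only OR LGPL-2.1-only OR MPL-1.1';
const roundTripped = stringify(parseOr(flat));

console.log(roundTripped);          // bracketed form
console.log(roundTripped === flat); // the flat input does not survive the round trip
```

Under these assumptions, a check that compares the input with its normalized form would reject the flat expression and accept the bracketed one.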

qtomlinson commented 3 weeks ago

Found a couple of issues:

  1. SPDX.stringify (called by SPDX.normalize)

  2. != SPDX.normalized

    • Question 1, contains NOASSERTION, should not be allowed to curate.
    • Question 2,