clearlydefined / crawler

A service that crawls projects and packages for information relevant to ClearlyDefined
MIT License

License not correctly picked up for source archive type of packages #533

Closed qtomlinson closed 5 months ago

qtomlinson commented 6 months ago

Test case: https://clearlydefined.io/definitions/sourcearchive/mavencentral/org.apache.httpcomponents/httpcore/4.1

Expected: Declared license to be Apache-2.0

Actual: Declared license is empty
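To reproduce the check without the web UI, the declared license can be read from the definition JSON. The following is a minimal sketch, not part of the crawler or service code. It assumes that the public API is served from `https://api.clearlydefined.io`, that it mirrors the website's `/definitions/...` path, and that the definition stores the declared license under `licensed.declared`. The helper names `api_url_from_web_url` and `declared_license` are hypothetical.

```python
from typing import Any, Mapping, Optional

# Assumption: public API host; the issue itself only gives the website URL.
API_BASE = "https://api.clearlydefined.io"
WEB_PREFIX = "https://clearlydefined.io/definitions/"


def api_url_from_web_url(web_url: str) -> str:
    """Map a clearlydefined.io definition page URL to the assumed API URL."""
    if not web_url.startswith(WEB_PREFIX):
        raise ValueError(f"not a definition URL: {web_url}")
    coordinates = web_url[len(WEB_PREFIX):].strip("/")
    return f"{API_BASE}/definitions/{coordinates}"


def declared_license(definition: Mapping[str, Any]) -> Optional[str]:
    """Return the declared license, or None when it is missing or empty.

    Assumption: the value lives under definition["licensed"]["declared"].
    """
    licensed = definition.get("licensed") or {}
    return licensed.get("declared") or None


# Usage (performs a network request, so it is left commented out):
#   import json, urllib.request
#   url = api_url_from_web_url(
#       "https://clearlydefined.io/definitions/sourcearchive/mavencentral/"
#       "org.apache.httpcomponents/httpcore/4.1")
#   with urllib.request.urlopen(url) as response:
#       print(declared_license(json.load(response)))
```

With the bug present, `declared_license` would be expected to return `None` for the test case above; after the fix it should return `Apache-2.0`.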

httpcore-4.1-sources.jar is available at https://repo1.maven.org/maven2/org/apache/httpcomponents/httpcore/4.1 and its META-INF/LICENSE.txt contains the Apache License, Version 2.0.
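The license file can also be inspected locally once the sources jar has been downloaded. This is a rough sketch only: the helper names `read_license_text` and `looks_like_apache_2` are hypothetical, and the text match is a crude heuristic for manual verification, not how Licensee or ScanCode detect licenses.

```python
import zipfile
from typing import BinaryIO, Optional, Union

# Entry name taken from the issue description.
LICENSE_ENTRY = "META-INF/LICENSE.txt"


def read_license_text(
    jar: Union[str, BinaryIO], entry: str = LICENSE_ENTRY
) -> Optional[str]:
    """Return the text of the license entry in a jar, or None if absent.

    A jar is a zip archive, so the standard zipfile module can read it.
    """
    with zipfile.ZipFile(jar) as archive:
        try:
            raw = archive.read(entry)
        except KeyError:  # raised when the entry is not in the archive
            return None
    return raw.decode("utf-8", errors="replace")


def looks_like_apache_2(text: str) -> bool:
    """Crude heuristic: both marker phrases appear in the license text."""
    normalized = " ".join(text.split()).lower()
    return "apache license" in normalized and "version 2.0" in normalized


# Usage, assuming the jar was saved next to this script:
#   text = read_license_text("httpcore-4.1-sources.jar")
#   print(text is not None and looks_like_apache_2(text))
```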

In the persisted harvested data, both Licensee and ScanCode detected Apache-2.0. The detection appears to be lost after the data is processed by the summarizers.

qtomlinson commented 6 months ago

[image]

qtomlinson commented 5 months ago

The fix has been merged and verified on dev. [image]