Open ariel11 opened 3 years ago
Here's another example where the license info in setup.py was not detected:
setup.py: https://clearlydefined.io/file/bdc6a2cd7c7d2450efacf5b0f3124118e7f8de6671bde4ae7d4ae2654a270858
Definition: https://clearlydefined.io/definitions/pypi/pypi/-/astroid/2.3.3
Another example where license info on files is not being detected. Adding @pombredanne for thoughts on scancode.
https://clearlydefined.io/definitions/nuget/nuget/-/CsvHelper/2.16.0
License info on file in src folder
License info not showing up on files
FYI @nellshamrell
Queued up a pypi/pypi/-/astroid harvest in my local environment. It looks like it's encountering an error:
service_1 | GET /definitions/pypi/pypi/-/astroid/2.3.3?expand=prs&matchCasing=false 200 605.385 ms - 471
crawler_1 | POST /requests 201 0.910 ms - 7
crawler_1 | [I] Traversed component@cd:/pypi/pypi/-/astroid/2.3.3 {"loopName":"0","cid":"9o","root":"self","outcome":"Traversed","time":2,"crawlerId":"da62d530-3d93-4de8-936b-098a9b84bf2e","buildNumber":"0"}
crawler_1 | [I] Traversed package@cd:/pypi/pypi/-/astroid/2.3.3 {"loopName":"0","cid":"9p","root":"component@cd:/pypi/pypi/-/astroid/2.3.3","outcome":"Traversed","time":0,"crawlerId":"da62d530-3d93-4de8-936b-098a9b84bf2e","buildNumber":"0"}
crawler_1 | SourceDiscovery provider could not be found for https://pypi.org/project/astroid/
crawler_1 | [I] Processed pypi@cd:/pypi/pypi/-/astroid/2.3.3 {"loopName":"0","cid":"9q","root":"component@cd:/pypi/pypi/-/astroid/2.3.3","k":1657,"count":228,"write":7,"outcome":"Processed","time":2075,"crawlerId":"da62d530-3d93-4de8-936b-098a9b84bf2e","buildNumber":"0"}
crawler_1 | [I] Processed licensee@cd:/pypi/pypi/-/astroid/2.3.3 {"loopName":"0","cid":"9r","root":"component@cd:/pypi/pypi/-/astroid/2.3.3","write":2,"outcome":"Processed","time":3086,"crawlerId":"da62d530-3d93-4de8-936b-098a9b84bf2e","buildNumber":"0"}
crawler_1 | [I] Analyzing scancode@cd:/pypi/pypi/-/astroid/2.3.3 using ScanCode. input: /tmp/cd-lXAshv output: /tmp/cd-nviXNU {"crawlerId":"da62d530-3d93-4de8-936b-098a9b84bf2e","buildNumber":"0"}
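To make the log above easier to read, here is a minimal sketch that groups the crawler lines by tool and reports any tool that logged "Analyzing" without a later "Processed". The regex is an assumption based only on the line shapes in this excerpt, and the helper names `summarize` and `unfinished` are hypothetical. Note that the excerpt may simply end before ScanCode finished, so a hit here is a hint rather than proof of failure.

```python
import re

# Assumed line shape, taken from the excerpt above:
#   crawler_1 | [I] Processed licensee@cd:/pypi/pypi/-/astroid/2.3.3 {...}
LINE_RE = re.compile(
    r"\[I\]\s+(?P<outcome>Traversed|Processed|Analyzing)\s+"
    r"(?P<tool>[\w-]+)@cd:/(?P<coords>\S+)"
)


def summarize(log_lines):
    """Map each tool name to the list of outcomes logged for it, in order."""
    seen = {}
    for line in log_lines:
        match = LINE_RE.search(line)
        if match:
            seen.setdefault(match.group("tool"), []).append(match.group("outcome"))
    return seen


def unfinished(summary):
    """Return tools that logged 'Analyzing' but never 'Processed'."""
    return sorted(
        tool
        for tool, outcomes in summary.items()
        if "Analyzing" in outcomes and "Processed" not in outcomes
    )


# Lines abbreviated from the log above (JSON payloads dropped).
sample = [
    "crawler_1 | [I] Traversed component@cd:/pypi/pypi/-/astroid/2.3.3",
    "crawler_1 | [I] Traversed package@cd:/pypi/pypi/-/astroid/2.3.3",
    "crawler_1 | SourceDiscovery provider could not be found for https://pypi.org/project/astroid/",
    "crawler_1 | [I] Processed pypi@cd:/pypi/pypi/-/astroid/2.3.3",
    "crawler_1 | [I] Processed licensee@cd:/pypi/pypi/-/astroid/2.3.3",
    "crawler_1 | [I] Analyzing scancode@cd:/pypi/pypi/-/astroid/2.3.3 using ScanCode.",
]

print(unfinished(summarize(sample)))
```

On these sample lines, only the scancode entry has an "Analyzing" line with no matching "Processed" line.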
I didn't get that error when running a local harvest of nuget/nuget/-/CsvHelper/2.16.0. However, I was able to replicate the behavior @ariel11 was seeing. Declaring this reproducible.
Here's an example of a file with the MIT license (https://clearlydefined.io/file/2893c762b244363021756d2bfa004c1402eb641a7b02a4cea8bd37ebddcf68c1) where ClearlyDefined did not flag it as a file with license info and therefore did not include "MIT" in the "discovered" field.
It's unclear why ScanCode wasn't run on this file.
The license info could also be obtained by following the "Source" link to the commit, where the LICENSE file is MIT.
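For anyone checking other components for the same symptom, here is a rough sketch that takes an already-downloaded definition (as a Python dict) and lists files with no license recorded, along with the discovered expressions. The field names (`files`, `path`, `license`, `licensed.facets.core.discovered.expressions`) are assumptions about the definition JSON shape and should be checked against a real response. The sample data and file paths below are made up for illustration and are not taken from the CsvHelper or astroid definitions.

```python
def files_without_license(definition):
    """Return paths of files that have no (or an empty) 'license' entry."""
    return [
        entry.get("path")
        for entry in definition.get("files", [])
        if not entry.get("license")
    ]


def discovered_expressions(definition):
    """Return the discovered license expressions for the core facet, if any."""
    core = definition.get("licensed", {}).get("facets", {}).get("core", {})
    return core.get("discovered", {}).get("expressions", [])


# Hypothetical definition fragment; the shape is assumed, the values are invented.
sample_definition = {
    "licensed": {
        "facets": {"core": {"discovered": {"unknown": 2, "expressions": ["NOASSERTION"]}}}
    },
    "files": [
        {"path": "src/Example.cs"},                   # no license detected
        {"path": "LICENSE.txt", "license": "MIT"},    # license detected
        {"path": "src/Other.cs", "license": ""},      # empty value counts as missing
    ],
}

print(files_without_license(sample_definition))
print(discovered_expressions(sample_definition))
```

If a file is known to carry an MIT header yet shows up in the first list, and "MIT" is absent from the second, that matches the behavior described in this issue.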