oss-review-toolkit / ort-config

Curations and configuration files for the OSS Review Toolkit.
Apache License 2.0

license-classifications.yml contains duplicate id Unlicense and LicenseRef-scancode-unlicense #217

Open mawl opened 1 month ago

mawl commented 1 month ago

While upgrading our own license-classifications.yml, I noticed that there are two entries for what may be the same license in your license-classifications.yml: Unlicense and LicenseRef-scancode-unlicense.

LicenseRef-scancode-unlicense can't be found via the web UI of scancode-licensedb and isn't used in any of your curations.

I haven't checked this for other IDs. The only remarkable thing is that the file has grown from ~7400 to ~11500 lines since we last updated around February 2023.
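A check for other IDs of this kind could be sketched roughly as follows. This is only a minimal sketch, not part of ORT: it assumes every entry in license-classifications.yml starts with a line of the form `- id: "..."`, and the helper names `extract_ids` and `find_prefix_twins` as well as the sample categories are made up for illustration. It groups IDs that become equal once the `LicenseRef-scancode-` prefix and letter case are ignored.

```python
import re
from collections import defaultdict

# Assumption: each entry starts with a line like `- id: "Unlicense"`.
ID_LINE = re.compile(r'^\s*-\s*id:\s*["\']?([^"\'\s]+)["\']?\s*$')
SCANCODE_PREFIX = "LicenseRef-scancode-"


def extract_ids(yaml_text):
    """Collect license IDs from `- id: ...` lines without a YAML parser."""
    ids = []
    for line in yaml_text.splitlines():
        match = ID_LINE.match(line)
        if match:
            ids.append(match.group(1))
    return ids


def find_prefix_twins(ids):
    """Group IDs that are equal once the ScanCode prefix and case are ignored."""
    groups = defaultdict(list)
    for license_id in ids:
        key = license_id
        if key.startswith(SCANCODE_PREFIX):
            key = key[len(SCANCODE_PREFIX):]
        groups[key.lower()].append(license_id)
    # Only groups with more than one member are potential duplicates.
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical excerpt; the category names are invented for this example.
sample = """\
categorizations:
- id: "MIT"
  categories:
  - "permissive"
- id: "Unlicense"
  categories:
  - "public-domain"
- id: "LicenseRef-scancode-unlicense"
  categories:
  - "public-domain"
"""

twins = find_prefix_twins(extract_ids(sample))
print(twins)
```

A hit only means the two IDs look alike. Whether they really denote the same license still has to be checked by hand.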

sschuberth commented 1 month ago

License classifications are now auto-generated by importing ScanCode's LicenseDB, which last happened ~3 weeks ago. While that may explain the increase in classifications, it does not explain why we have the Unlicense twice. Maybe @fviernau, who wrote the ScanCodeLicenseDbClassifications code, has an idea?

fviernau commented 1 month ago

I'll need to look into this in detail later.

As a general remark, I believe we should have one classification per distinct license identifier. So, if there are two IDs for the same license, two classifications are needed.

mawl commented 1 month ago

@fviernau: LicenseRef-scancode-unlicense has no match in the web UI or in the JSON either, so it seems that this classification is obsolete.
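The comparison against the JSON could be automated along these lines. This is a rough sketch under assumptions, not the ScanCodeLicenseDbClassifications code: it assumes the LicenseDB index has already been loaded into a list of dicts, that each dict has a `license_key` field and an optional `spdx_license_key` field, and that a license without an SPDX key is identified as `LicenseRef-scancode-<license_key>`. The function names and the sample entries are hypothetical.

```python
SCANCODE_PREFIX = "LicenseRef-scancode-"


def known_ids(index_entries):
    """Build the set of IDs the index would yield under the stated assumptions."""
    known = set()
    for entry in index_entries:
        spdx_key = entry.get("spdx_license_key")
        if spdx_key:
            # Prefer the SPDX identifier when the index provides one.
            known.add(spdx_key)
        else:
            # Otherwise fall back to the prefixed ScanCode key.
            known.add(SCANCODE_PREFIX + entry["license_key"])
    return known


def obsolete_ids(classified_ids, index_entries):
    """Return classified IDs that have no counterpart in the index."""
    known = known_ids(index_entries)
    return [license_id for license_id in classified_ids if license_id not in known]


# Hypothetical index entries; "example-key" does not refer to a real license.
index_entries = [
    {"license_key": "mit", "spdx_license_key": "MIT"},
    {"license_key": "unlicense", "spdx_license_key": "Unlicense"},
    {"license_key": "example-key", "spdx_license_key": None},
]
classified = [
    "MIT",
    "Unlicense",
    "LicenseRef-scancode-unlicense",
    "LicenseRef-scancode-example-key",
]

stale = obsolete_ids(classified, index_entries)
print(stale)
```

Under these assumptions, any ID reported here would be a candidate for removal, pending a manual look at the actual LicenseDB data.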