fsfe / reuse-tool

reuse is a tool for compliance with the REUSE recommendations.
https://reuse.software

Detect wrong identifiers and provide suggestions for them in addheader #364

Open hoijui opened 3 years ago

hoijui commented 3 years ago

When a BAD license is detected, suggest similarly named, valid licenses, if any. For example:

hoijui commented 3 years ago

Later I noticed that there are already suggestions like that (I got one when trying a "BSD" license). I guess it could still be improved; see my example above: CC0 does not result in a suggestion.

nicorikken commented 3 years ago

I agree. Similarly, there is the distinction between GPL-3.0 and GPL-3.0-or-later. If you naively type GPL-3.0, you might be registering the wrong license. Considering this is all user input, some validation would be nice. I'm pretty sure there is some Python fuzzy text matching library available to find similar identifiers and ask the user to confirm.
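To illustrate the fuzzy matching idea, here is a minimal sketch using `difflib` from the Python standard library, so no extra dependency would be needed. This is not how reuse does it; the function name `suggest_similar` and the short `KNOWN_LICENSES` sample are hypothetical, and a real check would presumably compare against the full SPDX license list.

```python
import difflib

# Hypothetical sample of SPDX identifiers; a real check would need the full list.
KNOWN_LICENSES = [
    "GPL-3.0-only",
    "GPL-3.0-or-later",
    "CC0-1.0",
    "MIT",
    "BSD-3-Clause",
]


def suggest_similar(identifier, known=KNOWN_LICENSES, limit=3, cutoff=0.6):
    """Return up to `limit` known identifiers that look similar to `identifier`.

    Returns an empty list when the identifier is already known, because
    there is nothing to correct in that case.
    """
    if identifier in known:
        return []
    # get_close_matches ranks candidates by SequenceMatcher similarity and
    # drops everything scoring below `cutoff`.
    return difflib.get_close_matches(identifier, known, n=limit, cutoff=cutoff)


if __name__ == "__main__":
    for candidate in suggest_similar("GLP-3.0-ro-ltaer"):
        print(f"Did you mean {candidate}?")
```

The cutoff of 0.6 is just the `difflib` default. It would probably need tuning, since short inputs score low against longer identifiers even when they are an obvious abbreviation.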

siiptuo commented 2 years ago

This was implemented in #152. Originally the suggestions were better, but adding a dependency on a fuzzy string matching library was not possible, so the functionality was simplified. Currently the suggestions work well only on typos such as GLP-3.0-ro-ltaer.

I agree that the suggestions should be improved. Is there any fuzzy string matching library we could use? Otherwise, the current solution could be modified to match prefixes, so that inputting GPL-3.0 would suggest GPL-3.0-only and GPL-3.0-or-later, for instance.
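A rough sketch of what prefix matching could look like, to make the idea concrete. The names `suggest_by_prefix` and `KNOWN_LICENSES` are hypothetical and not taken from the reuse code base, and the list is only a small sample that deliberately leaves out the bare GPL-3.0 identifier.

```python
# Hypothetical sample of SPDX identifiers; a real check would need the full list.
KNOWN_LICENSES = [
    "GPL-2.0-only",
    "GPL-2.0-or-later",
    "GPL-3.0-only",
    "GPL-3.0-or-later",
    "CC0-1.0",
    "MIT",
]


def suggest_by_prefix(identifier, known=KNOWN_LICENSES):
    """Return all known identifiers that start with `identifier`.

    The comparison is case-insensitive. An empty input or an already
    known identifier yields no suggestions.
    """
    if not identifier or identifier in known:
        return []
    needle = identifier.lower()
    return [name for name in known if name.lower().startswith(needle)]


if __name__ == "__main__":
    print(suggest_by_prefix("GPL-3.0"))
```

Prefix matching alone would not catch typos, so it would more likely complement the existing typo suggestions than replace them.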

mxmehl commented 2 years ago

Oh, I almost forgot the feature you once added, @siiptuo. Wouldn't that already be a great start for a check in addheader, especially after the merge of #416?

mxmehl commented 2 years ago

I renamed this issue to track the effort of bringing the functionality that we've had since #416 to addheader as well. Currently there is no sanity check at all in addheader.