Open broeder-j opened 2 years ago
Yes, this is indeed an issue; the same goes for CONTRIBUTORS and MAINTAINERS. Codemetapy only supports a certain simple list format that is often used (see https://github.com/proycon/codemetapy/blob/master/codemeta/parsers/authors.py), and tries to be fairly flexible.
I'd rather not turn it off, but I think we need some extra validation, and should ignore AUTHORS/CONTRIBUTORS/MAINTAINERS files that are too different from what we expect.
This solution would also be fine, of course.
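To make the "extra validation" idea concrete, here is a rough sketch of what such a check could look like. It is not codemetapy's actual parser: the function names `extract_name` and `parse_authors_file`, the `min_match_ratio` threshold, and the "2 to 4 capitalized words" heuristic are all assumptions made up for illustration. The idea is to ignore the whole file when too few of its lines look like a simple name entry.

```python
import re

# Hypothetical helpers, not part of codemetapy.
EMAIL_SUFFIX = re.compile(r"\s*<[^<>\s]+@[^<>\s]+>$")  # trailing "<user@host>"
BULLET_PREFIX = re.compile(r"^[-*]\s+")                # leading "- " or "* "


def extract_name(line):
    """Return a plausible person name from one line, or None."""
    entry = BULLET_PREFIX.sub("", line.strip())
    entry = EMAIL_SUFFIX.sub("", entry)
    words = entry.split()
    # Assumed heuristic: a name is 2 to 4 capitalized, purely alphabetic words.
    if not 2 <= len(words) <= 4:
        return None
    for word in words:
        letters = word.replace("-", "").replace(".", "").replace("'", "")
        if not letters.isalpha() or not word[0].isupper():
            return None
    return " ".join(words)


def parse_authors_file(text, min_match_ratio=0.8):
    """Return the names found, or [] if the file is too far from a simple list."""
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    if not lines:
        return []
    names = [extract_name(ln) for ln in lines]
    found = [n for n in names if n is not None]
    # Ignore the whole file when too few lines look like name entries.
    if len(found) / len(lines) < min_match_ratio:
        return []
    return found
```

A heuristic like this will still miss names with lowercase particles ("van der ...") and can accept short capitalized phrases, so the threshold and the name rule would need tuning against real files.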
The AUTHORS file is currently parsed by codemeta-harvester. However, this file is pretty much free-format: people write free text in there and lists look completely different, so I think there is no way to parse names from this file reliably.
Currently one ends up with all kinds of wrong authors like:
This needs to be done smarter. Currently I have no good idea except to switch it off and get the authors from the Citation file and then from the git history by default. Maybe exclude people whose names are not found somewhere in a given AUTHORS file, to sort out some strange contributors from the git history.
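A minimal sketch of that last idea, assuming the contributor names have already been collected from the git history (for example with `git log --format=%aN`). Instead of parsing the AUTHORS file, it only checks whether each git name occurs somewhere in the file's text. The function names `normalize` and `filter_contributors` are hypothetical and not part of codemeta-harvester or codemetapy.

```python
import re
import unicodedata


def normalize(text):
    """Strip accents, casefold and collapse whitespace, so 'José' matches 'Jose'."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.casefold().split())


def filter_contributors(git_names, authors_text):
    """Keep only the git contributors whose name appears in the AUTHORS text."""
    haystack = normalize(authors_text)
    kept = []
    for name in git_names:
        needle = normalize(name)
        if not needle:
            continue
        # Match whole words only, so a short name cannot match inside another word.
        pattern = r"(?<!\w)" + re.escape(needle) + r"(?!\w)"
        if re.search(pattern, haystack):
            kept.append(name)
    return kept
```

For example, with git names `["Jane Doe", "dependabot[bot]", "José García", "root"]` and an AUTHORS file containing "Written by Jane Doe and Jose Garcia", only the two real people would be kept. Contributors who use a different spelling or a nickname in git would be dropped, so this works as a filter on top of the git history and not as a replacement for it.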