Open aoern opened 3 years ago
It looks like not all source datasets have been reimported and resynced since we changed the code to keep the exact verbatim authorship. The given example, Duguetia ruboides from AnnonBase, is still from 2019: https://api.catalogue.life/dataset/1040/taxon/t48022
You can see the author is still badly parsed, but the important authorship string does not change: http://api.catalogue.life/parser/name?name=Abies&authorship=Maas%20%26%20He
The two-letter author error came into existence because of this: https://github.com/gbif/name-parser/issues/28
I will make sure ampersands are excluded and dots are required for the gbif/name-parser#28 patch to apply.
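To make the intended guard concrete, here is a minimal sketch in Python. It is not the gbif/name-parser code (which is Java), and it rests on one reading of the sentence above: a two-letter token is treated as a pair of initials only if both letters carry dots and the authorship string contains no ampersand. The function name `may_expand_as_initials` and the regex are hypothetical.

```python
import re

# Hypothetical guard illustrating the rule described above; this is an
# assumption about the intended behaviour, not the actual parser code.
# Exactly two single letters, each followed by a dot (e.g. "H.E.").
_DOTTED_INITIALS = re.compile(r"^(?:[A-Za-z]\.){2}$")


def may_expand_as_initials(token: str, authorship: str) -> bool:
    """Return True only if `token` may be treated as two initials."""
    # An ampersand means the string lists several authors, so a short
    # token such as "He" is more likely a surname than initials.
    if "&" in authorship:
        return False
    # Require explicit dots; a bare "He" or "Wu" is left untouched.
    return bool(_DOTTED_INITIALS.match(token))
```

Under this reading, `may_expand_as_initials("He", "Maas & He")` is rejected on both counts (ampersand present, no dots), so "Maas & He" would be kept verbatim.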
@yroskov @gdower There are 202 erroneous authorship strings in the Sep 1 edition (DwC) due to misparsing of two-letter author names. Some examples:
- Duguetia ruboides H.e.Maas in AnnonBase (should be Maas & He)
- Alara improba W.u.Yang, 1993 in FLOW (should be Yang & Wu)
- Acrolithus brevis M.a.Freytag, 1988 in MOWD (should be Freytag & Ma)
The complete list is here: CoLTwoLetterErrors.xlsx
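
For anyone who wants to re-check an export for this pattern, here is a rough Python sketch. It assumes the misparsed form always looks like the three examples above: an uppercase letter, a dot, a lowercase letter, a dot, then the surname. The names `MISPARSED` and `suggest_fix` are made up for illustration, and any hit is only a candidate that still needs manual review.

```python
import re

# Heuristic for the misparsed shape seen in the examples, e.g. "H.e.Maas":
# uppercase letter, dot, lowercase letter, dot, then a surname.
MISPARSED = re.compile(
    r"^(?P<first>[A-Z])\.(?P<second>[a-z])\.(?P<surname>[^\W\d_]+)"
)


def suggest_fix(authorship):
    """Return a suggested 'Surname & Xy' string, or None if no match."""
    m = MISPARSED.match(authorship)
    if m is None:
        return None
    # Rejoin the two letters that were split into fake initials.
    second_author = m.group("first") + m.group("second")
    # Keep whatever followed the surname (for example ", 1993").
    rest = authorship[m.end():]
    return f"{m.group('surname')} & {second_author}{rest}"


for s in ["H.e.Maas", "W.u.Yang, 1993", "M.a.Freytag, 1988", "L."]:
    print(s, "->", suggest_fix(s))
```

The year is carried over unchanged, so "W.u.Yang, 1993" becomes "Yang & Wu, 1993".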