Open yroskov opened 7 months ago
It looks like the issue of handling [sic] names needs coordination with @gdower.
For the vast majority of synonyms this seems reasonable to do. But there are also a few accepted names with [sic] - what to do with these?
Interestingly, the first accepted one I checked is a WoRMS genus that is itself listed as a misspelling on the WoRMS site: https://www.molluscabase.org/aphia.php?p=taxdetails&id=536363
> there are also a few accepted names with [sic] - what to do with these?
I would give "provisionally accepted" status to all accepted names with a [sic] portion.
When interpreting names we actually remove sic already and keep a flag originalSpelling instead on the name: https://github.com/CatalogueOfLife/backend/issues/501
This is then rendered as [sic] again in the label.
The examples you gave don't do that though; they have [sic] as part of their authorship instead: https://api.checklistbank.org/dataset/286246/name/Z0ZOIgGBCF8_D_s4WIeCc
As this is from a January release and WoRMS datasets are updated pretty much every month, it must still be interpreting things wrongly. It does not happen in tests, so it is difficult to reproduce.
Ah, I can reproduce it when sic is supplied as authorship and not the scientificName!
The name interpreter should now look for sic and corrig statements inside the authorship too. That means the originalSpelling flag is populated and sic is shown in the label, but sic is not treated as an epithet or as part of the authorship string.
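To make the intended behaviour concrete, here is a minimal Python sketch, not the actual backend code (which is Java). The names `ParsedName`, `interpret` and `_strip_marker` are hypothetical, the regular expression is only a guess at the marker variants that occur in real data, and mapping sic to `True` and corrig to `False` is an assumption about how the originalSpelling flag is encoded.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed marker variants: "sic", "[sic]", "(sic)", "[sic!]", "corrig", "corrig."
_MARKER = re.compile(r"[\[(]?\s*\b(sic|corrig)\b\s*[.!]?\s*[\])]?", re.IGNORECASE)


@dataclass
class ParsedName:
    scientific_name: str
    authorship: Optional[str]
    # Assumption: True = sic, False = corrig, None = no marker found.
    original_spelling: Optional[bool]


def _strip_marker(text: str) -> Tuple[str, Optional[str]]:
    """Remove the first sic/corrig marker; return (cleaned text, marker or None)."""
    m = _MARKER.search(text)
    if not m:
        return text.strip(), None
    cleaned = text[:m.start()] + " " + text[m.end():]
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" ,")
    return cleaned, m.group(1).lower()


def interpret(scientific_name: str, authorship: Optional[str] = None) -> ParsedName:
    """Look for the marker in the name AND in the authorship, and keep it out of both."""
    name, name_marker = _strip_marker(scientific_name)
    auth, auth_marker = _strip_marker(authorship) if authorship else (None, None)
    marker = name_marker or auth_marker
    flag = None if marker is None else (marker == "sic")
    return ParsedName(name, auth or None, flag)


# Hypothetical inputs: the marker ends up in the flag, not in the strings.
print(interpret("Aus bus", "[sic] Smith, 1900"))
print(interpret("Aus bus [sic]", "Smith, 1900"))
```

With this approach the marker only ever lives in the flag, so a label renderer can decide on its own where to print [sic].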
The CoL has at present 2,588 names with a comment [sic]: https://www.catalogueoflife.org/data/search?facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=extinct&facet=environment&limit=50&offset=0&q=sic&sortBy=taxonomic
The Latin comment sic ("so", "thus") is widely used in zoology to indicate a misspelled name.
Quite often, species names containing [sic] are recognized in CLB as trinomials, and subspecies as quadrinomials (with parts of the name cut off accordingly). All of this creates problems.
It would be nice if CLB automatically removed the [sic] portion from names and gave them the name status "orthographic variant" or "misspelling".
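The trinomial problem and the proposed fix can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: `strip_sic`, `epithet_count` and `suggest_status` are hypothetical helpers, the token-counting rank check is deliberately naive and is not how CLB actually parses names, and the status string is taken from the proposal above.

```python
import re

# Assumed forms: "[sic]", "(sic)", "[sic!]"; real data may contain more variants.
SIC = re.compile(r"\s*[\[(]\s*sic\s*!?\s*[\])]", re.IGNORECASE)


def strip_sic(name: str):
    """Return (name without the sic marker, whether a marker was present)."""
    had_sic = bool(SIC.search(name))
    cleaned = re.sub(r"\s+", " ", SIC.sub(" ", name)).strip()
    return cleaned, had_sic


def epithet_count(name: str) -> int:
    """Naive rank check: every token after the genus counts as an epithet."""
    return len(name.split()) - 1


def suggest_status(name: str):
    """Strip the marker and propose a name status, as suggested in this thread."""
    cleaned, had_sic = strip_sic(name)
    return cleaned, ("misspelling" if had_sic else None)


raw = "Aus bus [sic]"  # hypothetical binomial with a sic marker
cleaned, status = suggest_status(raw)
print(epithet_count(raw), epithet_count(cleaned), status)
```

Counting tokens on the raw string treats "[sic]" as a second epithet, which is how a binomial ends up looking like a trinomial; stripping the marker first avoids that.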