CatalogueOfLife / xcol

Working towards the extended Catalogue of Life Checklist

bird given as plant #168

Open mdoering opened 2 months ago

mdoering commented 2 months ago

The source has "parent ID invalid" problems, and as a result the bird homonym Oenanthe ends up in plants:

https://www.checklistbank.org/dataset/53133/taxon/4636

It therefore also ends up in xcol: https://www.checklistbank.org/dataset/301711/taxon/DRBPY

mdoering commented 2 months ago

There are 9 merged Oenanthe plant names with a year given in the authorship, which are likely birds: https://www.checklistbank.org/dataset/301711/names?TAXON_ID=679Q&facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nomCode&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&facet=sectorMode&facet=secondarySourceGroup&facet=sectorDatasetKey&facet=group&field=basionym%20year&field=combination%20year&limit=50&offset=0

mdoering commented 2 months ago

These errors are pretty bad, much worse than duplicates!

mdoering commented 2 months ago

Across all plants there are thousands of names with a year given in the authorship. Some of them seem legitimate, especially those at higher ranks, and the algae-related ones also seem mostly correct, but we should investigate and search for error patterns:

https://www.checklistbank.org/dataset/301711/names?TAXON_ID=P&facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nomCode&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&facet=sectorMode&facet=secondarySourceGroup&facet=sectorDatasetKey&facet=group&field=combination%20year&field=basionym%20year&limit=50&offset=200&rank=species&rank=genus&rank=variety&rank=form&rank=subspecies&reverse=false&sortBy=taxonomic
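A first pass at that search could be scripted over an export of the plant names. The following is only a minimal sketch: the record layout (dicts with `scientificName`, `authorship` and `rank` keys) and the two sample rows are assumptions for illustration, not the ChecklistBank export format, and the rank list simply mirrors the rank filters in the query above.

```python
import re

# A standalone four-digit year between 1500 and 2099.
YEAR = re.compile(r"\b(?:1[5-9]\d{2}|20\d{2})\b")

# Ranks to inspect; taken from the rank filters in the search URL above.
CHECK_RANKS = frozenset({"species", "genus", "variety", "form", "subspecies"})


def has_year(authorship):
    """True if the authorship string contains a year-like token."""
    return bool(authorship) and YEAR.search(authorship) is not None


def flag_year_authorships(names, ranks=CHECK_RANKS):
    """Return the records at the given ranks whose authorship carries a year."""
    return [
        n for n in names
        if n.get("rank") in ranks and has_year(n.get("authorship"))
    ]


# Hypothetical sample rows, both assumed to sit under plants in the merge.
sample = [
    {"scientificName": "Oenanthe oenanthe", "authorship": "(Linnaeus, 1758)", "rank": "species"},
    {"scientificName": "Oenanthe aquatica", "authorship": "(L.) Poir.", "rank": "species"},
]

flagged = flag_year_authorships(sample)
for n in flagged:
    print(n["scientificName"], n["authorship"])
```

A year in the authorship is only a hint, not proof of a zoological name, so the flagged records would still need a manual look, for example grouped by genus or by source dataset.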

camiplata commented 2 months ago

I'll create one sector per kingdom to check whether the Belgian Species List gets a better merge.

mdoering commented 2 months ago

If the source gives a wrong classification, that won't help, I think. But I could try to code something like that: if both a target and a subject are given in a sector, use only those and ignore the source classification of the subject itself. But are we sure all Oenanthe names in this list are actually birds, or is it a mix of both?
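Roughly, the rule could look like the sketch below. This is not existing ChecklistBank code; `Sector`, `resolve_parent` and the example values are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sector:
    """Hypothetical, simplified sector: a source subject and a project target."""
    subject: Optional[str]  # root taxon of the sector in the source
    target: Optional[str]   # taxon in the project to attach the subject to


def resolve_parent(sector: Sector, source_parent: Optional[str]) -> Optional[str]:
    """Pick the parent under which the sector's subject gets attached.

    If both subject and target are given, trust the sector and ignore
    whatever parent the source declares for the subject. Otherwise fall
    back to the source classification.
    """
    if sector.subject is not None and sector.target is not None:
        return sector.target
    return source_parent


# The sector says "attach Aves under Animalia"; the source parent is ignored.
print(resolve_parent(Sector(subject="Aves", target="Animalia"), "Plantae"))
# No target given: the source classification is used as before.
print(resolve_parent(Sector(subject="Aves", target=None), "Plantae"))
```

Note that this only overrides the placement of the subject itself. Records below the subject that carry a broken parent link in the source would still need to be handled separately.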

camiplata commented 2 months ago

I'm doing a detailed review of the names merged from this source to determine whether it is worth keeping. I'll keep you posted.

mdoering commented 2 months ago

We should also see if we can fix the source, or at least tell them about the problem. I believe the problem comes from bad parent identifiers.

camiplata commented 2 months ago

The 6 names from the Belgian list are all birds. They are flagged with "parent ID invalid" ("The value for dwc:parentNameUsageID could not be resolved or is missing."), yet the ID does reference the correct bird genus within the list. That genus, however, appears as a bare name, probably because of a wrongly documented taxonomic status. Could the bare-name genus be the reason why the species get flagged with parent ID invalid?

Verbatim data from a bird name merged as a plant: https://www.checklistbank.org/dataset/53133/taxon/26532

[Screenshot, 2024-09-06, 12:31:29 p.m.]

Verbatim data from the genus referenced by dwc:parentNameUsageID

[Screenshot, 2024-09-06, 12:33:01 p.m.]
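To make the question concrete, here is a minimal sketch of how a bare-name parent could trigger the flag. It is not the ChecklistBank importer: it assumes that parent IDs are resolved only against records that became name usages, and that a record with an unrecognised `taxonomicStatus` is kept as a bare name. The status set, the IDs and the sample rows are all invented for illustration.

```python
# Statuses assumed (for this sketch only) to produce a name usage.
USAGE_STATUSES = frozenset({"accepted", "synonym"})


def find_invalid_parents(records):
    """Return taxonIDs whose parentNameUsageID does not resolve to a usage."""
    # Only records with a recognised status become usages; others are bare names.
    usage_ids = {
        r["taxonID"] for r in records
        if r.get("taxonomicStatus") in USAGE_STATUSES
    }
    return [
        r["taxonID"] for r in records
        if r.get("parentNameUsageID") and r["parentNameUsageID"] not in usage_ids
    ]


# Hypothetical rows: the genus exists, but its status is not recognised.
rows = [
    {"taxonID": "g1", "scientificName": "Oenanthe",
     "taxonomicStatus": "unknown", "parentNameUsageID": None},
    {"taxonID": "s1", "scientificName": "Oenanthe oenanthe",
     "taxonomicStatus": "accepted", "parentNameUsageID": "g1"},
]

print(find_invalid_parents(rows))
```

Under those assumptions the species is flagged even though its parent ID points at a row that exists in the file, which would match what the verbatim data shows.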

The 3 remaining names (synonyms) come from TAXREF and are plants. But there is at least one typo: Oenanthe longifoliata should be Oenanthe longifoliolata Schischk., but that's another story.