CatalogueOfLife / data

Repository for COL content

Bad ranks in World Plants #438

Open mdoering opened 2 years ago

mdoering commented 2 years ago

In https://github.com/CatalogueOfLife/checklistbank/issues/1045 it turns out we have some phyla in World Plants as synonyms which should probably be genera. We will need to either update the archives or reimport and sync the dataset once the importer code is updated (likely next bug-fixing Monday).


The verbatim data from WorldPlants does not indicate any rank. I assume the phylum ending triggered the false interpretation. @gdower @yroskov I assume there is no chance to derive a rank for WP synonyms? We only infer the rank for names when no rank is given, but it can lead to false results in some cases, as you can see here. As an alternative to looking at the name ending, I might restrict the rank inference to just UNRANKED, SPECIES or INFRASPECIFIC_NAME for uni/bi/trinomials.
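
To illustrate the restricted inference, here is a minimal sketch in Python. It is not the importer's actual code: the `Rank` enum, the function name `infer_rank_by_word_count`, and the placeholder names `Aus`, `Aus bus`, `Aus bus cus` are all hypothetical, and it assumes the name string carries no authorship or rank markers.

```python
from enum import Enum


class Rank(Enum):
    # Hypothetical stand-in for the three ranks mentioned above.
    UNRANKED = "unranked"
    SPECIES = "species"
    INFRASPECIFIC_NAME = "infraspecific name"


def infer_rank_by_word_count(scientific_name: str) -> Rank:
    """Infer a coarse rank from the number of name parts only.

    Assumes the input has no authorship and no rank markers.
    The name ending is deliberately ignored, so a uninomial that
    happens to end like a phylum stays UNRANKED.
    """
    parts = scientific_name.split()
    if len(parts) == 2:
        return Rank.SPECIES
    if len(parts) == 3:
        return Rank.INFRASPECIFIC_NAME
    # Uninomials (and anything unexpected) are left unranked.
    return Rank.UNRANKED


if __name__ == "__main__":
    for name in ("Aus", "Aus bus", "Aus bus cus"):
        print(name, "->", infer_rank_by_word_count(name).name)
```

With this approach a uninomial never gets a rank above species inferred from its suffix, at the cost of leaving genuine genera and higher taxa unranked when the source gives no rank.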

yroskov commented 2 years ago

@gdower, does anything depend on us in this case? (WP has no ranks in synonymy. Usually, the rank of the synonym is equal to that of the accepted taxon. The crawler may use this assumption. However, we would then generate information that is not present in the source. Should we do it, or should the software learn how to handle unranked synonyms in the right way?) @gdower, let's have a talk about it.

mdoering commented 2 years ago

I could teach the importer to give synonyms with no rank the same rank as their accepted name, yes. That's probably better than adding that in the sources.

The only thing left for you would then be to run a new import and sync.
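
A rough sketch of that fallback, in Python rather than the importer's own code. The `NameUsage` class, its fields, and `resolve_rank` are hypothetical names made up for illustration; the only assumption taken from the thread is that a synonym without a rank takes the rank of its accepted name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NameUsage:
    # Hypothetical, simplified record; not the real importer model.
    name: str
    rank: Optional[str] = None
    accepted: Optional["NameUsage"] = None  # set only for synonyms


def resolve_rank(usage: NameUsage) -> str:
    """Return the usage's own rank, else its accepted name's rank."""
    if usage.rank:
        # A rank given in the source always wins.
        return usage.rank
    if usage.accepted is not None and usage.accepted.rank:
        # Unranked synonym: inherit the rank of the accepted name.
        return usage.accepted.rank
    # Nothing to inherit from, so leave it unranked.
    return "unranked"


if __name__ == "__main__":
    accepted = NameUsage("Aus", rank="genus")
    synonym = NameUsage("Bus", accepted=accepted)
    print(synonym.name, "->", resolve_rank(synonym))
```

Doing this at import time keeps the source data untouched: the inherited rank exists only in the imported dataset, not in the WP archives.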

yroskov commented 2 years ago

@mdoering, that would probably be a better solution. But please wait until we discuss this problem with Geoff.