AtlasOfLivingAustralia / name-preprocessing

Name source preprocessing for the ALA taxonomic index

COL Preprocessing re-allocates Insect Genus Bacteria from Animalia to Bacteria Kingdom #14

Open Sherrin-ALA opened 1 year ago

Sherrin-ALA commented 1 year ago

Preprocessing of the Catalogue of Life (CoL) data re-allocates the insect genus Bacteria from the Animalia kingdom to the Bacteria kingdom.

Sherrin-ALA commented 1 year ago

The problem occurs in doc/transform.py. When the hierarchy is constructed recursively and there is a list of permitted kingdoms (which is the case for CoL), any entry whose name matches an entry in the permitted kingdoms list has that name inserted into the kingdom field, even if its taxonomic rank is genus.

Added a check to that branch of the if statement so that the name is only used as the kingdom when the entry's rank is kingdom or unranked. The unranked case is needed because, while we treat Viruses as a kingdom, it is unranked in CoL; without it, Viruses do not get their kingdom set properly.
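
For illustration, a minimal sketch of the rank check described above. This is not the actual code from doc/transform.py: the function name `resolve_kingdom`, its parameters, the `KINGDOM_RANKS` set, and the rank strings are all assumptions made for the example.

```python
# Hypothetical sketch; names and rank strings are assumed, not taken from transform.py.

# Ranks at which a matching name may be accepted as a kingdom.
# "unranked" is included because CoL lists Viruses without a rank.
KINGDOM_RANKS = {"kingdom", "unranked"}


def resolve_kingdom(name, rank, inherited_kingdom, permitted_kingdoms):
    """Return the kingdom for an entry while walking the hierarchy.

    The name is used as the kingdom only if it is in the permitted list
    AND the entry's rank is kingdom or unranked. Otherwise the kingdom
    inherited from the parent entry is kept.
    """
    if name in permitted_kingdoms and (rank or "").lower() in KINGDOM_RANKS:
        return name
    return inherited_kingdom


PERMITTED = {"Animalia", "Bacteria", "Viruses"}

# The insect genus Bacteria keeps the kingdom inherited from its parents.
genus_kingdom = resolve_kingdom("Bacteria", "genus", "Animalia", PERMITTED)

# A true kingdom entry and the unranked Viruses entry both set the kingdom.
bacteria_kingdom = resolve_kingdom("Bacteria", "kingdom", None, PERMITTED)
viruses_kingdom = resolve_kingdom("Viruses", "unranked", None, PERMITTED)
```

With only the name match and no rank condition, the first call would return "Bacteria" instead of "Animalia", which is the behaviour reported in this issue.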