CatalogueOfLife / data

Repository for COL content

Aves classification without intermediate ranks #207

Open mdoering opened 3 years ago

mdoering commented 3 years ago

I was asked to provide the COL Aves species in a taxonomic order, starting with Ostriches and ending with Passerines, e.g. as given by the IOC or Cornell lists: https://www.worldbirdnames.org/new/classification/orders-of-birds-draft/ https://www.birds.cornell.edu/clementschecklist/overview-august-2019/

COL can only provide such an order if we have the intermediate ranks that group sister taxa, such as Apterygiformes and Casuariiformes, into some shared parent taxon.

ITIS, which COL uses, only has the classic Linnean ranks for birds: https://www.itis.gov/servlet/SingleRpt/SingleRpt?search_topic=TSN&search_value=179628#null

We therefore cannot reproduce such a "taxonomic" order in any of the COL tools. Should we strive to add more intermediate bird ranks like subclass, superorder, suborder or superfamily?

yroskov commented 3 years ago

Forwarded to ITIS.

CoL cannot change the classification approved and supplied by ITIS. However, GBIF can try to do it as a separate project to serve a client.

mdoering commented 3 years ago

... it is not a GBIF client, these are COL users.

mdoering commented 3 years ago

The taxonomic ordering in ChecklistBank is purely based on the tree. If all bird orders are directly under the class Aves we can only sort them alphabetically. To sort them taxonomically we need grouping taxa in between Aves and its orders. Ultimately we need a phylogenetic tree, like the one IOC has, to underpin the taxonomic ordering:

[Figure: AVES tree]

If the intermediate taxa were of rank=CLADE or UNRANKED we could easily blend them out in the tree, but use them for sorting children in a taxonomic, i.e. phylogenetic sequence.
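To illustrate the idea of blending out intermediate taxa while still using them for grouping, here is a minimal Python sketch. It is not ChecklistBank code: the `Taxon` class, the `HIDDEN_RANKS` set, the alphabetical sibling sort and the placeholder node "Unnamed clade 1" are all assumptions made for this example, and the small tree is illustrative only.

```python
from dataclasses import dataclass, field
from typing import List

# Ranks that are hidden from the displayed tree (assumption for this sketch).
HIDDEN_RANKS = {"clade", "unranked"}


@dataclass
class Taxon:
    name: str
    rank: str
    children: List["Taxon"] = field(default_factory=list)


def visible_in_tree_order(taxon: Taxon) -> List[str]:
    """Depth-first walk of the tree.

    Taxa with a hidden rank are left out of the result, but their
    children still stay together because the walk follows the tree.
    Siblings are sorted alphabetically (an assumption of this sketch).
    """
    names = []
    if taxon.rank not in HIDDEN_RANKS:
        names.append(taxon.name)
    for child in sorted(taxon.children, key=lambda t: t.name):
        names.extend(visible_in_tree_order(child))
    return names


# Illustrative tree; "Unnamed clade 1" is a hypothetical placeholder.
aves = Taxon("Aves", "class", [
    Taxon("Palaeognathae", "unranked", [
        Taxon("Struthioniformes", "order"),
        Taxon("Rheiformes", "order"),
        Taxon("Unnamed clade 1", "clade", [
            Taxon("Casuariiformes", "order"),
            Taxon("Apterygiformes", "order"),
        ]),
    ]),
    Taxon("Neognathae", "unranked", [
        Taxon("Passeriformes", "order"),
    ]),
])

for name in visible_in_tree_order(aves):
    print(name)
```

In this sketch the sister orders Apterygiformes and Casuariiformes end up next to each other although their shared parent is never shown. The order of siblings under each parent is still alphabetical.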

mdoering commented 3 years ago

Moving discussion on how to taxonomically sort trees in the API to a new backend issue: https://github.com/CatalogueOfLife/backend/issues/906

mdoering commented 3 years ago

For ordering we will likely introduce a new sequenceIndex field that optionally can be given to define the taxonomic order of child taxa. See https://github.com/CatalogueOfLife/coldp/issues/42

DaveNicolson commented 3 years ago

Can you help me understand one aspect of this, please? Assuming intermediate ranks are added to COL via ITIS, how would this allow you to display, for example, Struthioniformes first and Rheiformes second? Of course R comes before S alphabetically, so even if both linked to parent as Paleognathae (which also comes alphabetically AFTER Neognathae), how do you order them for display?

DaveNicolson commented 3 years ago

I do see the sequenceIndex noted above, but if you used that then wouldn't you be able to order them taxonomically even without the intermediate ranks? Why are these issues (intermediate ranks vs. sort) not orthogonal?

mdoering commented 3 years ago

@DaveNicolson I had hoped to solve the ordering just by having the right amount of intermediate ranks to group taxa, so that an alphabetical ordering on the same rank level would have solved it. But this Aves case demonstrates that this is not enough and we need to have a way to define a full custom ordering by supplying a sequenceIndex number for children of the same parent.

To know that Palaeognathae should come before Neognathae requires the sequenceIndex. There seems to be lots of historic information involved in this ordering that cannot be explained just by phylogeny.
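A custom ordering of children by sequenceIndex could be sketched roughly as follows in Python. Only the field name comes from the proposal linked above. The `Child` class, the `sort_children` helper, the rule that children without an index sort alphabetically after the indexed ones, and the index values are assumptions for illustration, not the ColDP or ChecklistBank behaviour.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Child:
    name: str
    # Optional position among the children of the same parent.
    sequence_index: Optional[int] = None


def sort_children(children: List[Child]) -> List[Child]:
    """Sort children of one parent.

    Children with a sequence index come first, ordered by that index.
    Children without one follow in alphabetical order (an assumption
    of this sketch, not a documented rule).
    """
    return sorted(
        children,
        key=lambda c: (
            c.sequence_index is None,
            c.sequence_index if c.sequence_index is not None else 0,
            c.name,
        ),
    )


# With an index, Palaeognathae sorts before Neognathae.
groups = sort_children([
    Child("Neognathae", 2),
    Child("Palaeognathae", 1),
])
print([c.name for c in groups])

# The same helper works on a flat list of orders directly under Aves.
orders = sort_children([
    Child("Passeriformes"),
    Child("Rheiformes", 2),
    Child("Anseriformes"),
    Child("Struthioniformes", 1),
])
print([c.name for c in orders])
```

The second call shows that such an index does not depend on intermediate ranks being present: it orders whatever children a parent has.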

DaveNicolson commented 3 years ago

We actually have an element in ITIS that was intended to hold the values used for a phylogenetic sort, but nothing was ever implemented in the end. The (empty) element remains, though... taxonomic_units.phylo_sort_seq... The idea was to populate it when working on a group, but it seems like it is already enough work building the basic taxonomy, and the appetite to add the content & functionality never went anywhere, as there were always other fish needing frying.

mdoering commented 2 years ago

see also https://github.com/gbif/checklistbank/issues/108