dronefly-garden / dronefly

Red Discord Bot V3 cogs for naturalists.

taxon: include counts of immediate children #118

Closed synrg closed 3 months ago

synrg commented 4 years ago

This is an idea raised by Jared (blazeclaw) on iNat Discord and refined a bit as we talked about it on #bot-stuff:

Jared: would it be possible to show the number of children that inat recognizes for a taxa somewhere?
[image]
Jared: so for example this genus has three species in it
Jared: It would be nice to get an idea of that so you know roughly how many things you need in a group
Ben: how about listing the levels down to a certain floor, where the floor is the "next major division"
Ben: so, for Arachnida, "is a class with 1 subclass, 2 superorders, and 17 orders, that has 1049494 observations in:"
Jared: Maybe family and lower can show species?
Ben: there are some deep, deep trees from family to species
Jared: true. especially in beetles
Jared: so maybe just always show 3 levels down
Jared: say that inat has a family with subfamilies, tribes, genera, and species then you just ignore species

synrg commented 4 years ago

There's a practical consideration here that limits how much information we can show without delaying the display while we fetch more data in additional API calls. So if we stick to just what can be obtained from the initial call to show Arachnida, we have:

image

i.e. we know it has a subclass, and we know it has some orders, but we don't know the superorders within that subclass, nor the orders within those superorders. Finding out requires additional API calls to enumerate their children, at a cost of 1s delay per call due to rate-limiting: one call for the subclass and one for each of its 2 superorders, for a total of 3 more seconds to count all of the orders. I don't think we want to delay the initial display to get all of that info.

And this is just a simple case where the children we'd need to enumerate aren't too numerous. There are much worse cases where the immediate children are numerous and there are intermediate levels that would, in turn, need to be fetched to enumerate all of the grandchildren. It could take minutes to count them all up at a rate of one call per second. So I don't think we can reasonably implement my idea "is a class with 1 subclass, 2 superorders, and 17 orders". It looks easy, but it only works here because there aren't many children to enumerate down to the target "floor" rank.
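To make the cost argument concrete, here is a rough sketch that counts how many extra requests a walk down to a "floor" rank would need. It is not Dronefly code: the `Taxon` class, `count_to_floor`, the placeholder taxon names, and the 3/4 split of orders between the two superorders are all assumptions made up for illustration. It works on an in-memory tree and assumes one request per taxon whose children are not yet known.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Taxon:
    # Hypothetical in-memory stand-in for a taxon record.
    name: str
    rank: str
    children: list["Taxon"] = field(default_factory=list)


def count_to_floor(taxon, floor_rank):
    """Count descendants per rank down to floor_rank.

    Returns (counts, extra_calls), where extra_calls is the number of
    additional requests needed, assuming the root's immediate children
    are already known from the initial request.
    """
    counts = Counter()
    extra_calls = 0
    stack = list(taxon.children)
    while stack:
        node = stack.pop()
        counts[node.rank] += 1
        if node.rank != floor_rank:
            # One more request is needed to learn this node's children.
            extra_calls += 1
            stack.extend(node.children)
    return counts, extra_calls


def orders(prefix, n):
    # Placeholder names; only the totals follow the issue text.
    return [Taxon(f"{prefix}{i}", "order") for i in range(1, n + 1)]


# The 3/4 split between the two superorders is invented for illustration.
acari = Taxon("Acari", "subclass", [
    Taxon("SuperorderA", "superorder", orders("A", 3)),
    Taxon("SuperorderB", "superorder", orders("B", 4)),
])
arachnida = Taxon("Arachnida", "class", [acari] + orders("Direct", 10))

counts, extra_calls = count_to_floor(arachnida, "order")
print(dict(counts), extra_calls)
```

At one request per second, `extra_calls` is also the added delay in seconds, which is why the number grows quickly for taxa with many intermediate levels.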

synrg commented 4 years ago

New idea, using only what's in the "children" field returned from /v1/taxa, would be to say: "Arachnida is a class containing Subclass Acari, and 10 additional Orders", where it tells you the names of things if there are up to 3, but summarizes them with a number if there are more.
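A minimal sketch of that summary rule, assuming each entry in "children" is a dict with `rank` and `name` keys and that the list is already ordered from higher to lower rank. The function name `describe_children`, the `PLURALS` table, and the placeholder order names are hypothetical, not part of Dronefly or the iNaturalist API.

```python
# Hypothetical plural forms for a few ranks; others just get an "s".
PLURALS = {
    "subclass": "subclasses",
    "family": "families",
    "genus": "genera",
    "species": "species",
}


def plural(rank):
    return PLURALS.get(rank, rank + "s")


def describe_children(name, rank, children, max_named=3):
    """Name children of a rank when there are few, otherwise count them."""
    by_rank = {}
    for child in children:
        # dicts keep insertion order, so ranks stay in input order.
        by_rank.setdefault(child["rank"], []).append(child["name"])

    parts = []
    for child_rank, names in by_rank.items():
        if len(names) <= max_named:
            parts.extend(f"{child_rank.capitalize()} {n}" for n in names)
        else:
            qualifier = "additional " if parts else ""
            parts.append(
                f"{len(names)} {qualifier}{plural(child_rank).capitalize()}"
            )

    if not parts:
        return f"{name} is a {rank} with no listed children"
    if len(parts) == 1:
        listing = parts[0]
    else:
        listing = ", ".join(parts[:-1]) + ", and " + parts[-1]
    return f"{name} is a {rank} containing {listing}"


# Placeholder data: one named subclass plus ten orders with made-up names.
children = [{"rank": "subclass", "name": "Acari"}] + [
    {"rank": "order", "name": f"Order{i}"} for i in range(1, 11)
]
summary = describe_children("Arachnida", "class", children)
print(summary)
```

Because it only reads the children already returned with the taxon, this approach needs no extra requests.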

In addition to this, it has come to my attention that we say nothing about extant vs extinct children, and probably should! I'll file a separate issue about this.

Indeed, there seems to be a problem sometimes in the iNat data: the About page will describe a taxon as extinct (e.g. suborder Protocoleoptera), and if you browse down to the grandchildren you can see that they are marked extinct, but the levels of children in between are not, themselves, marked extinct. Therefore, it would not be practical right now to have the taxon display say that Order Coleoptera has immediate children of 4 extant suborders, 1 extinct suborder, and 1 extinct genus, at least not until the iNat data is revised to mark that suborder extinct. The reason is the cost of traversing the subtree to discover that it contains nothing but extinct nodes. (Traversal can stop at any node that is itself marked extinct and does not need to proceed to the next level, but that doesn't help much ... it's still too costly!) I have an outstanding question in #inat-questions on the iNat Discord about this, and will go seek help on the forums if it can't be answered there.

synrg commented 3 months ago

In main today at last: `,taxon list` (alias `,t list`) lists just the direct children and their counts. You can add `sort by obs` to sort the children so that those with the most observations come first.
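For readers curious what such a listing involves, here is a small sketch, not the actual Dronefly implementation. It assumes the counts are observation counts and that each child is a dict with `rank`, `name`, and `observations_count` keys. The function name `list_children` and all of the sample data are made up for illustration.

```python
def list_children(children, sort_by_obs=False):
    """Format one line per direct child, optionally most-observed first."""
    rows = list(children)
    if sort_by_obs:
        rows.sort(key=lambda c: c["observations_count"], reverse=True)
    return [
        f'{c["rank"].capitalize()} {c["name"]}: {c["observations_count"]}'
        for c in rows
    ]


# Invented sample data; names and counts are placeholders.
sample = [
    {"rank": "species", "name": "Aaa bbb", "observations_count": 12},
    {"rank": "species", "name": "Aaa ccc", "observations_count": 340},
    {"rank": "species", "name": "Aaa ddd", "observations_count": 57},
]
lines = list_children(sample, sort_by_obs=True)
print("\n".join(lines))
```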