dronefly-garden / dronefly

Red Discord Bot V3 cogs for naturalists.

tab: species comparison table #152

Open synrg opened 3 years ago

synrg commented 3 years ago

AntrozousAAA asked on iNat Discord if there could be a way to do comparisons. This idea has come up before, but was never written down. It would be something like this tool: https://www.inaturalist.org/observations/compare

It would have to be limited to the first 500 species per user to keep the number of API calls down to a minimum. See also #98 which is relevant to solving this problem. In that issue I propose that there could be a reduction in number of API calls if /v1/observations/observers supported leaf node taxa counts. However, for this issue we would not just be tallying up the counts, but also would need to iterate over all of the taxa per user in the comparison. That requires one /v1/observations/species_counts call per user. There's no way around that. Two things might help with this:

  1. Better caching control. Right now, the requests I mentioned in the paragraph above are uncached. That won't fly here; otherwise, rather large inconsistencies are going to creep into the table. Every time a user is added to the table, all of the users would have to be tallied up again to compute the new set intersections. The only way to avoid refetching everyone each time is to cache the results.
  2. Produce a display immediately and do the lengthy operation in the background, updating the table when results are finally available.
    • Anything requiring a lot of API calls like this cannot be done without significant delays. It will cause confusion if the bot doesn't give some immediate feedback of some kind that the operation has started, and that it is still working on the results.
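
A minimal sketch of the caching idea in point 1, assuming each user's species set is fetched by some function that wraps the /v1/observations/species_counts call. The `SpeciesCache` class, the `fetch` callable, and the 10-minute default lifetime are all hypothetical names and values chosen for illustration, not part of dronefly:

```python
import time
from typing import Callable, Dict, FrozenSet, Tuple

# Hypothetical fetcher: (user_id, taxon_id) -> set of species taxon IDs.
Fetcher = Callable[[int, int], FrozenSet[int]]


class SpeciesCache:
    """Keep each user's species set for a fixed time-to-live (assumed policy)."""

    def __init__(
        self,
        fetch: Fetcher,
        ttl_seconds: float = 600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Tuple[int, int], Tuple[float, FrozenSet[int]]] = {}

    def get(self, user_id: int, taxon_id: int) -> FrozenSet[int]:
        key = (user_id, taxon_id)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: no API call is made
        species = self._fetch(user_id, taxon_id)
        self._entries[key] = (now, species)
        return species
```

With something like this, adding a user to the table costs one new fetch, and the set intersections for the users already in the table are recomputed from the same cached sets they were first tallied with.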

synrg commented 3 years ago

One way to split immediate results from the background operation: when a user presses :hash:, a single API call fetches their observation & species counts. At that time, everyone's "in common" / "not in common" / "unique" counts would be set to "?" in the table until the new results are available, and a message such as "Comparison in progress..." would be shown somewhere in the display. A job would then be started that fetches the species for every user in the table and does the set operations on all of the results. When all of the results have been received and tallied up, the "?" numbers in the table would be filled in.
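
The flow described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `fetch_species` and `show` are hypothetical async callables standing in for the API call and the message edit, and the definitions of same / diff / uniq in `tally` are one plausible reading of "in common" / "not in common" / "unique", not a settled spec:

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional, Set, Tuple

Stats = Optional[Tuple[int, int, int]]  # (same, diff, uniq); None renders as "?"


def tally(species_by_user: Dict[str, Set[int]]) -> Dict[str, Tuple[int, int, int]]:
    """Assumed definitions: same = shared with any other user,
    diff = seen by others but not by this user, uniq = seen only by this user."""
    result = {}
    for user, own in species_by_user.items():
        others: Set[int] = set().union(
            *(s for u, s in species_by_user.items() if u != user)
        )
        result[user] = (len(own & others), len(others - own), len(own - others))
    return result


def render(users: List[str], stats: Dict[str, Stats]) -> List[str]:
    lines = []
    for user in users:
        cells = stats.get(user) or ("?", "?", "?")
        lines.append(f"{user}: {cells[0]} / {cells[1]} / {cells[2]}")
    return lines


async def compare(
    users: List[str],
    fetch_species: Callable[[str], Awaitable[Set[int]]],  # hypothetical API wrapper
    show: Callable[[List[str]], Awaitable[None]],  # hypothetical display update
) -> None:
    # Immediate feedback: every stat is "?" until the job finishes.
    await show(["Comparison in progress..."] + render(users, {}))
    # One species fetch per user, run concurrently.
    results = await asyncio.gather(*(fetch_species(u) for u in users))
    # Fill in the "?" cells once everything has been tallied.
    await show(render(users, tally(dict(zip(users, results)))))
```

In a bot, `compare` would presumably be started with `asyncio.create_task` from the reaction handler so the handler itself returns right away.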

synrg commented 3 years ago

Mockup of the comparison stats summary, updated to take into account points from my later comments.

image

To regenerate this mockup using the EmbedUtils -embed command:

-embed fromdata

{"color": 9498256,
 "description": "**Species comparison:** *computing ...*\n__obs# (spp#) by user: same / diff / uniq:__\n[22 (7)](https://www.inaturalist.org/observations?verifiable=any&taxon_id=20979&user_id=545640) benarmstrong: ? / ? / ?\n[11 (3)](https://www.inaturalist.org/observations?verifiable=any&taxon_id=20979&user_id=1276353) michaelpirrello: ? / ? / ?\n[33 (9)](https://www.inaturalist.org/observations?verifiable=any&taxon_id=20979&user_id=benarmstrong%2Cmichaelpirrello) *total*",
 "title": "Observations of Order Anura (Frogs)",
 "type": "rich",
 "url": "https://www.inaturalist.org/observations?verifiable=any&taxon_id=20979"}

The same / diff / uniq stats could each be a link to search observations in the set on the web, with &view=species. Later, we could add navigation reaction buttons (a la ,search nav buttons) to support:

Press ✅ to display: same for selected user (other options: total, different, unique)

Then if the user pressed ✅, a new display would be generated, showing a paginated display of observation counts of matching species, selecting only the species that are same, total, different, or unique for the selected user in the table vs. all others by default. This comment goes into details about that display: https://github.com/synrg/dronefly/issues/152#issuecomment-852048257
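
For illustration, the search links used in the mockup (and the `&view=species` variant mentioned above) could be generated with a small helper like the one below. `observations_url` is a hypothetical name; the query parameters are the ones that already appear in the mockup URLs:

```python
from typing import Dict, Iterable, Optional
from urllib.parse import urlencode

BASE_URL = "https://www.inaturalist.org/observations"


def observations_url(
    taxon_id: int,
    user_ids: Iterable[object] = (),
    view: Optional[str] = None,
) -> str:
    """Build a web search link like the ones in the mockup (hypothetical helper)."""
    params: Dict[str, object] = {"verifiable": "any", "taxon_id": taxon_id}
    ids = ",".join(str(u) for u in user_ids)
    if ids:
        params["user_id"] = ids  # urlencode turns the comma into %2C
    if view:
        params["view"] = view  # e.g. "species"
    return f"{BASE_URL}?{urlencode(params)}"


total_link = observations_url(20979, [545640, 1276353])
species_link = observations_url(20979, [545640], view="species")
```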

synrg commented 3 years ago

Observation count is not relevant for the initial species comparison display, so it should be dropped. Each link added to the display pushes closer to the 2048-character maximum for the Discord embed description field. A link can get rather long as more search criteria are added (e.g. quality_grade=research), which limits how many users can be shown on the page. Fixed-width columns can't contain links, so it's better to have only a single link per user in the table. Varying-length usernames cause "jitter" in the alignment of figures in the table, so they go last, and the link should be on the username. Taking all of that into consideration, it might be better to change the table layout to:

image

-embed fromdata

{"color": 9498256,
 "description": "**Species comparison:** *computing ...*\n__`tot same diff uniq` spp# per user__\n`  7    ?    ?    ?` [benarmstrong](https://www.inaturalist.org/observations?verifiable=any&taxon_id=20979&user_id=545640)\n`  3    ?    ?    ?` [michaelpirrello](https://www.inaturalist.org/observations?verifiable=any&taxon_id=20979&user_id=1276353)\n`  9               ` [total](https://www.inaturalist.org/observations?verifiable=any&taxon_id=20979&user_id=545640%2C1276353)",
 "title": "Observations of Order Anura (Frogs)",
 "type": "rich",
 "url": "https://www.inaturalist.org/observations?verifiable=any&taxon_id=20979"}

I.e., by default the username link would search for all of that user's observations. As the user cycles through the now four options (tot, same, diff, uniq) with the 🔢 reaction, the link would be updated to search for only the taxon IDs in the selected column, and the details shown when ✅ is pressed for that user would change to match.
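
A rough sketch of how the rows in this layout could be produced while staying under the description limit. The function names are hypothetical, the column widths are taken from the mockup header (`tot same diff uniq`), and the 2048 figure is the limit cited earlier in this thread:

```python
from typing import Iterable, List, Optional, Tuple

DESCRIPTION_LIMIT = 2048  # embed description maximum cited in this thread
WIDTHS = (3, 4, 4, 4)  # widths of "tot", "same", "diff", "uniq"

# (tot, same, diff, uniq, username, link); None renders as "?"
Row = Tuple[int, Optional[int], Optional[int], Optional[int], str, str]


def format_row(tot, same, diff, uniq, name: str, url: str) -> str:
    cells = (tot, same, diff, uniq)
    figures = " ".join(
        ("?" if c is None else str(c)).rjust(w) for c, w in zip(cells, WIDTHS)
    )
    # Fixed-width figures first, then the single link on the username.
    return f"`{figures}` [{name}]({url})"


def build_description(
    rows: Iterable[Row], limit: int = DESCRIPTION_LIMIT
) -> Tuple[str, int]:
    """Return the description and how many user rows fit within the limit."""
    lines: List[str] = [
        "**Species comparison:**",
        "__`tot same diff uniq` spp# per user__",
    ]
    shown = 0
    for row in rows:
        candidate = format_row(*row)
        if len("\n".join(lines + [candidate])) > limit:
            break  # remaining users would need another page
        lines.append(candidate)
        shown += 1
    return "\n".join(lines), shown
```

Because every row carries a full search URL, the number of users that fit on one page depends mostly on link length, which is the reason for keeping one link per row.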

synrg commented 3 years ago

Mockup of per species detail when ✅ is pressed with default 🔢 tot and first user benarmstrong selected:

image

-embed fromdata

{"color": 9498256,
 "description": "**Obs. comparison per species vs. michaelpirrello:** \n__`own vs. others` **tot** obs# per species__\n`  8          0` [Anaxyrus americanus](https://www.inaturalist.org/observations?verifiable=any&taxon_id=64968&user_id=545640)\n`  0          2` [Anaxyrus fowleri](https://www.inaturalist.org/observations?verifiable=any&taxon_id=64977&user_id=1276353)\n`  8          2` [total](https://www.inaturalist.org/observations?verifiable=any&taxon_id=20979&user_id=545640%2C1276353)",
 "title": "Species of Order Anura (Frogs) by benarmstrong",
 "type": "rich",
 "url": "https://www.inaturalist.org/observations?verifiable=any&taxon_id=20979&user_id=545640"}

synrg commented 3 years ago

The resulting "Obs. comparison per species vs. ..." display in the mockup in the previous comment has two columns, one for the first user, and the other for "others" (since more than one user can be in the table). That raises a point about how you select which details to see. It would either be pairwise or one person vs. everyone else, with reaction buttons to select who is in the detailed comparison. If it's pairwise, then the label of the "others" column should change to "other" for clarity. If the table is for only two users, then details would automatically be pairwise.

It looks like there is enough width to do more than two users in the mockup, but that's deceptive. The species names are short in this display, and could be longer. Also, on mobile, there's less available width than you think before table rows start to wrap. Therefore, for simplicity's sake, I think limiting it to two numbers is best, either one user vs. another or one user vs. everyone else (if more than two users in the per-user table).

The obs. comparison per species display should have reaction buttons to cycle through the users and change which stat is shown. The title of that display could be changed for three users to "... by benarmstrong and 2 others", or if only two, "... by benarmstrong and michaelpirrello". The title URL would then change from all of benarmstrong's observations of Anura to all of everyone's observations of Anura.
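
The title rule above is simple enough to sketch directly. `detail_title` is a hypothetical helper, and the "Species of ..." prefix is copied from the mockup title:

```python
from typing import Sequence


def detail_title(taxon_name: str, users: Sequence[str]) -> str:
    """Title for the per-species display; the first user is the selected one."""
    if not users:
        raise ValueError("at least one user is required")
    first, rest = users[0], users[1:]
    if not rest:
        who = first
    elif len(rest) == 1:
        who = f"{first} and {rest[0]}"
    else:
        who = f"{first} and {len(rest)} others"
    return f"Species of {taxon_name} by {who}"
```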

To give complete control over which user is selected, a :one: button could cycle through the available users for own and a :two: button could cycle through the available users for other, with an extra "all others" option that compares own against everyone else.
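
A minimal sketch of the two-column detail rows and the :two: cycling, assuming per-user observation counts keyed by species name are already available. `detail_rows` and `next_other` are hypothetical names, and `None` is used here to stand for the "all others" option:

```python
from typing import Dict, List, Optional, Tuple

Counts = Dict[str, Dict[str, int]]  # user -> {species name -> observation count}


def detail_rows(
    counts: Counts, own: str, other: Optional[str] = None
) -> List[Tuple[str, int, int]]:
    """Rows of (species, own obs#, other obs#); other=None means all others."""
    compared = [u for u in counts if u != own] if other is None else [other]
    species = sorted(set(counts[own]).union(*(counts[u] for u in compared)))
    return [
        (
            name,
            counts[own].get(name, 0),
            sum(counts[u].get(name, 0) for u in compared),
        )
        for name in species
    ]


def next_other(users: List[str], own: str, current: Optional[str]) -> Optional[str]:
    """Advance the 'other' selection; None ("all others") is the last option."""
    options: List[Optional[str]] = [u for u in users if u != own] + [None]
    return options[(options.index(current) + 1) % len(options)]


counts = {
    "benarmstrong": {"Anaxyrus americanus": 8},
    "michaelpirrello": {"Anaxyrus fowleri": 2},
}
rows = detail_rows(counts, "benarmstrong", "michaelpirrello")
```

With the data from the earlier mockup, `rows` holds one row per species with benarmstrong's count first and michaelpirrello's second, matching the `own vs. others` columns.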

Then the 🔢 can appear here again for selecting among the different sets of stats: all, same, different, unique. The counts in the tables would always be the same regardless of which of these was selected: