dronefly-garden / dronefly

Red Discord Bot V3 cogs for naturalists.

reactions: ensure API calls kept to minimum necessary for counts #98

Open synrg opened 4 years ago

synrg commented 4 years ago

See the comment on #84 about two calls happening where one would suffice when querying for user stats. If that's still a problem, fix it; otherwise close this issue.

synrg commented 3 years ago

OK, I think the issues we saw in the past in #84 had to do with mixing lookup-by-user_id with lookup-by-login_id. This is a consequence of the way we parse the message contents to dig out the users that are in the table. So that half of the problem is already solved. See the forum discussion here: https://forum.inaturalist.org/t/observation-counts-for-two-users-combined-less-than-each-individually/13552

Since we're going to move away from the approach that causes the user id vs. login id inconsistency in #141, that leaves only the valid observation that we make too many API calls. But that's fairly easy to solve. Referring back to the API doc, I found that we could refetch all the data for all users, plus the total observation and species counts, in only three calls, no matter how many users are in the table, up to the per-page maximum for /v1/observations/observers, which is 500 (far more than is practical to browse in Discord).

e.g. consider the following display, and the three API calls beneath it:

image

https://api.inaturalist.org/v1/observations/observers?user_id=545640,1276353&taxon_id=47208
https://api.inaturalist.org/v1/observations?user_id=545640,1276353&taxon_id=47208&per_page=0
https://api.inaturalist.org/v1/observations/species_counts?user_id=545640,1276353&taxon_id=47208&per_page=0

The first call gives us one record per observer listed in the user_id parameter, and each record contains a tally of observations and species (actually leaf-node taxa, though you could constrain the display to hrank=species to get that). The second and third calls tally up the total number of observations and the total number of species across all users.
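A minimal sketch of how the three calls could be issued from Python, assuming nothing beyond the standard library. The helper names (build_count_urls, fetch_json, fetch_counts) are hypothetical and not part of dronefly; the endpoints and parameters are the ones shown in the URLs above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.inaturalist.org/v1"


def build_count_urls(user_ids, taxon_id):
    """Return the three URLs: per-observer tallies, total observations, total species."""
    base = {"user_id": ",".join(str(u) for u in user_ids), "taxon_id": taxon_id}
    # per_page=0 skips the result rows; only total_results is needed for the totals.
    totals = dict(base, per_page=0)
    return [
        f"{API_BASE}/observations/observers?{urlencode(base, safe=',')}",
        f"{API_BASE}/observations?{urlencode(totals, safe=',')}",
        f"{API_BASE}/observations/species_counts?{urlencode(totals, safe=',')}",
    ]


def fetch_json(url):
    """Fetch a URL and decode the JSON body (hypothetical helper, no retries or rate limiting)."""
    with urlopen(url) as response:
        return json.load(response)


def fetch_counts(user_ids, taxon_id, fetch=fetch_json):
    """Make exactly three calls, however many users are in the table."""
    observers, observations, species = (
        fetch(url) for url in build_count_urls(user_ids, taxon_id)
    )
    per_user = {
        record["user"]["login"]: (record["observation_count"], record["species_count"])
        for record in observers["results"]
    }
    return per_user, observations["total_results"], species["total_results"]
```

Passing the fetch function as a parameter is only a convenience so the call count can be checked with a stub; a real cog would more likely use its existing async API client.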

The observation_count & species_count, per observer, from the first call:

{
    "total_results": 2,
    "page": 1,
    "per_page": 500,
    "results": [
        {
            "user_id": 545640,
            "observation_count": 288,
            "species_count": 80,
            "user": {
                "id": 545640,
                "login": "benarmstrong"
            }
        },
        {
            "user_id": 1276353,
            "observation_count": 93,
            "species_count": 30,
            "user": {
                "id": 1276353,
                "login": "michaelpirrello"
            }
        }
    ]
}

And from the second & third, respectively, the observations and species counts:

{
    "total_results": 381,
    "page": 1,
    "per_page": 0,
    "results": []
}
{
    "total_results": 162,
    "page": 1,
    "per_page": 0,
    "results": []
}

synrg commented 3 years ago

I didn't notice the discrepancy until after I pasted the output. /v1/observations/observers outputs the actual species_count, not leaf-node taxa! That's going to be a problem. So we may need an API feature request after all, if this can't be worked around.
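One way to see the discrepancy from the numbers pasted above, as a rough sketch: it assumes each observation belongs to exactly one observer, so the per-observer observation counts should add up to the total, and that a combined species total can never exceed the sum of the per-observer species counts if both are counted the same way. The function name is hypothetical.

```python
# Figures copied from the three responses above.
per_observer = [
    {"login": "benarmstrong", "observation_count": 288, "species_count": 80},
    {"login": "michaelpirrello", "observation_count": 93, "species_count": 30},
]
total_observations = 381  # total_results from /v1/observations
total_taxa = 162          # total_results from /v1/observations/species_counts


def check_counts(per_observer, total_observations, total_taxa):
    """Return (observations_ok, species_ok) for the pasted figures."""
    observation_sum = sum(r["observation_count"] for r in per_observer)
    species_sum = sum(r["species_count"] for r in per_observer)
    # Observations are disjoint between observers, so the sum should match exactly.
    observations_ok = observation_sum == total_observations
    # Species can overlap between observers, so the combined total may be smaller
    # than the sum, but it cannot be larger if both use the same counting method.
    species_ok = total_taxa <= species_sum
    return observations_ok, species_ok


print(check_counts(per_observer, total_observations, total_taxa))  # (True, False)
```

The observation counts agree (288 + 93 = 381), but the combined total of 162 is larger than 80 + 30 = 110, which is consistent with the two endpoints counting different things.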