Open johnvanbreda opened 3 years ago
@DavidRoy do you have an example project (e.g. an iRecord activity, or an app) that you'd like to try this against? I suggest as a starting point I just write some example queries and Elasticsearch requests so we can "play" with the outputs before formalising it into a module.
@johnvanbreda Survey 101 please as a test of this
@johnvanbreda it would also be good to enable this to work at the level of an individual's recording for a recording scheme (or taxon_group), e.g. John's metrics for recording butterflies, ladybirds, plants....
Thanks @DavidRoy. Is the intention to be able to show a complete dataset of all recorders for a project (like a league table), or is the intention just for each recorder to be able to view their own statistics?
@johnvanbreda Good question. This is mostly about the individual's metrics, but the logical extension is to see how an individual compares with everyone else. So both please. Could it be tackled in stages, with individual metrics done first, then working out how to process all recorders? That may need some cache tables.
@DavidRoy I've managed to write code which generates metrics for a list of individuals on the fly from Elasticsearch. Calculating for individual users, or a short list of users, is very fast as long as we pre-calculate and cache the species list and associated record counts for the project (in this case survey 101). We may of course need to pre-calculate the results if comparing across all the users of the app, for example.
The only metric I've not tackled is active area size. We could try to use R to do these calculations but that will require an offline dataset for R to work against. If the calculation method can be easily described I could look at the possibility of calculating from within PostGIS but I suspect it will be slow.
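As a rough illustration of the "on the fly" approach for a short list of users, here is a minimal sketch of an Elasticsearch request body that counts records and distinct species per recorder within one survey. The field names (`metadata.survey.id`, `metadata.created_by_id`, `taxon.accepted_taxon_id`) and the helper `build_user_metrics_query` are assumptions for illustration, not necessarily what the actual code uses.

```python
def build_user_metrics_query(survey_id, user_ids):
    """Build an Elasticsearch request body (as a dict) giving a record count
    and a distinct species count for each listed user within one survey.

    All field names below are assumed for illustration only.
    """
    return {
        "size": 0,  # only aggregations are needed, not the matching documents
        "query": {
            "bool": {
                "filter": [
                    {"term": {"metadata.survey.id": survey_id}},
                    {"terms": {"metadata.created_by_id": list(user_ids)}},
                ]
            }
        },
        "aggs": {
            "by_user": {
                # one bucket per recorder; doc_count is that user's record count
                "terms": {
                    "field": "metadata.created_by_id",
                    "size": len(user_ids),
                },
                "aggs": {
                    # approximate distinct species count per recorder
                    "species_count": {
                        "cardinality": {"field": "taxon.accepted_taxon_id"}
                    }
                },
            }
        },
    }


query_body = build_user_metrics_query(survey_id=101, user_ids=[1, 2, 3])
```

Filtering to a short user list keeps the number of buckets small, which fits the observation that the calculation is fast for a few users but may need pre-calculating for a full league table.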
Given the above, what would be a suitable output for this project? I.e. how would the user access this information and from where?
Thanks @johnvanbreda. The initial use case is a richer summary report within the iRecord butterflies app. e.g. an extension of https://github.com/NERC-CEH/irecord-butterflies-app/issues/18
@DavidRoy @kazlauskis I've now added a new end-point to the iRecord Indicia API at /api/v1/advanced_reports/user-stats. The advanced-reports path is intended to group together reports that have custom processing included, i.e. not just an Elasticsearch or PostgreSQL request. The user-stats end-point specifically returns recorder metrics information designed to replace the existing butterfly app user metrics (https://github.com/NERC-CEH/irecord-butterflies-app/issues/18) as well as include the metrics described here.
The end-point should be accessible to the app in the same way that it uses the indicia_api module (with the same authentication). You can pass a survey_id or group_id GET parameter to limit it to a survey or group, but note this is designed for surveys with a limited set of species rather than a general recording survey (due to the need to calculate rarity data across the entire dataset). E.g. https://www.brc.ac.uk/irecord/api/v1/advanced_reports/user-stats?survey_id=101 which gives the following response:
{
"myTotalRecords":2529,
"projectRecordsCount":510584,
"projectSpeciesCount":111,
"myProjectRecords":2475,
"myProjectSpecies":52,
"myProjectRecordsThisYear":3,
"myProjectSpeciesThisYear":3,
"myProjectSpeciesRatio":46.8,
"myProjectActivityRatio":38.9,
"myProjectRarityMetric":0
}
In this context, "project" means the filter you applied via the survey_id or group_id parameter. "myTotal" means the user's records within the entire set of reporting data for iRecord.
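For anyone consuming the end-point, here is a minimal sketch of building the request URL and reading the response shown above. Authentication is deliberately left out because it follows whatever the indicia_api module already uses. The helper names `build_user_stats_url` and `species_ratio` are hypothetical, and the reading of `myProjectSpeciesRatio` as the user's species count as a percentage of the project species count is an assumption that happens to match the sample values, not a documented definition.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://www.brc.ac.uk/irecord/api/v1/advanced_reports/user-stats"


def build_user_stats_url(survey_id=None, group_id=None):
    """Build the user-stats URL with an optional survey_id or group_id filter."""
    params = {}
    if survey_id is not None:
        params["survey_id"] = survey_id
    if group_id is not None:
        params["group_id"] = group_id
    query = urlencode(params)
    return f"{BASE_URL}?{query}" if query else BASE_URL


# The sample response from this thread, used in place of a live request.
SAMPLE_RESPONSE = """
{
  "myTotalRecords": 2529,
  "projectRecordsCount": 510584,
  "projectSpeciesCount": 111,
  "myProjectRecords": 2475,
  "myProjectSpecies": 52,
  "myProjectRecordsThisYear": 3,
  "myProjectSpeciesThisYear": 3,
  "myProjectSpeciesRatio": 46.8,
  "myProjectActivityRatio": 38.9,
  "myProjectRarityMetric": 0
}
"""


def species_ratio(stats):
    """Assumed definition: user's species as a percentage of project species."""
    return round(100 * stats["myProjectSpecies"] / stats["projectSpeciesCount"], 1)


url = build_user_stats_url(survey_id=101)
stats = json.loads(SAMPLE_RESPONSE)
print(url)
print(stats["myProjectSpecies"], "of", stats["projectSpeciesCount"], "species")
```

With the sample data, the assumed ratio agrees with the reported `myProjectSpeciesRatio`. How `myProjectActivityRatio` and `myProjectRarityMetric` are derived is not stated here, so the sketch does not attempt them.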
@BirenRathod presumably this new code should be added to the Drupal 8 version of the module as well?
@johnvanbreda yes, we need it on Drupal 8/9 too.
Ok, @BirenRathod we'll need to do this when we get back onto the Drupal 9 migration task.