aodn / nrmn-application

A web application for collation, validation, and storage of all data obtained during surveys conducted by the NRMN
GNU General Public License v3.0

Species_id mapping in species_list view #1293

Closed bpasquer closed 1 year ago

bpasquer commented 1 year ago

Originally posted by @atcooper1 in https://github.com/aodn/nrmn-application/issues/1280#issuecomment-1603482006

I think there needs to be some automated scraping of WoRMS to maintain consistency, but ultimately I'm not sure the views should be based on inclusion rules. Shouldn't they use exclusion rules instead (to filter out unnecessary species)?

In addition to the 2 issues listed above, there also seems to be another problem with the superseding of species and the public species list views, which is currently affecting all our online taxonomic database users (GBIF, OBIS, ALA, RSoW). I'll attempt to convey the issue below, but it is a confusing one to write out, so apologies if you have to read it 10 times and it still doesn't make sense!

ep_species_list_data includes superseded_ids and superseded_names, indicating which IDs and names were superseded by the species, but because species IDs were changed by AODN there is now ambiguity with 'mapped_id'. These data users are trying to use superseded_id to map the species list correctly, but this refers to species_id (new NRMN system), not mapped_id (old IMAS system).

For example, Morwong fuscus has superseded_id=171, which appears to be a new NRMN ID (aka species_id). However, as a mapped_id, 171 is Notolabrus gymnogenis. So there is an issue for these online databases: superseded_ids can't be used for updates because the data for the IDs mapped from the old IMAS species list to those superseded_ids isn't carried across...
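To make the ambiguity concrete, here is a minimal Python sketch. The two rows are hypothetical stand-ins for ep_species_list_data: only the value 171 and the two species names come from the example above, while the IDs 9001, 9002 and 9003 and the helper `resolve` are invented for illustration.

```python
# Hypothetical rows standing in for ep_species_list_data.
# Only the 171 values and the species names come from the issue text;
# 9001, 9002 and 9003 are placeholder IDs invented for illustration.
ROWS = [
    {"species_id": 9001, "mapped_id": 9002,
     "species_name": "Morwong fuscus", "superseded_ids": [171]},
    {"species_id": 9003, "mapped_id": 171,
     "species_name": "Notolabrus gymnogenis", "superseded_ids": []},
]


def resolve(rows, superseded_id, id_column):
    """Return the species name whose id_column equals superseded_id, or None."""
    for row in rows:
        if row[id_column] == superseded_id:
            return row["species_name"]
    return None


# Reading 171 as an old IMAS ID (mapped_id) lands on an unrelated species.
as_mapped = resolve(ROWS, 171, "mapped_id")
# Reading 171 as a new NRMN ID (species_id) finds no row in this toy list.
as_species = resolve(ROWS, 171, "species_id")

print(as_mapped)   # Notolabrus gymnogenis
print(as_species)  # None
```

Under these assumptions, the same superseded_id resolves to a different answer depending on which ID column the consumer joins on, which is the situation the external databases appear to be in.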

Response from Bene: True, the IDs in the endpoint don't relate to the old IMAS system, which can make them difficult for external users to work with. To rectify that, either the IDs in public endpoints should be mapped to the old IMAS system, or a new mapped variable should be added to the endpoints (which could be confusing for the general public).

Toni: The problem is they can't be mapped back to the old IMAS system. Morwong fuscus has a superseded_id of 171, but species_id 171 returns a null value in ep_species_list, so users end up in a null loop. Currently the public endpoints and private endpoints are displaying very different data (due to the three errors identified above). Do the NRMN layers need to be hidden until these can be resolved? And does that leave us with the mapped variable option?
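One way to gauge how many species are caught in this null loop would be a simple consistency check over an export of the species list. The sketch below is an assumption-laden illustration, not the application's code: the row layout, the placeholder IDs 9001 and 9002, the name "Placeholder species" and the function `dangling_superseded_ids` are all hypothetical. Only Morwong fuscus and superseded_id 171 come from the thread.

```python
def dangling_superseded_ids(rows):
    """Map each species name to its superseded_ids that match no species_id."""
    known_ids = {row["species_id"] for row in rows}
    report = {}
    for row in rows:
        missing = [sid for sid in row["superseded_ids"] if sid not in known_ids]
        if missing:
            report[row["species_name"]] = missing
    return report


# Hypothetical export of ep_species_list; 9001 and 9002 are placeholder IDs.
rows = [
    {"species_id": 9001, "species_name": "Morwong fuscus",
     "superseded_ids": [171]},
    # Placeholder row whose superseded_id does resolve, for contrast.
    {"species_id": 9002, "species_name": "Placeholder species",
     "superseded_ids": [9001]},
]

report = dangling_superseded_ids(rows)
print(report)  # {'Morwong fuscus': [171]}
```

Any species that appears in the report has at least one superseded_id that an external user cannot follow to a row in the public list.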