inbo / riparias-prep

Preparatory scripts and data management for the RIPARIAS workflow
MIT License

Climate matching issue due to taxonomic synonyms? #10

Closed adrienlatli closed 3 years ago

adrienlatli commented 3 years ago

@SanderDevisscher, @timadriaens and @RomainWilleput

Hello. Romain and I found a potential problem with the filters applied to GBIF occurrences for climate matching. For several species, there is a large difference between the number of observations in GBIF and the number of observations taken into account in the climate matching analysis. For example, the species Faxonius neglectus has only 5 viable observations in our climate matching table (https://github.com/inbo/riparias-prep/blob/a1f20d61681c3941b1858b38b2b29c4339571b92/data/output/data_overlay_future_10km.csv), but in GBIF we found more than 400 observations (https://www.gbif.org/species/8848320), many of which are current human observations.

Maybe the CM analysis used the GBIF code of Orconectes neglectus, a synonym of Faxonius neglectus, which has a fairly low number of human observations.

Do GBIF codes allow species to be linked through their synonyms? Or is the error coming from somewhere else?

SanderDevisscher commented 3 years ago

@adrienlatli I looked at the white list (the list I used to download the GBIF occurrences) and only Orconectes neglectus is listed.

Could you look at the list and change the taxon keys of the synonyms (Orconectes neglectus ~ 2227071) to those of the accepted species (Faxonius neglectus ~ 8848320)? I think the synonyms will then be included in the download. Adding distinct(gbif_id, .keep_all = TRUE) to the subset should remove any duplicates.
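The change suggested above can be sketched as follows. This is only an illustration in Python, not the repository's R script: the function names, the `gbif_id` field default and the mapping dictionary are hypothetical, and only the two taxon keys come from this thread.

```python
# Hypothetical sketch: only the two taxon keys below are taken from this issue.
SYNONYM_TO_ACCEPTED = {2227071: 8848320}  # Orconectes neglectus -> Faxonius neglectus


def remap_taxon_keys(taxon_keys, synonym_to_accepted):
    """Replace synonym keys with accepted keys, keeping first-seen order without repeats."""
    seen = set()
    remapped = []
    for key in taxon_keys:
        accepted = synonym_to_accepted.get(key, key)  # unknown keys pass through unchanged
        if accepted not in seen:
            seen.add(accepted)
            remapped.append(accepted)
    return remapped


def drop_duplicate_occurrences(occurrences, id_field="gbif_id"):
    """Keep the first record per occurrence id, with all of its other fields."""
    seen = set()
    unique = []
    for record in occurrences:
        if record[id_field] not in seen:
            seen.add(record[id_field])
            unique.append(record)
    return unique
```

The second function plays the same role as the `distinct()` call mentioned above: if an occurrence is returned under both the synonym and the accepted key, only its first record is kept.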

Steps:

Alternatively, you can make these changes outside the GitHub workflow and drag and drop the new list as a comment on this issue. I'll then commit and push it.

PS: @timadriaens @adrienlatli @RomainWilleput do you expect the list to change frequently in the near future? If so, I would suggest converting the list to a .gsheet. This allows a more dynamic approach to a frequently changing list. However, it requires a small rewrite of the script to work.

RomainWilleput commented 3 years ago

Hello @SanderDevisscher,

Adrien and I wonder whether it would be more useful to run all the crayfish species in GBIF through the CM analysis, so that we have a complete database for the Alert List and also for the rest of the project (White List, ...). I tried to download the "GBIF Backbone Taxonomy" .tsv, but the file is too big to open in Excel: https://www.gbif.org/fr/dataset/d7dddbf4-2cf0-4f39-9b2a-bb099caae36c. Is there another solution?

As a reminder, the 3 families of crayfish are Astacidae, Cambaridae and Parastacidae:
https://www.gbif.org/species/8022
https://www.gbif.org/species/4479
https://www.gbif.org/species/8670

Is it possible to extract all the species of these 3 families on GBIF and to pass the whole list to the CM analysis? To make the Alert List, I will select the accepted species in the CM analysis results for each species available in aquarium stores.

Thanks in advance
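One possible way around the Excel size limit is to read the backbone file row by row and keep only the crayfish families. The sketch below is in Python and is only an illustration. It assumes the file is tab-separated with a header row containing the columns `taxonID`, `acceptedNameUsageID`, `scientificName`, `taxonRank`, `taxonomicStatus` and `family`. These column names, the function name and the file name `Taxon.tsv` in the usage comment are assumptions to check against the downloaded file.

```python
import csv

CRAYFISH_FAMILIES = {"Astacidae", "Cambaridae", "Parastacidae"}


def iter_family_species(lines, families=CRAYFISH_FAMILIES):
    """Yield species-rank rows whose family is in `families`, one row at a time.

    `lines` is any iterable of text lines (an open file works), so the whole
    file never has to fit in memory. Column names are assumed, see above.
    """
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        rank = (row.get("taxonRank") or "").lower()
        if row.get("family") in families and rank == "species":
            yield {
                "taxonID": row.get("taxonID"),
                "scientificName": row.get("scientificName"),
                "taxonomicStatus": row.get("taxonomicStatus"),
                "acceptedNameUsageID": row.get("acceptedNameUsageID"),
            }


# Usage (assumed file name):
# with open("Taxon.tsv", encoding="utf-8", newline="") as handle:
#     crayfish = list(iter_family_species(handle))
```

Keeping `taxonomicStatus` and `acceptedNameUsageID` in the output makes it possible to tell synonyms from accepted species afterwards, which is the distinction this issue is about.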

SanderDevisscher commented 3 years ago

> Is it possible to extract all the species of these 3 families on GBIF and to pass the whole list to the CM analysis? To make the Alert List, I will select the accepted species in the CM analysis results for each species available in aquarium stores.

Yes, this is possible; see also issue #5.

RomainWilleput commented 3 years ago

> Yes, this is possible; see also issue #5.

Perfect, that would be really nice! As for synonyms, I think the accepted species includes all occurrences of its synonyms, so that should not be a problem.