njdowdy / tpt-taxonomy

Foundational taxonomic resources for the TPT project
GNU General Public License v3.0

expected acari synonym [Holothyrus expolitissimus] to have genus [Holothyrus], instead [Haplothyrus] was found #1

Closed jhpoelen closed 2 years ago

jhpoelen commented 2 years ago

expected acari synonym [Holothyrus expolitissimus] with GBIF:6892348 to have genus [Holothyrus], instead [Haplothyrus] was found in line 6 of:

https://github.com/njdowdy/tpt-taxonomy/blob/d917f519b3b31f5ea2a699f5834c320c5018cce4/acari-standardized.csv

@EMTuckerLabUMMZ please let me know if I am reading the schema properly.

jhpoelen commented 2 years ago

related to https://github.com/globalbioticinteractions/globalbioticinteractions/issues/694

EMTuckerLab commented 2 years ago

@jhpoelen you are reading it correctly - it appears to be an error that was missed in the original files. Line 6 should have Holothyrus in column AG. I will fix it and re-upload.

jhpoelen commented 2 years ago

@EMTuckerLabUMMZ thanks for your prompt reply. At a quick glance, the issue appears to concern other synonyms also. Can you confirm?

njdowdy commented 2 years ago

Most likely, this is an underlying issue with the data as provided to us. This is a case where I would suggest contacting the relevant source taxonomy contact and taxonomy resource contact listed in the readme. Unfortunate, but they will be best suited to provide more insight.

jhpoelen commented 2 years ago

@njdowdy @EMTuckerLabUMMZ Thanks for chiming in and responding to my comments.

In #6 I describe how GBIF seems to be the source of the synonyms. I am wondering what method was used to extract these synonyms. Also, it is not clear which version of the GBIF backbone taxonomy was used. If needed, we can use a versioned copy of the GBIF backbone taxonomy and reproduce the synonym lists.

njdowdy commented 2 years ago

It may have been done long before TPT existed. So, the folks who provided the source taxonomy (and possibly the person who generated the resource) would know best. I do not think the TPT team pulled anything from GBIF to add to these resources (at least I did not for the resources I worked on). But it is possible there was some miscommunication there. I'll ping Teresa and Vijay for more info.

Jegelewicz commented 2 years ago

@jhpoelen GBIF provides the classification for the accepted name with a synonym. I find it frustrating and confusing - see also my tweets with David Shorthouse and Rod Page - https://twitter.com/TJegelewicz/status/1460622162843238400

Jegelewicz commented 2 years ago

I definitely added GBIF stuff to TPT files! That was part of the "harmonization". For Siphonaptera, I had an expert review who helped me weed out stuff, but the Acari are still out for review.

Jegelewicz commented 2 years ago

> versioned copy of GBIF backbone taxonomy

@njdowdy should have the info as he downloaded the entire backbone and extracted sections of it for me.

njdowdy commented 2 years ago

Ah, ok! I kept my harmonization separated from the base taxonomies obtained from the experts. With that info, I can say that I have a copy of the GBIF backbone that was used in this process. @Jegelewicz did you ultimately use the more recent GBIF download we took (IIRC the one in March 2021 changed drastically from the one in April 2021 due to GBIF dropping a dataset containing holotype names)? I hope I have those details correct.

njdowdy commented 2 years ago

@Jegelewicz in places where GBIF had names that were not represented in the TPT resource, did you keep those separated as "requiring taxonomy expert review"? That was my approach. Thanks for your input!

Jegelewicz commented 2 years ago

@njdowdy yes to the expert review - especially for Siphonaptera. Anything that was added from GBIF was also added to the source list from BYU. As for the Acari, in the file sent out for review the GBIF "additions" are marked as such. See https://github.com/Jegelewicz/tpt-acari/blob/main/output/acari_merged.csv

Jegelewicz commented 2 years ago

> did you ultimately use the more recent GBIF download we took (IIRC the one in March 2021 changed drastically from the one in April 2021

Yes. In my Siphonaptera repo input files you can see the two versions. For Acari, I didn't start on that until after you grabbed the second set of GBIF data.

njdowdy commented 2 years ago

So @jhpoelen I have an "accessed date" for the GBIF backbone used in that process and a local copy of it such that the query would be reproducible. That date was 2021-04-13.

jhpoelen commented 2 years ago

Using recently introduced support for TPT taxonomy in Nomer, I found:

$ echo -e "\tHolothyrus expolitissimus" | nomer append tpt
[main] INFO org.globalbioticinteractions.nomer.match.TermMatcherRegistry - using matcher [tpt]
[main] INFO org.globalbioticinteractions.nomer.match.TPTTaxonService - DwC taxonomy already indexed at [/home/jorrit/.cache/nomer/gbif/gbif], no need to import.
    Holothyrus expolitissimus   SYNONYM_OF  GBIF:acari_6892347  Haplothyrus expolitissimus  species     Animalia | Arthropoda | Arachnida | Holothyrida | Holothyridae | Haplothyrus | expolitissimus       kingdom | phylum | class | order | family | genus | specificEpithet http://www.gbif.org/species/acari_6892347   

Also, I found that

$ curl -L https://raw.githubusercontent.com/njdowdy/tpt-taxonomy/main/Acari/Acari-standardized-v2.csv | grep "Holothyrus expolitissimus"
GBIF,acari_6892348,,6892347,4663001,,,,,"Holothyrus expolitissimus Berlese, 1923",Haplothyrus expolitissimus,,,,,,,Animalia,Arthropoda,Arachnida,,,Holothyrida,,,,,,Holothyridae,,,,Holothyrus,,expolitissimus,,species,,"Berlese, 1923",,,synonym,,,Holothyrus expolitissimus

meaning that the genus of Holothyrus expolitissimus is now listed as Holothyrus (column AG), within family Holothyridae.
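For anyone who wants to script this check, here is a minimal sketch that parses the row from the grep output above and compares the genus cell with the first word of the name. The header row is not shown in this thread, so the column positions are assumptions: the genus is taken from spreadsheet column AG (as mentioned earlier in this issue), and the last column is assumed to hold the canonical name. The helper names `column_index` and `genus_mismatch` are made up for illustration.

```python
import csv
import io

# Row copied verbatim from the grep output above (Acari-standardized-v2.csv).
ROW = (
    'GBIF,acari_6892348,,6892347,4663001,,,,,'
    '"Holothyrus expolitissimus Berlese, 1923",Haplothyrus expolitissimus,'
    ',,,,,,Animalia,Arthropoda,Arachnida,,,Holothyrida,,,,,,Holothyridae,'
    ',,,Holothyrus,,expolitissimus,,species,,"Berlese, 1923",,,synonym,'
    ',,Holothyrus expolitissimus'
)


def column_index(letters: str) -> int:
    """Convert a spreadsheet column label such as 'AG' to a 0-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def genus_mismatch(fields, name_col, genus_col):
    """Return (expected, found) when the genus cell differs from the
    first word of the name cell; return None when they agree."""
    expected = fields[name_col].split()[0]
    found = fields[genus_col]
    return None if expected == found else (expected, found)


fields = next(csv.reader(io.StringIO(ROW)))
GENUS_COL = column_index("AG")  # column AG, per the earlier comment
NAME_COL = len(fields) - 1      # assumption: last column is the canonical name

# Row as currently published.
print(genus_mismatch(fields, NAME_COL, GENUS_COL))

# Reconstruction of the originally reported state (genus cell = Haplothyrus);
# this is rebuilt from the issue title, not read from the old file.
reported = list(fields)
reported[GENUS_COL] = "Haplothyrus"
print(genus_mismatch(reported, NAME_COL, GENUS_COL))
```

Running the same function over every row with taxonomic status "synonym" would be one way to confirm whether other synonyms are affected, once the real header names are known.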

@Jegelewicz @njdowdy @EMTuckerLab thanks for addressing this issue!