Open jhpoelen opened 4 years ago
Needless to say, I hope someone comes up with a better, reusable, simple tool that provides scalable, performant, offline-enabled taxonomic name matching, so that we can avoid re-inventing the wheel.
Rather than refuting the interaction would it be better if I update my list with ITIS name IDs? I suspect you're going to want both.
:+1: I very much like your idea of adding ITIS ids (or ids from other taxonomic schemes that you prefer) to set a taxonomic context. This would help you explicitly point GloBI (or other users) to the taxa you'd like to include. For instance, using Nomer, the name match against ITIS:19081\tFicus is:
$ echo -e "ITIS:19081\tFicus" | nomer append
ITIS:19081 Ficus SAME_AS ITIS:19081 Ficus genus Plantae | Viridiplantae | Streptophyta | Embryophyta | Tracheophyta | Spermatophytina | Magnoliopsida | Rosanae | Rosales | Moraceae | Ficus ITIS:202422 | ITIS:954898 | ITIS:846494 | ITIS:954900 | ITIS:846496 | ITIS:846504 | ITIS:18063 | ITIS:846548 | ITIS:24057 | ITIS:19063 | ITIS:19081 kingdom | subkingdom | infrakingdom | superphylum | phylum | subphylum | class | superorder | order | family | genus http://eol.org/pages/60627
$
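The three pipe-delimited columns in the output above (path names, path ids, path ranks) line up position by position. A minimal Python sketch of how one might align them is below; the helper names `split_path` and `align_lineage` are made up for illustration and are not part of Nomer, and the three strings are copied by hand from the output above rather than read from Nomer itself.

```python
# Lineage columns copied from the `nomer append` output for ITIS:19081 above.
path_names = (
    "Plantae | Viridiplantae | Streptophyta | Embryophyta | Tracheophyta | "
    "Spermatophytina | Magnoliopsida | Rosanae | Rosales | Moraceae | Ficus"
)
path_ids = (
    "ITIS:202422 | ITIS:954898 | ITIS:846494 | ITIS:954900 | ITIS:846496 | "
    "ITIS:846504 | ITIS:18063 | ITIS:846548 | ITIS:24057 | ITIS:19063 | ITIS:19081"
)
path_ranks = (
    "kingdom | subkingdom | infrakingdom | superphylum | phylum | subphylum | "
    "class | superorder | order | family | genus"
)


def split_path(column):
    """Split a pipe-delimited column into a list of trimmed values."""
    return [part.strip() for part in column.split("|")]


def align_lineage(names, ids, ranks):
    """Zip the three columns into (rank, name, id) tuples (hypothetical helper)."""
    names, ids, ranks = split_path(names), split_path(ids), split_path(ranks)
    if not (len(names) == len(ids) == len(ranks)):
        raise ValueError("lineage columns differ in length")
    return list(zip(ranks, names, ids))


lineage = align_lineage(path_names, path_ids, path_ranks)
for rank, name, taxon_id in lineage:
    print(f"{rank}\t{name}\t{taxon_id}")
```

Aligning the columns this way makes it easy to read off the kingdom (first entry) or the matched taxon (last entry) for a name, which is the taxonomic context discussed here.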
And, time permitting, I suspect that your expert contributions of refuted interaction claims now would be a useful way to spot suspicious interactions for years to come.
fyi @jhammock @KatjaSchulz
I started working on this. I primarily used ITIS, but where the name is not in ITIS I had to go elsewhere. What prefixes should I be using with other resources, such as the Catalogue of Life and Index Fungorum?
Also, if the paper referenced uses a synonym of what is now an accepted name, should I correct that name, give the ID of the accepted name, or do something else? For some reason many of these resources do not have identifiers for synonyms.
@qgroom you can find the prefixes that GloBI currently supports at https://api.globalbioticinteractions.org/prefixes (json) or https://api.globalbioticinteractions.org/prefixes?type=tsv (tsv) or https://api.globalbioticinteractions.org/prefixes?type=csv (csv) . Happy to add support for additional ones.
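As a rough illustration of how a list of ids could be checked against the supported prefixes, here is a small Python sketch. It assumes ids have the form `PREFIX:local-id`, as in ITIS:19081; the function names are made up for this example, the `supported` set is a placeholder rather than the real list served by the prefixes endpoint, and `XYZ` is an invented prefix.

```python
def split_prefixed_id(taxon_id):
    """Split an id such as 'ITIS:19081' into (prefix, local id).

    Assumes a 'PREFIX:local-id' layout; other id styles are not handled.
    """
    prefix, sep, local_id = taxon_id.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a prefixed id: {taxon_id!r}")
    return prefix, local_id


def unsupported_ids(taxon_ids, supported_prefixes):
    """Return the ids whose prefix is not in supported_prefixes."""
    return [
        taxon_id
        for taxon_id in taxon_ids
        if split_prefixed_id(taxon_id)[0] not in supported_prefixes
    ]


# Placeholder set: replace with the prefixes actually listed by the endpoint.
supported = {"ITIS"}
# 'XYZ:123' is an invented id used only to show an unsupported prefix.
print(unsupported_ids(["ITIS:19081", "XYZ:123"], supported))
```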
As far as the transcription goes - I'd personally keep the original name and leave the taxonomic interpretation to downstream systems: the original won't change, but the taxonomic interpretation might.
btw - @qgroom would you advise for GloBI to support Catalogue of Life ids?
Re: CoL. I'm not sure. They used to use LSIDs, but these don't seem to be displayed now. However, the GUID is still the same in the URL. I'm inclined to think that they are as stable as any other system. These identifiers are available in GBIF too e.g. https://www.gbif.org/species/153643127/verbatim
as @qgroom noted in a slack message (can't link b/c the slack isn't open ; ( ):
BTW: I spotted one little error. Ficus is both a plant and a gastropod. Is this a use case for your "refute" template?
my reply:
@Quentin Groom Thanks for pointing this out. Please record the issue at https://github.com/globalbioticinteractions/globalbioticinteractions/issues/new with an example. And yes, this is an excellent example in which you (as an expert) can refute claims like "flowers of gastropods are visited by bats" :slightly_smiling_face:. Homonyms are expected to cause a fuss (as usual), and detection methods exist beyond spot checking, e.g., https://doi.org/10.7717/peerj-cs.164 .
I realized that many homonyms might be hard to spot / detect for humans, even though the name linkages help to easily detect them computationally.
In the case of Ficus, Nomer (see https://github.com/globalbioticinteractions/nomer), a GloBI name matching tool, nicely reports the homonyms when using
$ echo -e "\tFicus" | nomer append
However, when providing a taxonomic context like Plantae, via
$ echo -e "\tPlantae | Ficus"
a conflict no longer occurs.
To help more easily detect naming issues, I am thinking of labeling taxa that have inconsistent linkages (e.g., homonyms or other ambiguous links).
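To make the labeling idea concrete, here is a minimal Python sketch of one possible check: a name is flagged when its matches fall under more than one kingdom, and a taxonomic context filters the matches first. This is not how Nomer or GloBI implement it. The function names are made up, the `matches` list is hand-written illustrative data rather than Nomer output, the paths are abbreviated to kingdom and genus, and the gastropod id is a placeholder.

```python
from collections import defaultdict

# Illustrative matches, NOT real Nomer output: (name, id, abbreviated path).
matches = [
    ("Ficus", "ITIS:19081", ["Plantae", "Ficus"]),
    # Placeholder id: the actual id of the gastropod genus is not given here.
    ("Ficus", "EXAMPLE:gastropod-ficus", ["Animalia", "Ficus"]),
]


def find_homonyms(matches):
    """Map each name to its kingdoms, keeping names found in more than one."""
    kingdoms_by_name = defaultdict(set)
    for name, _taxon_id, path in matches:
        kingdoms_by_name[name].add(path[0])  # first path entry is the kingdom
    return {
        name: kingdoms
        for name, kingdoms in kingdoms_by_name.items()
        if len(kingdoms) > 1
    }


def with_context(matches, context):
    """Keep only matches whose path contains the given context taxon."""
    return [match for match in matches if context in match[2]]


# Without context, Ficus is flagged; with Plantae as context, it is not.
print(sorted(find_homonyms(matches)))
print(sorted(find_homonyms(with_context(matches, "Plantae"))))
```

Comparing only the kingdom is a deliberately coarse choice: it catches cross-kingdom homonyms like the plant and gastropod Ficus, but would miss homonyms within a single kingdom.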