gnames / gnverifier

GNverifier verifies scientific names against more than 100 biodiversity databases
https://verifier.globalnames.org
MIT License

Wrong OTL ids returned #24

Open Adafede opened 3 years ago

Adafede commented 3 years ago

Hi,

I am comparing the Open Tree of Life IDs retrieved via gnverify with those retrieved via rotl (official Open Tree of Life API). The results are almost identical (which is good!), but in some cases they differ.

Here is an example:

echo "Petroselinum crispum" | gnverify -s 179 -f pretty

giving

"recordId": "959097"

whereas doing it via rotl (in R):

library(rotl)

name <- "Petroselinum crispum"

tnrs_match_names(
  names = name,
  do_approximate_matching = FALSE,
  include_suppressed = FALSE
)

  search_string        unique_name          approximate_match ott_id is_synonym flags number_matches
1 petroselinum crispum Petroselinum crispum FALSE             2485   FALSE            1

The taxonomy browser confirms this:

https://tree.opentreeoflife.org/taxonomy/browse?name=2485

vs

https://tree.opentreeoflife.org/taxonomy/browse?name=959097
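
For anyone who wants to repeat this comparison over many names, here is a minimal Python sketch. It assumes the IDs from both tools have already been collected into two name-to-ID mappings. The function name `find_id_mismatches` is hypothetical, and the only data used is the pair of IDs reported in this issue.

```python
def find_id_mismatches(gnverify_ids, rotl_ids):
    """Return {name: (gnverify_id, rotl_id)} for names whose IDs differ.

    Only names present in both mappings are compared. IDs are compared
    as strings, because gnverify reports "recordId" as a string while
    rotl reports ott_id as a number.
    """
    mismatches = {}
    for name in sorted(gnverify_ids.keys() & rotl_ids.keys()):
        gn_id = str(gnverify_ids[name])
        ott_id = str(rotl_ids[name])
        if gn_id != ott_id:
            mismatches[name] = (gn_id, ott_id)
    return mismatches


# Values taken from this issue; real input would hold many names.
gnverify_ids = {"Petroselinum crispum": "959097"}
rotl_ids = {"Petroselinum crispum": 2485}

mismatches = find_id_mismatches(gnverify_ids, rotl_ids)
for name, (gn_id, ott_id) in mismatches.items():
    print(f"{name}: gnverify={gn_id} rotl={ott_id}")
```

Normalizing both sides to strings avoids false mismatches caused only by the differing ID types.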

Thank you again for your wonderful work, hope those issues help!

dimus commented 3 years ago

I wonder if they made updates to OTT but have not published them yet. It seems they still offer OTT v3.2 from 2019 for download:

https://tree.opentreeoflife.org/about/taxonomy-version/ott3.2

Adafede commented 3 years ago

When looking at both

https://tree.opentreeoflife.org/taxonomy/browse?name=2485

vs

https://tree.opentreeoflife.org/taxonomy/browse?name=959097

Could it be that your data holds only the first part of the line, "Petroselinum crispum", and that "Neapolitanum Group" was cropped? That might explain it...

I have a local copy of ott3.2, and both records are present there in exactly this way.

dimus commented 3 years ago

I see. I think it is a bug: gnverify should return https://tree.opentreeoflife.org/taxonomy/browse?name=2485 as the best result, because it was parsed clearly.
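
To illustrate the cropping hypothesis and the expected behavior, here is a minimal Python sketch. It is not gnverify's actual matching code. The record list, the full name given for 959097 (inferred from the "Neapolitanum Group" remark above), and the helpers `cropped` and `best_match` are all assumptions made for illustration.

```python
# Hypothetical records. The full name for 959097 is an assumption based on
# the "Neapolitanum Group" remark in this thread.
RECORDS = [
    {"record_id": "959097", "name": "Petroselinum crispum Neapolitanum Group"},
    {"record_id": "2485", "name": "Petroselinum crispum"},
]


def cropped(name):
    """Keep only the first two words (genus and species epithet)."""
    return " ".join(name.split()[:2])


def best_match(query, records):
    """Prefer a record whose full name equals the query.

    Records that match only after cropping are used as a fallback.
    """
    candidates = [r for r in records if cropped(r["name"]) == query]
    exact = [r for r in candidates if r["name"] == query]
    ranked = exact or candidates
    return ranked[0]["record_id"] if ranked else None


query = "Petroselinum crispum"

# Naive lookup on cropped names: the first colliding record wins.
naive = next(r["record_id"] for r in RECORDS if cropped(r["name"]) == query)

# Lookup that ranks the exact full-name match first.
best = best_match(query, RECORDS)

print("naive:", naive)
print("best:", best)
```

With these assumed records, the cropped lookup picks 959097 only because it comes first, while ranking the exact full-name match first picks 2485.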