globalbioticinteractions / nomer

maps identifiers and names to other identifiers and names
GNU General Public License v3.0

Toxicodendron oligophyllum is included in COL, but does not match [nomer append col] #119

Closed jhpoelen closed 1 year ago

jhpoelen commented 1 year ago
echo -e "\tToxicodendron oligophyllum" | nomer append col
[main] INFO org.globalbioticinteractions.nomer.match.TermMatcherRegistry - using matcher [col]
[main] INFO org.globalbioticinteractions.nomer.match.CatalogueOfLifeTaxonService - [Catalogue of Life] taxonomy already indexed at [/home/jorrit/.cache/nomer/catalogue_of_life/catalogue_of_life], no need to import.
    Toxicodendron oligophyllum  NONE        Toxicodendron oligophyllum                          

The COL version in Nomer's Corpus of Taxonomic Resources likely needs an upgrade, but this still needs to be verified.
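A failed match shows up as NONE in the third tab-separated column of the output above. As a minimal sketch for spotting such rows in a larger run: the helper names `unmatched_names` and `sample_rows` are made up for illustration, and the column positions are inferred from the output in this thread rather than taken from Nomer's documentation.

```shell
#!/usr/bin/env bash
# Print the provided name (column 2) of every row whose match type
# (column 3) is NONE. Column positions are an assumption based on the
# output shown in this thread.
unmatched_names() {
  awk -F'\t' '$3 == "NONE" { print $2 }'
}

# Hypothetical sample rows modelled on this thread's output, truncated
# to the first five columns.
sample_rows() {
  printf '\tToxicodendron oligophyllum\tNONE\t\tToxicodendron oligophyllum\n'
  printf '\tToxicodendron oligophyllum\tHAS_ACCEPTED_NAME\tCOL:98Y55\tToxicodendron oligophyllum\n'
}

sample_rows | unmatched_names
```

With Nomer installed, the same filter could be appended to the command above: `echo -e "\tToxicodendron oligophyllum" | nomer append col | unmatched_names`.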

The name came up during the TDWG 2022 Sym01 talk and on Slack.

[Screenshots: 2022-10-20 06-56-29, 2022-10-20 06-56-13]

jhpoelen commented 1 year ago

As expected, the newer version of COL does match:

$ nomer clean
$ echo -e "\tToxicodendron oligophyllum" | nomer append --properties my.properties col
    Toxicodendron oligophyllum  HAS_ACCEPTED_NAME   COL:98Y55   Toxicodendron oligophyllum  species     Biota | Plantae | Tracheophyta | Magnoliopsida | Sapindales | Anacardiaceae | Toxicodendron | Toxicodendron oligophyllum    COL:5T6MX | COL:P | COL:TP | COL:MG | COL:3ZY | COL:6CK | COL:7Y43 | COL:98Y55  unranked | kingdom | phylum | class | order | family | genus | species  https://www.catalogueoflife.org/data/taxon/98Y55

after disabling version control for the taxonomies using:

$ cat my.properties 
nomer.preston.dir=
nomer.preston.remotes=
nomer.preston.version=

jhpoelen commented 1 year ago

Tracked versions of COL in Nomer's Corpus of Taxonomic Resources:

$ preston alias --remote https://zenodo.org/record/7196029/files "https://download.catalogueoflife.org/col/latest_coldp.zip"
<https://download.catalogueoflife.org/col/latest_coldp.zip> <http://purl.org/pav/hasVersion> <hash://sha256/5a7731841c26a76e8c5da2f9b413f413c8cdfcabe7a57d9ac636bd2136ed64d8> <urn:uuid:24882095-69b4-4a0a-b9aa-492db73a787d> .
<https://download.catalogueoflife.org/col/latest_coldp.zip> <http://purl.org/pav/hasVersion> <hash://sha256/428d1a32d0747ec2cc36cd276bcdda8e43a4cc452f6edc767eda2b0027d5f1e9> <urn:uuid:9ba86e80-c1de-480e-aeac-edac3d44c81b> .
<https://download.catalogueoflife.org/col/latest_coldp.zip> <http://purl.org/pav/hasVersion> <hash://sha256/9ac28297a996e02f6026c40d24e67f59f7f39d495bb45759ebc4adb475d51f63> <urn:uuid:f4c99a9e-401f-48cf-b742-23408d17a4f3> .
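Each statement above pairs the COL download URL with the content hash of one tracked version. A rough sketch for pulling out the hash of the last listed version follows. It assumes the statements are listed oldest first, so the last line is the most recent; `latest_hash` and `sample_alias_output` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Take the last statement and print its third field (the content hash)
# with the surrounding angle brackets removed.
latest_hash() {
  tail -n1 | awk '{ gsub(/[<>]/, "", $3); print $3 }'
}

# The three statements listed in this thread, used as sample input.
sample_alias_output() {
  cat <<'EOF'
<https://download.catalogueoflife.org/col/latest_coldp.zip> <http://purl.org/pav/hasVersion> <hash://sha256/5a7731841c26a76e8c5da2f9b413f413c8cdfcabe7a57d9ac636bd2136ed64d8> <urn:uuid:24882095-69b4-4a0a-b9aa-492db73a787d> .
<https://download.catalogueoflife.org/col/latest_coldp.zip> <http://purl.org/pav/hasVersion> <hash://sha256/428d1a32d0747ec2cc36cd276bcdda8e43a4cc452f6edc767eda2b0027d5f1e9> <urn:uuid:9ba86e80-c1de-480e-aeac-edac3d44c81b> .
<https://download.catalogueoflife.org/col/latest_coldp.zip> <http://purl.org/pav/hasVersion> <hash://sha256/9ac28297a996e02f6026c40d24e67f59f7f39d495bb45759ebc4adb475d51f63> <urn:uuid:f4c99a9e-401f-48cf-b742-23408d17a4f3> .
EOF
}

sample_alias_output | latest_hash
```

The printed hash can then be compared against whichever COL version Nomer actually indexed.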

jhpoelen commented 1 year ago

And the last version of COL included in Nomer's Corpus of Taxonomic Resources does include Toxicodendron oligophyllum:

preston alias --remote https://zenodo.org/record/7196029/files "https://download.catalogueoflife.org/col/latest_coldp.zip"\
 | tail -n1\
 | preston grep --remote https://zenodo.org/record/7196029/files "Toxicodendron oligophyllum"
<line:zip:hash://sha256/9ac28297a996e02f6026c40d24e67f59f7f39d495bb45759ebc4adb475d51f63!/NameUsage.tsv!/L3899965> <http://www.w3.org/ns/prov#value> "98Y55 1141    7Y43        accepted    Toxicodendron oligophyllum  S. L. Tang, Liang Ma & S. P. Chen   species         Toxicodendron       oligophyllum                1ff743de-b063-4788-9c77-fe56972c4359        58      botanical                   1ff743de-b063-4788-9c77-fe56972c4359                false                               http://www.worldplants.de/?deeplink=Toxicodendron-oligophyllum      " <urn:uuid:b93cc86e-1e07-47dd-bbfa-f8cf5e376965> .

This suggests that Nomer is picking an older version of COL from its Corpus of Taxonomic Resources instead of the most recent one.

jhpoelen commented 1 year ago

With the most recent Nomer version, the expected result is generated:

echo -e "\tToxicodendron oligophyllum" | nomer append col
    Toxicodendron oligophyllum  HAS_ACCEPTED_NAME   COL:98Y55   Toxicodendron oligophyllum  S. L. Tang, Liang Ma & S. P. Chen   species Biota | Plantae | Tracheophyta | Magnoliopsida | Sapindales | Anacardiaceae | Toxicodendron | Toxicodendron oligophyllum    COL:5T6MX | COL:P | COL:TP | COL:MG | COL:3ZY | COL:6CK | COL:7Y43 | COL:98Y55  unranked | kingdom | phylum | class | order | family | genus | species      https://www.catalogueoflife.org/data/taxon/98Y55
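
The matched row carries the lineage as three parallel " | "-separated lists: names, COL ids and ranks. To make that structure easier to read, here is a small sketch that lines the three lists up one rank per row. `pair_lineage` is a hypothetical helper, not part of Nomer, and the three strings are copied from the output above.

```shell
#!/usr/bin/env bash
# Print one "rank<TAB>name<TAB>id" line per lineage level, given three
# " | "-separated lists of equal length.
pair_lineage() {
  awk -v names="$1" -v ranks="$2" -v ids="$3" 'BEGIN {
    n = split(names, nm, / \| /)
    split(ranks, rk, / \| /)
    split(ids, id, / \| /)
    for (i = 1; i <= n; i++) printf "%s\t%s\t%s\n", rk[i], nm[i], id[i]
  }'
}

names='Biota | Plantae | Tracheophyta | Magnoliopsida | Sapindales | Anacardiaceae | Toxicodendron | Toxicodendron oligophyllum'
ranks='unranked | kingdom | phylum | class | order | family | genus | species'
ids='COL:5T6MX | COL:P | COL:TP | COL:MG | COL:3ZY | COL:6CK | COL:7Y43 | COL:98Y55'

pair_lineage "$names" "$ranks" "$ids"
```

The helper takes the three lists as arguments instead of reading column numbers, because the column positions differ between the two Nomer outputs in this thread: the newer one adds an authorship column.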