Hey José @zedomel - Thanks for your detailed message.
By default, Nomer uses a versioned copy of taxonomic resources as captured by Preston in Nomer's Corpus of Taxonomic Resources. So, instead of using the (dynamic and often changing) internet, Nomer relies on a well-defined, versioned slice of it. And your newer copy of GBIF's backbone is not defined in that slice (see e.g. the list of tracked urls/aliases in https://zenodo.org/record/6473194 ).
To disable Nomer's reliance on its versioned corpus, you can blank out the preston properties, by changing -
$ nomer properties | grep preston
nomer.preston.dir=
nomer.preston.remotes=https://zenodo.org/record/6473194/files
nomer.preston.version=hash://sha256/d58ab1acf350f056a75bde7f4175d14c5e4dfaf0bf20e2eedbb2fb585bdf0822
to
$ nomer properties | grep preston
nomer.preston.dir=
nomer.preston.remotes=
nomer.preston.version=
After that reconfiguration, Nomer bypasses its versioned corpus and should download resources directly from their internet locations.
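A minimal sketch of that reconfiguration, assuming the blanked properties are kept in a separate override file and passed with the -p option (the same option used with `nomer append` elsewhere in this thread). The file path is a placeholder, and the nomer call is skipped when nomer is not installed:

```shell
#!/usr/bin/env sh
# Hypothetical location for the override file; any writable path works.
props="${TMPDIR:-/tmp}/nomer-no-preston.properties"

# Blank the three preston properties so Nomer no longer resolves
# resources through its versioned corpus.
cat > "$props" <<'EOF'
nomer.preston.dir=
nomer.preston.remotes=
nomer.preston.version=
EOF

# Only call nomer when it is installed; otherwise just report the file.
if command -v nomer >/dev/null 2>&1; then
  printf '\tDunderbergia granulosa\n' | nomer append gbif -p "$props"
else
  echo "nomer not found; overrides written to $props"
fi
```

Keeping the overrides in their own file leaves the default configuration untouched, so dropping the -p option returns to the versioned corpus.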
Apologies for the confusion.
@zedomel alternatively, we could build a new version of Nomer's Corpus of Taxonomic Resources that points to your more recent copy of the GBIF backbone in addition to updating the default Nomer property config.
Perhaps easier?
Let me know.
Thanks for the quick answer @jhpoelen .
Building a new version sounds interesting, but for now I will blank out the preston properties. I'm testing whether a new version of the GBIF taxonomy provides more matches than the one currently used in nomer.
thanks.
@zedomel sounds good! Curious to hear the outcome and eager to include your work in a future version of Nomer's Corpus of Taxonomic Resources.
@zedomel I just updated Nomer's defaults to point to your recently repackaged GBIF backbone taxonomy. Hoping to include it in the next release of Nomer's Taxonomic Corpus.
@zedomel
Salim, JA. (2022). A Repackaged Taxonomic Backbone of Global Biodiversity Information Facility (GBIF) - 2021-11-26 (0.1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.6707049
has been included in :
Poelen, Jorrit H. (2022). Nomer Corpus of Taxonomic Resources hash://sha256/f4e2b9806440d0605f60b81feb9782655291aac2d000c74e4e8fdeb937e29b1d (0.6) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7065661
@jhpoelen
Can you update nomer to use a new version of the GBIF Backbone: https://zenodo.org/doi/10.5281/zenodo.10810437 ?
I'm using 0.5.6, and after blanking out the preston options:
nomer.gbif.ids=gz:https://zenodo.org/record/10810438/files/gbif-backbone-by-id.tsv.gz!/gbif-backbone-by-id.tsv
nomer.gbif.names=gz:https://zenodo.org/record/10810438/files/gbif-backbone-by-name.tsv.gz!/gbif-backbone-by-name.tsv
nomer.preston.dir=
nomer.preston.remotes=
nomer.preston.version=
and executing:
echo -e "\tAchnanthes hauckiana" | nomer append gbif -p /tmp/append.properties
the following exception is returned:
[main] INFO org.globalbioticinteractions.nomer.match.GBIFTaxonService - [GBIF] indexing taxonomy...
[main] INFO org.globalbioticinteractions.nomer.match.GBIFTaxonService - [GBIF] indexing ids...
[main] INFO org.globalbioticinteractions.nomer.match.ResourceServiceReadOnly - using cached [gz:https://zenodo.org/record/10810438/files/gbif-backbone-by-id.tsv.gz!/gbif-backbone-by-id.tsv] at [/home/jose/.cache/nomer/f12008cf88c998ee4b765068a8a4c98b1601a0276f3d91f39087f75dbcd7f54b.gz]
java.lang.IllegalArgumentException: Name already used: nodes
at org.mapdb.DB.checkNameNotExists(DB.java:1592)
at org.mapdb.DB.createTreeMap(DB.java:834)
at org.mapdb.DB$BTreeMapMaker.make(DB.java:661)
at org.globalbioticinteractions.nomer.match.GBIFTaxonService.buildTaxonIndex(GBIFTaxonService.java:221)
at org.globalbioticinteractions.nomer.match.GBIFTaxonService.lazyInit(GBIFTaxonService.java:82)
at org.globalbioticinteractions.nomer.match.CommonTaxonService.checkInit(CommonTaxonService.java:369)
at org.globalbioticinteractions.nomer.match.CommonTaxonService.enrichNameMatches(CommonTaxonService.java:307)
at org.globalbioticinteractions.nomer.match.CommonTaxonService.match(CommonTaxonService.java:100)
at org.eol.globi.service.TermMatcherHierarchical.match(TermMatcherHierarchical.java:57)
at org.globalbioticinteractions.nomer.util.AppendingRowHandler.onRow(AppendingRowHandler.java:42)
at org.globalbioticinteractions.nomer.match.MatchUtil.apply(MatchUtil.java:85)
at org.globalbioticinteractions.nomer.match.MatchUtil.match(MatchUtil.java:37)
at org.globalbioticinteractions.nomer.cmd.CmdAppend.run(CmdAppend.java:20)
at picocli.CommandLine.executeUserObject(CommandLine.java:1939)
at picocli.CommandLine.access$1300(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2352)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2314)
at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
at picocli.CommandLine$RunLast.execute(CommandLine.java:2316)
at picocli.CommandLine.execute(CommandLine.java:2078)
at org.globalbioticinteractions.nomer.Nomer.run(Nomer.java:57)
at org.globalbioticinteractions.nomer.Nomer.main(Nomer.java:46)
Am I doing something wrong?
thanks.
You did nothing wrong . . . however, the zenodo folks have changed their url syntax, so you'd have to update the endpoints accordingly.
actually . . . did you run a nomer clean first? Also, what version are you using?
Yes I did nomer clean
nomer version
0.5.6
what version of nomer are you using?
0.5.6 right?
Ok, I'll try and reproduce. Just a minute.
I'm going home right now. When I reach there, I will try again...
thanks
Ok, am working on it.
I've created a new Nomer v0.5.7 with your updated GBIF backbone taxonomy.
Please confirm that you can now use the packaged GBIF version ok by closing the issue.
Would you be interested to learn how to do your own Nomer corpus and Nomer releases? I think it may be wise to spread the work a little.
Thank you @jhpoelen . I will test it and let you know. The gbif catalogue was built using your code at: https://github.com/jhpoelen/repackage-gbif-backbone
thanks
@jhpoelen
There is something wrong when installing the new version of nomer:
sudo sh -c '(curl -L https://github.com/globalbioticinteractions/nomer/releases/download/0.5.7/nomer.jar) > /usr/local/bin/nomer && chmod +x /usr/local/bin/nomer && nomer install-manpage' && nomer clean && nomer version
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 88.1M 100 88.1M 0 0 61.6M 0 0:00:01 0:00:01 --:--:-- 74.6M
sh: 1: nomer: Exec format error
The executable /usr/local/bin/nomer is missing the header:
#!/usr/bin/env sh
#
@ 2>/dev/null # 2>nul & echo off & goto BOF
:
exec java -Xmx4G -XX:+UseG1GC $JAVA_OPTS -cp "$0" org.globalbioticinteractions.nomer.Nomer "$@"
exit
:BOF
@echo off
java -Xmx4G -XX:+UseG1GC %JAVA_OPTS% -cp "%~dpnx0" org.globalbioticinteractions.nomer.Nomer %*
exit /B %errorlevel%
Why?
@zedomel thanks for your message. Apologies for the nomer.jar . . . I omitted to prepend the .travis.jar.magic file to the nomer.jar using
cat .travis.jar.magic nomer/target/nomer-0.5.7-jar-with-dependencies.jar > nomer.jar
I've updated the artifact - please try again.
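A quick check along these lines could catch a missing launcher header before an artifact is published. This is only an illustrative sketch: it uses throwaway stand-in files rather than the real .travis.jar.magic and jar, and the helper name has_shebang is made up for this example.

```shell
#!/usr/bin/env sh
# Work in a scratch directory with stand-in files (assumptions, not the
# real build artifacts).
work=$(mktemp -d)
printf '#!/usr/bin/env sh\nexec java -cp "$0" Example "$@"\n' > "$work/launcher.magic"
printf 'PK-stand-in-for-jar-bytes' > "$work/app.jar"

# Same kind of step as the cat command above: launcher header first, jar second.
cat "$work/launcher.magic" "$work/app.jar" > "$work/nomer.jar"

# Hypothetical helper: succeed only if the file starts with "#!".
has_shebang() {
  [ "$(head -c 2 "$1")" = '#!' ]
}

if has_shebang "$work/nomer.jar"; then
  echo "ok: launcher header present"
else
  echo "error: launcher header missing" >&2
fi
```

A bare jar fails the same check, which matches the "Exec format error" symptom: without the leading "#!" line the shell cannot run the file directly.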
Hmm. Perhaps a good reason to automate this distribution process. . . . what do you think?
@zedomel also, please holler if you were able to use the updated nomer.jar v0.5.7 with the upgraded GBIF backbone you published.
Meanwhile, I'm figuring out how to allow for overrides - like the one you tried by blanking out the preston configuration.
Thanks for being patient.
@jhpoelen it worked. Thank you.
Hi @jhpoelen
I'm trying to create a new version of the GBIF backbone to use in nomer. I downloaded the script in the repository https://doi.org/10.15468/39omei and fixed some minor errors (a "file not found" error for gbif-backbone-simple.txt.gz). I ran the script and it produced the expected files, which I put in a new repository https://zenodo.org/record/6707049.
Then I updated the nomer.properties file.
But when I run:
echo -e "\tDunderbergia granulosa" | nomer append gbif -p nomer.properties
The following error is produced:
It looks like the file
gz:https://zenodo.org/record/6707049/files/gbif-backbone-by-id.tsv.gz!/gbif-backbone-by-id.tsv
was not found, but it exists and I can download it using wget.
How can I change the nomer configuration to use this new version of the GBIF backbone?
Thanks, José.