hey @zedomel - apologies for the bug. It appears that the Wikidata resources have not yet been added to (or are incorrectly retrieved from) Nomer's Corpus of Taxonomic Resources.
A workaround may be to disable the Preston integration by emptying its properties.
For example, by default you have:
nomer properties | grep preston
nomer.preston.dir=
nomer.preston.remotes=https://zenodo.org/record/7196029/files
nomer.preston.version=hash://sha256/b3742bf43d9da0a8ed5522659199f47d68d31aaf46c90381190f324c1ac143f2
and if you use your own properties file with:
nomer.preston.dir=
nomer.preston.remotes=
nomer.preston.version=
the resources are retrieved from an unversioned copy, just as nomer did prior to the Preston integration.
Please let me know if this workaround helps you move forward.
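As a minimal sketch of the workaround above, the three Preston properties can be written to a local file and passed to nomer with --properties. The file name my.properties and the lookup term are only examples, and the guard around the nomer call is an addition so that the snippet does nothing harmful where nomer is not installed.

```shell
# Write a properties file that blanks the three Preston settings,
# so that resources are fetched from the unversioned copy.
cat > my.properties <<'EOF'
nomer.preston.dir=
nomer.preston.remotes=
nomer.preston.version=
EOF

# Only attempt the lookup when nomer is on the PATH (assumption: nomer is installed separately).
if command -v nomer >/dev/null 2>&1; then
  printf '\tsoort\n' | nomer append --properties my.properties globi-taxon-rank
fi
```

If the workaround applies, the log should mention ResourceServiceReadOnly rather than a Preston data dir.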
Meanwhile, I'll try and find the root of the issue and make globi-taxon-rank work in the Prestonocene.
In preparing a new release of Nomer to address this globi-taxon-rank issue, I was able to generate:
$ echo -e "\tsoort" | nomer append --properties my.properties globi-taxon-rank
[main] INFO org.globalbioticinteractions.nomer.match.ResourceServiceReadOnly - using cached [https://query.wikidata.org/sparql?format=json&query=PREFIX%20rdfs:%20%3Chttp://www.w3.org/2000/01/rdf-schema%23%3E%0APREFIX%20bd:%20%3Chttp://www.bigdata.com/rdf%23%3E%0APREFIX%20wd:%20%3Chttp://www.wikidata.org/entity/%3E%0APREFIX%20wikibase:%20%3Chttp://wikiba.se/ontology%23%3E%0APREFIX%20wdt:%20%3Chttp://www.wikidata.org/prop/direct/%3E%0ASELECT%20?i%20?l%20WHERE%20%7B%0A%20%20?i%20wdt:P31%20wd:Q427626.%0A%20%20?i%20rdfs:label%20?l%0A%7D] at [/home/jorrit/.cache/nomer/ace0cedb0aa2a691e55c45bdc95dda068d4a8bb1b4086decc3f2803987984fd3.gz]
soort SAME_AS WD:Q7432 species specie @it | loài @vi | espècia @oc | druh @cs | art @sv | art @nn | šlaajâ @smn | šlaajj @sms | tür @tr | Espesye @war | இனம் @ta | spesies @id | Aart @lb | 種 @ja | 种 @zh | 种 @zh-hans | 種 @zh-hant | espècie @ca | art @nb | Зүйл @mn | jinsi @ha | laji @fi | especie @gl | Тĕс @cv | вид @bg | specie @ro | ପ୍ରଜାତି @or | намуд @tg | Specie @rup | სახეობა @ka | Druh @sk | Spesie @af | Especie @an | نوع (بيولوجيا) @ary | প্ৰজাতি @as | especie @ast | Bioloji növ @az | Төр @ba | spésiés @ban | Oart @bar | Species @bcl | біялагічны від @be | प्रजाति @bho | প্রজাতি @bn | Spesad @br | Биологин тайпа @ce | ᏧᎾᏓᎴᎿ ᎠᏁᎯ @chr | جۆرە @ckb | spezia @co | tür @crh | rhywogaeth @cy | tewro biyolocik @diq | Družyna (biologija) @dsb | प्रजाति @dty | είδος @el | species @en-ca | liik @et | espezie @eu | Especii @ext | Slach @frr | soarte @fy | Lèspès @gcr | Juehegua @gn | જાતિ @gu | Species @hif | vrsta @hr | družina @hsb | Espès @ht | faj @hu | specie @ia | sebbangan @ilo | Биологен кеп @inh | Spiishi @jam | jutsi @jbo | spesies @jv | talmest @kab | Түр @kk | Биология тюрлю @krc | زٲژ @ks | Cure @ku | Түр @ky | Spéce @lmo | ແອສະແປດ @lo | Rūšis @lt | suga @lv | momo @mi | вид @mk | ജീവജാതി @ml | spesies @ms | سڤيسيس @ms-arab | speċi @mt | မျိုးစိတ် @my | تی @mzn | Chéng (hun-lūi-ha̍k) @nan | Specia @nap | Oort (Biologie) @nds | Soort @nds-nl | प्रजाति @ne | प्रजाति @new | ਪ੍ਰਜਾਤੀ @pa | soort @nl | specio @eo | species @la | گونه @fa | art @da | 종 @ko | gatunek @pl | від @be-tarask | вид @uk | Tur @uz | जाति @hi | ಜಾತಿ @kn | espèce @fr | Vrsta @sh | species @en | Art @de | نوع @ar | Art @de-ch | Art @gsw | speiceas @ga | Tegund @is | вид @ru | ߛߌߦߊ @nqo | species @en-gb | տեսակ @hy | espécie @pt | מין @he | especie @es | Espesye @pam | Spece @pms | توکمونه @ps | espécie @pt-br | Rikch'aq @qu | Вид @rue | Көрүҥ @sah | specia @scn | species @sco | šládja @se | විශේෂය @si | Nuucyada dhirta @so | Lloji @sq | врста @sr | Spésiés @su | Spishi @sw | జాతి @te | намуд 
@tg-cyrl | namud @tg-latn | สปีชีส์ @th | Biologik görnüş @tk | Espesye @tl | төр @tt | төр @tt-cyrl | tör @tt-latn | نوع @ur | Spece @vec | erik @vep | Sôorte @vls | Indje @wa | 物种 @wuu | გვარობა @xmf | זגאל @yi | 物種 @yue | 物种 @zh-cn | 物種 @zh-tw | vrsta @sl | species @dag | Tawsit @shi | نوع @arz | Eghen @kw | 物種 @zh-hk | vrsta @bs | espesie @pap | especie @pap-aw | spèiseas @gd | Rūšės @sgs | cor @ku-latn | پرجاتی @pnb species WD:Q7432 https://www.wikidata.org/wiki/Q7432
with my.properties
nomer.preston.dir=
nomer.preston.remotes=
nomer.preston.version=
On upgrading nomer to use Nomer Corpus of Taxonomic Resources v0.8,
Poelen, Jorrit H. (2022). Nomer Corpus of Taxonomic Resources hash://sha256/a0b5570204881a594cf0cca0d4b50c0ddca6f91c2086541516138a400620fb5b hash://md5/d05224a5c2933cd16f092ad28549f3e2 (0.8) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7343032
the following result was obtained:
$ echo -e "\tsoort" | nomer append globi-taxon-rank
[main] INFO org.globalbioticinteractions.nomer.match.ResourceServiceContentBased - using local Preston data dir: [/home/jorrit/.cache/nomer/data]
[main] INFO org.globalbioticinteractions.nomer.match.ResourceServiceContentBased - caching [https://query.wikidata.org/sparql?format=json&query=PREFIX%20rdfs:%20%3Chttp://www.w3.org/2000/01/rdf-schema%23%3E%0APREFIX%20bd:%20%3Chttp://www.bigdata.com/rdf%23%3E%0APREFIX%20wd:%20%3Chttp://www.wikidata.org/entity/%3E%0APREFIX%20wikibase:%20%3Chttp://wikiba.se/ontology%23%3E%0APREFIX%20wdt:%20%3Chttp://www.wikidata.org/prop/direct/%3E%0ASELECT%20?i%20?l%20WHERE%20%7B%0A%20%20?i%20wdt:P31%20wd:Q427626.%0A%20%20?i%20rdfs:label%20?l%0A%7D] at [/home/jorrit/.cache/nomer/tmp/nomer9488616572787809862.gz]...
[https://zenodo.org/recor...086541516138a400620fb5b] 100.0% of 6 kB at 1.19 MB/s completed in < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 100.0% of 840 kB at 0.32 MB/s completed in < 1 minute
[main] INFO org.globalbioticinteractions.nomer.match.ResourceServiceContentBased - caching [https://query.wikidata.org/sparql?format=json&query=PREFIX%20rdfs:%20%3Chttp://www.w3.org/2000/01/rdf-schema%23%3E%0APREFIX%20bd:%20%3Chttp://www.bigdata.com/rdf%23%3E%0APREFIX%20wd:%20%3Chttp://www.wikidata.org/entity/%3E%0APREFIX%20wikibase:%20%3Chttp://wikiba.se/ontology%23%3E%0APREFIX%20wdt:%20%3Chttp://www.wikidata.org/prop/direct/%3E%0ASELECT%20?i%20?l%20WHERE%20%7B%0A%20%20?i%20wdt:P31%20wd:Q427626.%0A%20%20?i%20rdfs:label%20?l%0A%7D] at [/home/jorrit/.cache/nomer/tmp/nomer9488616572787809862.gz] done.
[main] INFO org.globalbioticinteractions.nomer.match.ResourceServiceReadOnly - using cached [https://query.wikidata.org/sparql?format=json&query=PREFIX%20rdfs:%20%3Chttp://www.w3.org/2000/01/rdf-schema%23%3E%0APREFIX%20bd:%20%3Chttp://www.bigdata.com/rdf%23%3E%0APREFIX%20wd:%20%3Chttp://www.wikidata.org/entity/%3E%0APREFIX%20wikibase:%20%3Chttp://wikiba.se/ontology%23%3E%0APREFIX%20wdt:%20%3Chttp://www.wikidata.org/prop/direct/%3E%0ASELECT%20?i%20?l%20WHERE%20%7B%0A%20%20?i%20wdt:P31%20wd:Q427626.%0A%20%20?i%20rdfs:label%20?l%0A%7D] at [/home/jorrit/.cache/nomer/hash/sha256/a0b5570204881a594cf0cca0d4b50c0ddca6f91c2086541516138a400620fb5b/ace0cedb0aa2a691e55c45bdc95dda068d4a8bb1b4086decc3f2803987984fd3.gz]
[main] INFO org.eol.globi.taxon.TaxonCacheService - local taxon cache of [file:/home/jorrit/.cache/nomer/wikidata_appended_taxon_ranks.tsv] building...
[main] INFO org.eol.globi.taxon.TaxonCacheService - cache with [107] items built in [0.0] s or [4863.6] items/s.
[main] INFO org.eol.globi.taxon.TaxonCacheService - local taxon cache of [file:/home/jorrit/.cache/nomer/wikidata_appended_taxon_ranks.tsv] built.
[main] INFO org.eol.globi.taxon.TaxonCacheService - local taxon map of [file:/home/jorrit/.cache/nomer/wikidata_appended_taxon_rank_links.tsv] building...
[main] INFO org.eol.globi.taxon.TaxonCacheService - cache with [4019] items built in [0.2] s or [26440.8] items/s.
[main] INFO org.eol.globi.taxon.TaxonCacheService - local taxon map of [file:/home/jorrit/.cache/nomer/wikidata_appended_taxon_rank_links.tsv] built.
soort SAME_AS WD:Q7432 species specie @it | loài @vi | espècia @oc | druh @cs | art @sv | art @nn | šlaajâ @smn | šlaajj @sms | tür @tr | Espesye @war | இனம் @ta | spesies @id | Aart @lb | 種 @ja | 种 @zh | 种 @zh-hans | 種 @zh-hant | espècie @ca | art @nb | Зүйл @mn | jinsi @ha | laji @fi | especie @gl | Тĕс @cv | вид @bg | specie @ro | ପ୍ରଜାତି @or | намуд @tg | Specie @rup | სახეობა @ka | Druh @sk | Spesie @af | Especie @an | نوع (بيولوجيا) @ary | প্ৰজাতি @as | especie @ast | Bioloji növ @az | Төр @ba | spésiés @ban | Oart @bar | Species @bcl | біялагічны від @be | प्रजाति @bho | প্রজাতি @bn | Spesad @br | Биологин тайпа @ce | ᏧᎾᏓᎴᎿ ᎠᏁᎯ @chr | جۆرە @ckb | spezia @co | tür @crh | rhywogaeth @cy | tewro biyolocik @diq | Družyna (biologija) @dsb | प्रजाति @dty | είδος @el | species @en-ca | liik @et | espezie @eu | Especii @ext | Slach @frr | soarte @fy | Lèspès @gcr | Juehegua @gn | જાતિ @gu | Species @hif | vrsta @hr | družina @hsb | Espès @ht | faj @hu | specie @ia | sebbangan @ilo | Биологен кеп @inh | Spiishi @jam | jutsi @jbo | spesies @jv | talmest @kab | Түр @kk | Биология тюрлю @krc | زٲژ @ks | Cure @ku | Түр @ky | Spéce @lmo | ແອສະແປດ @lo | Rūšis @lt | suga @lv | momo @mi | вид @mk | ജീവജാതി @ml | spesies @ms | سڤيسيس @ms-arab | speċi @mt | မျိုးစိတ် @my | تی @mzn | Chéng (hun-lūi-ha̍k) @nan | Specia @nap | Oort (Biologie) @nds | Soort @nds-nl | प्रजाति @ne | प्रजाति @new | ਪ੍ਰਜਾਤੀ @pa | soort @nl | specio @eo | species @la | گونه @fa | art @da | 종 @ko | gatunek @pl | від @be-tarask | вид @uk | Tur @uz | जाति @hi | ಜಾತಿ @kn | espèce @fr | Vrsta @sh | species @en | Art @de | نوع @ar | Art @de-ch | Art @gsw | speiceas @ga | Tegund @is | вид @ru | ߛߌߦߊ @nqo | species @en-gb | տեսակ @hy | espécie @pt | מין @he | especie @es | Espesye @pam | Spece @pms | توکمونه @ps | espécie @pt-br | Rikch'aq @qu | Вид @rue | Көрүҥ @sah | specia @scn | species @sco | šládja @se | විශේෂය @si | Nuucyada dhirta @so | Lloji @sq | врста @sr | Spésiés @su | Spishi @sw | జాతి @te | намуд 
@tg-cyrl | namud @tg-latn | สปีชีส์ @th | Biologik görnüş @tk | Espesye @tl | төр @tt | төр @tt-cyrl | tör @tt-latn | نوع @ur | Spece @vec | erik @vep | Sôorte @vls | Indje @wa | 物种 @wuu | გვარობა @xmf | זגאל @yi | 物種 @yue | 物种 @zh-cn | 物種 @zh-tw | vrsta @sl | species @dag | Tawsit @shi | نوع @arz | Eghen @kw | 物種 @zh-hk | vrsta @bs | espesie @pap | especie @pap-aw | spèiseas @gd | Rūšės @sgs | cor @ku-latn | پرجاتی @pnb species WD:Q7432 https://www.wikidata.org/wiki/Q7432
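Because the result row is very long, it can help to print only the first few columns when comparing runs. The sketch below assumes that nomer append emits tab-separated columns, with the provided ID and name first, then the relation and the resolved ID. The sample row is a shortened, hand-built stand-in for the real output, not actual nomer output.

```shell
# Shortened stand-in for a nomer result row (assumption: tab-separated columns:
# provided id, provided name, relation, resolved id, resolved name).
sample_row="$(printf '\tsoort\tSAME_AS\tWD:Q7432\tspecies')"

# Keep only the provided name, resolved id and relation.
summary="$(printf '%s\n' "$sample_row" | awk -F'\t' '{ print $2 " -> " $4 " (" $3 ")" }')"
echo "$summary"
```

With real output, the same awk filter could be appended to the nomer pipeline in place of the sample row.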
The issue appears to be resolved in Nomer v0.4.5.
@zedomel if issues remain, please do comment or open a new issue.
Hi @jhpoelen
I'm trying to use nomer's globi-taxon-rank matcher (version 0.4.0) following the Makefile of taxon-graph-builder, but it is producing the following exception:
The file /home/jose/.cache/nomer/wikidata_appended_taxon_ranks.tsv exists:
head -n5 /home/jose/.cache/nomer/wikidata_appended_taxon_ranks.tsv:
When I use nomer version 0.2.6 (prior to the preston integration, I presume) it works! It looks like the newer versions of nomer do not work with local files (or with files not in the remote preston graph?).
I also tried to use nomer's translate-names matcher with a local file, but it did not work (IOException):
Could you help me with this?
Thanks.