steps to reproduce:
nomer clean
git clone https://github.com/globalbioticinteractions/taxon-graph-builder
cd taxon-graph-builder
nohup make &
observed error:
[main] INFO org.globalbioticinteractions.nomer.match.ResourceServiceContentBased - caching [https://query.wikidata.org/sparql?format=json&query=PREFIX%20rdfs:%20%3Chttp://www.w3.org/2000/01/rdf-schema%23%3E%0APREFIX%20bd:%20%3Chttp://www.bigdata.com/rdf%23%3E%0APREFIX%20wd:%20%3Chttp://www.wikidata.org/entity/%3E%0APREFIX%20wikibase:%20%3Chttp://wikiba.se/ontology%23%3E%0APREFIX%20wdt:%20%3Chttp://www.wikidata.org/prop/direct/%3E%0ASELECT%20?i%20?l%20WHERE%20%7B%0A%20%20?i%20wdt:P31%20wd:Q427626.%0A%20%20?i%20rdfs:label%20?l%0A%7D] at [/home/jhpoelen/.cache/nomer/tmp/nomer644129068998147992.gz] done.
[main] INFO org.globalbioticinteractions.nomer.match.ResourceServiceReadOnly - using cached [https://query.wikidata.org/sparql?format=json&query=PREFIX%20rdfs:%20%3Chttp://www.w3.org/2000/01/rdf-schema%23%3E%0APREFIX%20bd:%20%3Chttp://www.bigdata.com/rdf%23%3E%0APREFIX%20wd:%20%3Chttp://www.wikidata.org/entity/%3E%0APREFIX%20wikibase:%20%3Chttp://wikiba.se/ontology%23%3E%0APREFIX%20wdt:%20%3Chttp://www.wikidata.org/prop/direct/%3E%0ASELECT%20?i%20?l%20WHERE%20%7B%0A%20%20?i%20wdt:P31%20wd:Q427626.%0A%20%20?i%20rdfs:label%20?l%0A%7D] at [/home/jhpoelen/.cache/nomer/hash/sha256/b959e969ddf4114bd590ec1cdcf7ec572076bd46e2e28e2fee038a3f6d41b9fd/ace0cedb0aa2a691e55c45bdc95dda068d4a8bb1b4086decc3f2803987984fd3.gz]
[main] INFO org.eol.globi.taxon.TaxonCacheService - local taxon cache of [file:/home/jhpoelen/.cache/nomer/wikidata_appended_taxon_ranks.tsv] building...
[main] INFO org.eol.globi.taxon.TaxonCacheService - cache with [107] items built in [0.0] s or [4115.4] items/s.
[main] INFO org.eol.globi.taxon.TaxonCacheService - local taxon cache of [file:/home/jhpoelen/.cache/nomer/wikidata_appended_taxon_ranks.tsv] built.
[main] INFO org.eol.globi.taxon.TaxonCacheService - local taxon map of [file:/home/jhpoelen/.cache/nomer/wikidata_appended_taxon_rank_links.tsv] building...
[main] INFO org.eol.globi.taxon.TaxonCacheService - cache with [4019] items built in [0.2] s or [24506.1] items/s.
[main] INFO org.eol.globi.taxon.TaxonCacheService - local taxon map of [file:/home/jhpoelen/.cache/nomer/wikidata_appended_taxon_rank_links.tsv] built.
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 4.8% of 840 kB at 1.47 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 10.3% of 840 kB at 1.69 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 15.9% of 840 kB at 1.84 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 21.4% of 840 kB at 2.41 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 26.9% of 840 kB at 2.90 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 32.6% of 840 kB at 2.88 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 38.3% of 840 kB at 3.27 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 43.1% of 840 kB at 3.61 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 48.8% of 840 kB at 4.00 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 54.5% of 840 kB at 4.34 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 60.2% of 840 kB at 4.29 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 65.9% of 840 kB at 4.58 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 71.6% of 840 kB at 4.89 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 77.3% of 840 kB at 5.16 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 83.0% of 840 kB at 5.45 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 88.7% of 840 kB at 5.68 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 94.4% of 840 kB at 5.96 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 100.0% of 840 kB at 6.17 MB/s ETA: < 1 minute
[https://zenodo.org/recor...df104d4ba88a54972f9f49e] 100.0% of 840 kB at 6.17 MB/s completed in < 1 minute
[main] INFO org.globalbioticinteractions.nomer.match.ResourceServiceContentBased - caching [https://query.wikidata.org/sparql?format=json&query=PREFIX%20rdfs:%20%3Chttp://www.w3.org/2000/01/rdf-schema%23%3E%0APREFIX%20bd:%20%3Chttp://www.bigdata.com/rdf%23%3E%0APREFIX%20wd:%20%3Chttp://www.wikidata.org/entity/%3E%0APREFIX%20wikibase:%20%3Chttp://wikiba.se/ontology%23%3E%0APREFIX%20wdt:%20%3Chttp://www.wikidata.org/prop/direct/%3E%0ASELECT%20?i%20?l%20WHERE%20%7B%0A%20%20?i%20wdt:P31%20wd:Q427626.%0A%20%20?i%20rdfs:label%20?l%0A%7D] at [/home/jhpoelen/.cache/nomer/tmp/nomer708058498259317527.gz] done.
java.lang.RuntimeException: failed to create matcher
at org.globalbioticinteractions.nomer.match.TermMatcherFactoryTaxonRanks.createTermMatcher(TermMatcherFactoryTaxonRanks.java:68)
at org.globalbioticinteractions.nomer.match.TermMatcherRegistry.termMatcherFor(TermMatcherRegistry.java:180)
at org.globalbioticinteractions.nomer.match.MatchUtil.lambda$resolveMatcher$0(MatchUtil.java:58)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1361)
at java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:126)
at java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:499)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:486)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.FindOps$FindOp.evaluateSequential(FindOps.java:152)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.findFirst(ReferencePipeline.java:531)
at org.globalbioticinteractions.nomer.match.MatchUtil.resolveMatcher(MatchUtil.java:63)
at org.globalbioticinteractions.nomer.match.MatchUtil.getTermMatcher(MatchUtil.java:50)
at org.globalbioticinteractions.nomer.cmd.CmdReplace.run(CmdReplace.java:21)
at picocli.CommandLine.executeUserObject(CommandLine.java:1939)
at picocli.CommandLine.access$1300(CommandLine.java:145)
at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2352)
at picocli.CommandLine$RunLast.handle(CommandLine.java:2314)
at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
at picocli.CommandLine$RunLast.execute(CommandLine.java:2316)
at picocli.CommandLine.execute(CommandLine.java:2078)
at org.globalbioticinteractions.nomer.Nomer.run(Nomer.java:57)
at org.globalbioticinteractions.nomer.Nomer.main(Nomer.java:46)
Caused by: java.io.IOException: failed to access [https://query.wikidata.org/sparql?format=json&query=PREFIX%20rdfs:%20%3Chttp://www.w3.org/2000/01/rdf-schema%23%3E%0APREFIX%20bd:%20%3Chttp://www.bigdata.com/rdf%23%3E%0APREFIX%20wd:%20%3Chttp://www.wikidata.org/entity/%3E%0APREFIX%20wikibase:%20%3Chttp://wikiba.se/ontology%23%3E%0APREFIX%20wdt:%20%3Chttp://www.wikidata.org/prop/direct/%3E%0ASELECT%20?i%20?l%20WHERE%20%7B%0A%20%20?i%20wdt:P31%20wd:Q427626.%0A%20%20?i%20rdfs:label%20?l%0A%7D] in preston verse [
at org.globalbioticinteractions.nomer.match.ResourceServiceContentBased.retrieve(ResourceServiceContentBased.java:81)
at org.globalbioticinteractions.nomer.match.ResourceServiceFactoryImpl$1.retrieve(ResourceServiceFactoryImpl.java:37)
at org.globalbioticinteractions.nomer.match.TermMatcherContextCaching.retrieve(TermMatcherContextCaching.java:19)
at org.globalbioticinteractions.nomer.match.WikidataTaxonRankLoader.importTaxonRanks(WikidataTaxonRankLoader.java:47)
at org.globalbioticinteractions.nomer.match.TermMatcherFactoryTaxonRanks.createTermMatcher(TermMatcherFactoryTaxonRanks.java:54)
... 24 more
Caused by: org.apache.commons.io.FileExistsException: File element in parameter 'destFile' already exists: '/home/jhpoelen/.cache/nomer/hash/sha256/b959e969ddf4114bd590ec1cdcf7ec572076bd46e2e28e2fee038a3f6d41b9fd/ace0cedb0aa2a691e55c45bdc95dda068d4a8bb1b4086decc3f2803987984fd3.gz'
at org.apache.commons.io.FileUtils.requireAbsent(FileUtils.java:2688)
at org.apache.commons.io.FileUtils.moveFile(FileUtils.java:2398)
at org.apache.commons.io.FileUtils.moveFile(FileUtils.java:2376)
at org.globalbioticinteractions.nomer.match.ResourceServiceContentBased.retrieve(ResourceServiceContentBased.java:79)
... 28 more
make: *** [Makefile:112: target/taxonCache.tsv.gz] Error 1
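The suspected root cause is that two instances of nomer try to build the same index at the same time, then trip over each other when storing an offline copy of the index source data. Both download the same Wikidata query result to their own temporary file, and the second one fails when it tries to move that file to the sha256-based cache path that the first one already created (see the FileExistsException thrown from FileUtils.moveFile in the trace above).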
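One possible mitigation, sketched below, is to treat an already existing destination as success: the cache path is derived from the content hash, so a file already at that path presumably holds the same bytes. This is only an illustrative sketch under that assumption, not nomer's actual code or fix. The class name `ContentCacheSketch` and the helper `moveIntoCache` are hypothetical, and it uses `java.nio.file.Files.move` in place of the commons-io `FileUtils.moveFile` call seen in the trace.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not part of nomer.
final class ContentCacheSketch {

    // Moves a freshly downloaded temp file to its content-addressed location.
    // Returns true if this call stored the file, false if another process
    // had already stored it (in which case the temp file is discarded).
    static boolean moveIntoCache(Path tmpFile, Path destFile) throws IOException {
        Files.createDirectories(destFile.getParent());
        try {
            // Without REPLACE_EXISTING, the move fails if destFile exists.
            Files.move(tmpFile, destFile);
            return true;
        } catch (FileAlreadyExistsException e) {
            // Assumption: same hash-based path implies same content,
            // so losing the race is harmless. Clean up our temp copy.
            Files.deleteIfExists(tmpFile);
            return false;
        }
    }
}
```

With this behavior, the second instance would fall through to reading the cached copy instead of aborting the build. The existence check and the move are not a single atomic step, so a very narrow race could still overwrite the destination, which should be harmless when both files carry identical content.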