globalbioticinteractions / nomer

maps identifiers and names to other identifiers and names
GNU General Public License v3.0

unexpected error on matching Homo sapiens against GloBI taxon graph #125

Closed: jhpoelen closed this issue 1 year ago

jhpoelen commented 1 year ago
$ echo -e "\tHomo sapiens" | nomer append globi
[main] INFO org.eol.globi.taxon.TaxonCacheService - local taxon cache of [https://zenodo.org/record/6394935/files/taxonCache.tsv.gz] building...
java.lang.IllegalStateException: failed to instantiate taxonCache: [failed to open resource [https://zenodo.org/record/6394935/files/taxonCache.tsv.gz]]
    at org.eol.globi.taxon.TaxonCacheService.initTaxonCache(TaxonCacheService.java:226)
    at org.eol.globi.taxon.TaxonCacheService.init(TaxonCacheService.java:146)
    at org.eol.globi.taxon.TaxonCacheService.lazyInit(TaxonCacheService.java:140)
    at org.eol.globi.taxon.TaxonCacheService.match(TaxonCacheService.java:250)
    at org.eol.globi.service.TermMatcherHierarchical.match(TermMatcherHierarchical.java:57)
    at org.globalbioticinteractions.nomer.util.AppendingRowHandler.onRow(AppendingRowHandler.java:36)
    at org.globalbioticinteractions.nomer.match.MatchUtil.apply(MatchUtil.java:85)
    at org.globalbioticinteractions.nomer.match.MatchUtil.match(MatchUtil.java:37)
    at org.globalbioticinteractions.nomer.cmd.CmdAppend.run(CmdAppend.java:20)
    at picocli.CommandLine.executeUserObject(CommandLine.java:1939)
    at picocli.CommandLine.access$1300(CommandLine.java:145)
    at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2358)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2352)
    at picocli.CommandLine$RunLast.handle(CommandLine.java:2314)
    at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
    at picocli.CommandLine$RunLast.execute(CommandLine.java:2316)
    at picocli.CommandLine.execute(CommandLine.java:2078)
    at org.globalbioticinteractions.nomer.Nomer.run(Nomer.java:57)
    at org.globalbioticinteractions.nomer.Nomer.main(Nomer.java:46)
Caused by: java.io.IOException: failed to open resource [https://zenodo.org/record/6394935/files/taxonCache.tsv.gz]
    at org.eol.globi.util.ResourceServiceDataDir.retrieve(ResourceServiceDataDir.java:19)
    at org.eol.globi.util.ResourceServiceClasspathResource.retrieve(ResourceServiceClasspathResource.java:32)
    at org.eol.globi.util.ResourceServiceGzipAware.retrieve(ResourceServiceGzipAware.java:20)
    at org.eol.globi.util.ResourceServiceLocal.retrieve(ResourceServiceLocal.java:33)
    at org.eol.globi.service.CacheServiceUtil.createBufferedReader(CacheServiceUtil.java:15)
    at org.eol.globi.taxon.TaxonCacheService$3.<init>(TaxonCacheService.java:302)
    at org.eol.globi.taxon.TaxonCacheService.taxonCacheIterator(TaxonCacheService.java:301)
    at org.eol.globi.taxon.TaxonCacheService.initTaxonCache(TaxonCacheService.java:220)
    ... 18 more
WARNING: An illegal reflective access operation has occurred

jhpoelen commented 1 year ago

related to https://github.com/bio-guoda/preston/issues/202

jhpoelen commented 1 year ago

After upgrading to preston v0.5.1 and globi libs v0.24.6, the error no longer occurs. The command

$ echo -e "\tHomo sapiens" | nomer append globi

now yields:

    Homo sapiens    SAME_AS GBIF:2436436    Homo sapiens        species     Animalia | Chordata | Mammalia | Primates | Hominidae | Homo | Homo sapiens GBIF:1 | GBIF:44 | GBIF:359 | GBIF:798 | GBIF:5483 | GBIF:2436435 | GBIF:2436436    kingdom | phylum | class | order | family | genus | species     http://eol.org/pages/327955
    Homo sapiens    SAME_AS http://taxon-concept.plazi.org/id/Animalia/Homo_sapiens_Linnaeus_1758   Homo sapiens                Animalia | Chordata | Mammalia | Primates | Hominidae | Homo | Homo sapiens     kingdom | phylum | class | order | family | genus | species     http://taxon-concept.plazi.org/id/Animalia/Homo_sapiens_Linnaeus_1758
    Homo sapiens    SAME_AS http://treatment.plazi.org/id/34AC185C73C41EA124EEED97C898FBC0  http://treatment.plazi.org/id/34AC185C73C41EA124EEED97C898FBC0              http://treatment.plazi.org/id/34AC185C73C41EA124EEED97C898FBC0              http://treatment.plazi.org/id/34AC185C73C41EA124EEED97C898FBC0
    Homo sapiens    SAME_AS IRMNG:10857762  Homo sapiens        species     Animalia | Chordata | Mammalia | Primates | Hominidae | Homo | Homo sapiens IRMNG:11 | IRMNG:148 | IRMNG:1310 | IRMNG:11338 | IRMNG:104701 | IRMNG:1035772 | IRMNG:10857762 kingdom | phylum | class | order | family | genus | species     http://eol.org/pages/327955
    Homo sapiens    SAME_AS ITIS:180092 Homo sapiens        species     Animalia | Bilateria | Deuterostomia | Chordata | Vertebrata | Gnathostomata | Tetrapoda | Mammalia | Theria | Eutheria | Primates | Haplorrhini | Simiiformes | Hominoidea | Hominidae | Homininae | Homo | Homo sapiens   ITIS:202423 | ITIS:914154 | ITIS:914156 | ITIS:158852 | ITIS:331030 | ITIS:914179 | ITIS:914181 | ITIS:179913 | ITIS:179916 | ITIS:179925 | ITIS:180089 | ITIS:943773 | ITIS:943778 | ITIS:943782 | ITIS:180090 | ITIS:943805 | ITIS:180091 | ITIS:180092   kingdom | subkingdom | infrakingdom | phylum | subphylum | infraphylum | superclass | class | subclass | infraclass | order | suborder | infraorder | superfamily | family | subfamily | genus | species        http://eol.org/pages/327955
    Homo sapiens    SAME_AS WORMS:1455977   Homo sapiens        species     Biota | Animalia | Chordata | Vertebrata | Gnathostomata | Tetrapoda | Mammalia | Primates | Hominidae | Homo | Homo sapiens    WORMS:1 | WORMS:2 | WORMS:1821 | WORMS:146419 | WORMS:1828 | WORMS:1831 | WORMS:1837 | WORMS:1455974 | WORMS:1455975 | WORMS:1455976 | WORMS:1455977    null | kingdom | phylum | subphylum | infraphylum | superclass | class | order | family | genus | species       https://www.marinespecies.org/aphia.php?p=taxdetails&id=1455977
    Homo sapiens    SAME_AS doi:10.5281/zenodo.3917332  doi:10.5281/zenodo.3917332              doi:10.5281/zenodo.3917332              https://doi.org/10.5281/zenodo.3917332
    Homo sapiens    SAME_AS EOL:327955  Homo sapiens        species إنسان @ar | Insan @az | човешки @bg | মানবীয় @bn | Ljudsko biće @bs | Humà @ca | Muž @cs | Menneske @da | Mensch @de | ανθρώπινο ον @el | human @en | Humano @es | Gizakiaren @eu | Ihminen @fi | Homme @fr | Mutum @ha | אנושי @he | մարդու @hy | Umano @it | ヒト @ja | ადამიანის @ka | Homo @la | žmogaus @lt | Om @mo | Mens @nl | Òme @oc | Człowiek rozumny @pl | Om @ro | Человек разумный @ru | Qenie Njerëzore @sq | மனிதன் @ta | మానవుడు @te | Aadmi @ur | umuntu @zu |    Animalia | Chordata | Mammalia | Primates | Hominidae | Homo | Homo sapiens EOL:1 | EOL:694 | EOL:1642 | EOL:1645 | EOL:1653 | EOL:42268 | EOL:327955   kingdom | phylum | class | order | family | genus | species     http://eol.org/pages/327955
    Homo sapiens    SAME_AS WD:Q15978631    Homo sapiens        species     | Homo sapiens  WD:Q171283 | WD:Q15978631   null | species      https://www.wikidata.org/wiki/Q15978631
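
For anyone post-processing these rows: each match carries three pipe-delimited lineage columns (names, ids, ranks) that line up position by position. Below is a minimal Python sketch that zips them together. The helper name `parse_lineage` is hypothetical, and the column positions (8, 9, 10, zero-based) are an assumption inferred from the GBIF row above, not taken from nomer's documentation; the sample row is rebuilt by hand with explicit tabs because the paste above lost them.

```python
# Hypothetical helper: pair up the pipe-delimited lineage columns of one row.
# The default column positions are an assumption inferred from the sample
# output in this thread, not a documented nomer schema.

def parse_lineage(row, path_col=8, ids_col=9, ranks_col=10):
    """Return a list of (rank, name, taxon_id) tuples for one tab-separated row."""
    cols = row.rstrip("\n").split("\t")

    def split_path(value):
        # "Animalia | Chordata" -> ["Animalia", "Chordata"]
        return [part.strip() for part in value.split("|")]

    names = split_path(cols[path_col])
    ids = split_path(cols[ids_col])
    ranks = split_path(cols[ranks_col])
    # zip stops at the shortest list, so ragged rows are silently truncated.
    return list(zip(ranks, names, ids))


# Sample row rebuilt by hand from the GBIF line above, with explicit tabs.
sample = "\t".join([
    "",              # provided id (empty in the example input)
    "Homo sapiens",  # provided name
    "SAME_AS",       # match type
    "GBIF:2436436",  # resolved id
    "Homo sapiens",  # resolved name
    "",              # empty column (assumed position)
    "species",       # resolved rank
    "",              # common names (empty for this row)
    "Animalia | Chordata | Mammalia | Primates | Hominidae | Homo | Homo sapiens",
    "GBIF:1 | GBIF:44 | GBIF:359 | GBIF:798 | GBIF:5483 | GBIF:2436435 | GBIF:2436436",
    "kingdom | phylum | class | order | family | genus | species",
    "http://eol.org/pages/327955",
])

for rank, name, taxon_id in parse_lineage(sample):
    print(f"{rank}\t{name}\t{taxon_id}")
```

If the column layout differs in your nomer version, pass different `path_col`, `ids_col`, and `ranks_col` values rather than relying on the defaults.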