jhpoelen closed this issue 1 year ago
The root cause was the inclusion of an older version of commons-codec instead of the desired version, commons-codec:commons-codec:1.15.
After applying the fix, the following (expected) output was generated:
$ echo -e "\tHomo sapiens" | nomer append plazi
[main] INFO org.globalbioticinteractions.nomer.match.PlaziService - Indexing Plazi treatments ...
[main] INFO org.globalbioticinteractions.nomer.match.ResourceServiceContentBased - using local Preston data dir: [/home/jorrit/.cache/nomer/data]
[main] INFO org.globalbioticinteractions.nomer.match.ResourceServiceContentBased - caching [https://github.com/plazi/treatments-rdf/archive/master.zip] at [/home/jorrit/.cache/nomer/tmp/nomer4573412192819626404.gz]...
[https://zenodo.org/recor...6c90381190f324c1ac143f2] 100.0% of 11 kB at 1.40 MB/s completed in < 1 minute
[https://zenodo.org/recor...000c74e4e8fdeb937e29b1d] 100.0% of 34 kB at 8.31 MB/s completed in < 1 minute
[https://zenodo.org/recor...693d16efda23865d6cbf303] 100.0% of 767 MB at 3.76 MB/s completed in 3 minute(s)
[main] INFO org.globalbioticinteractions.nomer.match.ResourceServiceContentBased - caching [https://github.com/plazi/treatments-rdf/archive/master.zip] at [/home/jorrit/.cache/nomer/tmp/nomer4573412192819626404.gz] done.
[main] INFO org.globalbioticinteractions.nomer.match.ResourceServiceReadOnly - using cached [https://github.com/plazi/treatments-rdf/archive/master.zip] at [/home/jorrit/.cache/nomer/hash/sha256/b3742bf43d9da0a8ed5522659199f47d68d31aaf46c90381190f324c1ac143f2/b176164ee1afeaab4d30171fea98c6f9aa2dc6dbbfdcbeab740f19b260e292ed.gz]
[main] WARN org.apache.jena.riot - [line: 58, col: 1 ] Bad IRI: <http://taxon-concept.plazi.org/id/Animalia/Tatargina_picta_Walker_[1865] 1864> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 213, col: 30] Bad IRI: <http://taxon-concept.plazi.org/id/Animalia/Tatargina_picta_Walker_[1865] 1864> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 24, col: 1 ] Bad IRI: <http://taxon-concept.plazi.org/id/Animalia/Aphyocharacinae]_Eigenmann_1909> Code: 0/ILLEGAL_CHARACTER in PATH: The character violates the grammar rules for URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 25, col: 22] Bad IRI: <http://taxon-name.plazi.org/id/Animalia/Aphyocharacinae]> Code: 0/ILLEGAL_CHARACTER in PATH: The character violates the grammar rules for URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 180, col: 1 ] Bad IRI: <http://taxon-name.plazi.org/id/Animalia/Aphyocharacinae]> Code: 0/ILLEGAL_CHARACTER in PATH: The character violates the grammar rules for URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 320, col: 16] Bad IRI: <http://taxon-concept.plazi.org/id/Animalia/Aphyocharacinae]_Eigenmann_1909> Code: 0/ILLEGAL_CHARACTER in PATH: The character violates the grammar rules for URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 323, col: 23] Bad IRI: <http://taxon-name.plazi.org/id/Animalia/[unassigned]_Caenogastropoda> Code: 0/ILLEGAL_CHARACTER in PATH: The character violates the grammar rules for URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 332, col: 1 ] Bad IRI: <http://taxon-name.plazi.org/id/Animalia/[unassigned]_Caenogastropoda> Code: 0/ILLEGAL_CHARACTER in PATH: The character violates the grammar rules for URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 107, col: 245] Bad IRI: <http://treatment.plazi.org/id/03D08794FFD1FFEBECE2968258F6FF38/INDEX19, SMF 358984> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 107, col: 341] Bad IRI: <http://treatment.plazi.org/id/03D08794FFD1FFEBECE2968258F6FF38/INDEX19, SMF 358985> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 107, col: 437] Bad IRI: <http://treatment.plazi.org/id/03D08794FFD1FFEBECE2968258F6FF38/INDEX19, SMF 358987> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 107, col: 533] Bad IRI: <http://treatment.plazi.org/id/03D08794FFD1FFEBECE2968258F6FF38/INDEX19, SMF 358988> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 107, col: 629] Bad IRI: <http://treatment.plazi.org/id/03D08794FFD1FFEBECE2968258F6FF38/SMF 358986> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 134, col: 1 ] Bad IRI: <http://treatment.plazi.org/id/03D08794FFD1FFEBECE2968258F6FF38/INDEX19, SMF 358984> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 142, col: 1 ] Bad IRI: <http://treatment.plazi.org/id/03D08794FFD1FFEBECE2968258F6FF38/INDEX19, SMF 358985> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 145, col: 25] Lexical form '' not valid for datatype XSD decimal
[main] WARN org.apache.jena.riot - [line: 146, col: 26] Lexical form '' not valid for datatype XSD decimal
[main] WARN org.apache.jena.riot - [line: 150, col: 1 ] Bad IRI: <http://treatment.plazi.org/id/03D08794FFD1FFEBECE2968258F6FF38/INDEX19, SMF 358987> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 158, col: 1 ] Bad IRI: <http://treatment.plazi.org/id/03D08794FFD1FFEBECE2968258F6FF38/INDEX19, SMF 358988> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 166, col: 1 ] Bad IRI: <http://treatment.plazi.org/id/03D08794FFD1FFEBECE2968258F6FF38/SMF 358986> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 169, col: 25] Lexical form '' not valid for datatype XSD decimal
[main] WARN org.apache.jena.riot - [line: 170, col: 26] Lexical form '' not valid for datatype XSD decimal
[main] WARN org.apache.jena.riot - [line: 36, col: 1 ] Bad IRI: <http://taxon-concept.plazi.org/id/Animalia/Indolestes_sp_"o"_Fraser_1922> Code: 4/UNWISE_CHARACTER in PATH: The character matches no grammar rules of URIs/IRIs. These characters are permitted in RDF URI References, XML system identifiers, and XML Schema anyURIs.
[main] WARN org.apache.jena.riot - [line: 37, col: 22] Bad IRI: <http://taxon-name.plazi.org/id/Animalia/Indolestes_sp_"o"> Code: 4/UNWISE_CHARACTER in PATH: The character matches no grammar rules of URIs/IRIs. These characters are permitted in RDF URI References, XML system identifiers, and XML Schema anyURIs.
[main] WARN org.apache.jena.riot - [line: 84, col: 1 ] Bad IRI: <http://taxon-name.plazi.org/id/Animalia/Indolestes_sp_"o"> Code: 4/UNWISE_CHARACTER in PATH: The character matches no grammar rules of URIs/IRIs. These characters are permitted in RDF URI References, XML system identifiers, and XML Schema anyURIs.
[main] WARN org.apache.jena.riot - [line: 125, col: 20] Bad IRI: <http://taxon-concept.plazi.org/id/Animalia/Indolestes_sp_"o"_Fraser_1922> Code: 4/UNWISE_CHARACTER in PATH: The character matches no grammar rules of URIs/IRIs. These characters are permitted in RDF URI References, XML system identifiers, and XML Schema anyURIs.
[main] WARN org.apache.jena.riot - [line: 115, col: 23] Bad IRI: <http://treatment.plazi.org/id/03D2AB06FFCE5139F8EC2395DCE9E30A/MHNC 13906> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 115, col: 105] Bad IRI: <http://treatment.plazi.org/id/03D2AB06FFCE5139F8EC2395DCE9E30A/MHNC 13947, MHNC 8270, MHNC 13933, MHNC 13935> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 118, col: 1 ] Bad IRI: <http://treatment.plazi.org/id/03D2AB06FFCE5139F8EC2395DCE9E30A/MHNC 13906> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 121, col: 25] Lexical form '' not valid for datatype XSD decimal
[main] WARN org.apache.jena.riot - [line: 122, col: 26] Lexical form '' not valid for datatype XSD decimal
[main] WARN org.apache.jena.riot - [line: 126, col: 1 ] Bad IRI: <http://treatment.plazi.org/id/03D2AB06FFCE5139F8EC2395DCE9E30A/MHNC 13947, MHNC 8270, MHNC 13933, MHNC 13935> Spaces are not legal in URIs/IRIs.
[main] WARN org.apache.jena.riot - [line: 129, col: 25] Lexical form '' not valid for datatype XSD decimal
[main] WARN org.apache.jena.riot - [line: 130, col: 26] Lexical form '' not valid for datatype XSD decimal
[main] WARN org.apache.jena.riot - [line: 132, col: 25] Lexical form '28.6160000°' not valid for datatype XSD decimal
[main] WARN org.apache.jena.riot - [line: 133, col: 26] Lexical form '032.2931667°' not valid for datatype XSD decimal
[main] WARN org.apache.jena.riot - [line: 135, col: 25] Lexical form '−21.671' not valid for datatype XSD decimal
[main] WARN org.apache.jena.riot - [line: 149, col: 25] Lexical form '−12.068' not valid for datatype XSD decimal
[main] WARN org.apache.jena.riot - [line: 159, col: 25] Lexical form '−29.550' not valid for datatype XSD decimal
[main] WARN org.apache.jena.riot - [line: 293, col: 25] Lexical form '−21,668' not valid for datatype XSD decimal
[main] WARN org.apache.jena.riot - [line: 294, col: 26] Lexical form '34,847' not valid for datatype XSD decimal
[main] INFO org.globalbioticinteractions.nomer.match.PlaziService - cache with [1451398] items built in [1321.4] s or [1098.4] items/s.
[main] INFO org.globalbioticinteractions.nomer.match.PlaziService - Indexing Plazi treatments complete.
Homo sapiens SAME_AS http://taxon-concept.plazi.org/id/Animalia/Homo_sapiens_Linnaeus_1758 Homo sapiens species Animalia | Chordata | Mammalia | Primates | Hominidae | Homo | Homo sapiens kingdom | phylum | class | order | family | genus | species http://taxon-concept.plazi.org/id/Animalia/Homo_sapiens_Linnaeus_1758
Homo sapiens SAME_AS http://treatment.plazi.org/id/34AC185C73C41EA124EEED97C898FBC0 http://treatment.plazi.org/id/34AC185C73C41EA124EEED97C898FBC0 http://treatment.plazi.org/id/34AC185C73C41EA124EEED97C898FBC0 http://treatment.plazi.org/id/34AC185C73C41EA124EEED97C898FBC0
Homo sapiens SAME_AS doi:10.5962/bhl.title.542 doi:10.5962/bhl.title.542 doi:10.5962/bhl.title.542 https://doi.org/10.5962/bhl.title.542
Issues https://github.com/plazi/community/issues/182 and https://github.com/plazi/treatments-rdf/issues/8 were uncovered during repair of Nomer's Plazi treatment indexer.
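For reference, the "Lexical form ... not valid for datatype XSD decimal" warnings in the log above involve values such as '' (empty), '28.6160000°' (trailing degree sign), '−21.671' (which appears to use a Unicode minus sign rather than an ASCII hyphen) and '34,847' (decimal comma). Below is a minimal Python sketch of why such values are rejected, assuming the XSD decimal lexical grammar: an optional ASCII sign, ASCII digits, and an optional '.' fraction. It is not how Jena or Nomer perform the check, and the function name `is_xsd_decimal` is hypothetical.

```python
import re

# Assumed XSD decimal lexical form: optional ASCII sign, then either
# digits with an optional fractional part, or a leading '.' with digits.
_XSD_DECIMAL = re.compile(r"[+-]?(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)")


def is_xsd_decimal(lexical: str) -> bool:
    """Return True if the string matches the assumed XSD decimal lexical form."""
    return _XSD_DECIMAL.fullmatch(lexical) is not None


if __name__ == "__main__":
    samples = [
        "",               # empty literal
        "28.6160000\u00b0",  # trailing degree sign
        "\u221221.671",   # U+2212 MINUS SIGN instead of ASCII '-'
        "34,847",         # decimal comma
        "-21.671",        # ASCII hyphen-minus and decimal point
    ]
    for value in samples:
        print(repr(value), is_xsd_decimal(value))
```

Only the last sample passes; the others fail for the reasons given in the comments, which matches the kinds of values flagged in the warnings.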
Issue addressed in https://github.com/globalbioticinteractions/nomer/releases/tag/0.4.4
when running:
the following exception is seen: