Open enridaga opened 3 years ago
Another similar case, when scraping a web page:
[main] ERROR com.github.spiceh2020.sparql.anything.engine.FacadeXOpExecutor - An error occurred
java.io.IOException: java.net.URISyntaxException: Illegal character in fragment at index 29: http://www.w3.org/1999/xhtml#"
at com.github.spiceh2020.sparql.anything.html.HTMLTriplifier.triplify(HTMLTriplifier.java:108)
at com.github.spiceh2020.sparql.anything.engine.FacadeXOpExecutor.triplify(FacadeXOpExecutor.java:265)
at com.github.spiceh2020.sparql.anything.engine.FacadeXOpExecutor.getDatasetGraph(FacadeXOpExecutor.java:138)
at com.github.spiceh2020.sparql.anything.engine.FacadeXOpExecutor.execute(FacadeXOpExecutor.java:170)
at org.apache.jena.sparql.engine.main.ExecutionDispatch.visit(ExecutionDispatch.java:211)
at org.apache.jena.sparql.algebra.op.OpService.visit(OpService.java:56)
at org.apache.jena.sparql.engine.main.ExecutionDispatch.exec(ExecutionDispatch.java:46)
at org.apache.jena.sparql.engine.main.OpExecutor.exec(OpExecutor.java:118)
at org.apache.jena.sparql.engine.main.OpExecutor.execute(OpExecutor.java:89)
at org.apache.jena.sparql.engine.main.QC.execute(QC.java:52)
at com.github.spiceh2020.sparql.anything.engine.FacadeXOpExecutor$1.nextStage(FacadeXOpExecutor.java:210)
at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.makeNextStage(QueryIterRepeatApply.java:108)
at org.apache.jena.sparql.engine.iterator.QueryIterRepeatApply.hasNextBinding(QueryIterRepeatApply.java:65)
at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
at org.apache.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:38)
at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
at org.apache.jena.sparql.engine.iterator.QueryIteratorWrapper.hasNextBinding(QueryIteratorWrapper.java:38)
at org.apache.jena.sparql.engine.iterator.QueryIteratorBase.hasNext(QueryIteratorBase.java:114)
at org.apache.jena.atlas.iterator.Iter$2.hasNext(Iter.java:347)
at org.apache.jena.ext.com.google.common.collect.Iterators$ConcatenatedIterator.getTopMetaIterator(Iterators.java:1312)
at org.apache.jena.ext.com.google.common.collect.Iterators$ConcatenatedIterator.hasNext(Iterators.java:1328)
at org.apache.jena.sparql.engine.QueryExecutionBase.execConstruct(QueryExecutionBase.java:219)
at org.apache.jena.sparql.engine.QueryExecutionBase.execConstruct(QueryExecutionBase.java:207)
at com.github.spiceh2020.sparql.anything.cli.SPARQLAnything.executeQuery(SPARQLAnything.java:131)
at com.github.spiceh2020.sparql.anything.cli.SPARQLAnything.main(SPARQLAnything.java:542)
Caused by: java.net.URISyntaxException: Illegal character in fragment at index 29: http://www.w3.org/1999/xhtml#"
at java.base/java.net.URI$Parser.fail(URI.java:2938)
at java.base/java.net.URI$Parser.checkChars(URI.java:3109)
at java.base/java.net.URI$Parser.parse(URI.java:3153)
at java.base/java.net.URI.<init>(URI.java:623)
at com.github.spiceh2020.sparql.anything.html.HTMLTriplifier.populate(HTMLTriplifier.java:139)
at com.github.spiceh2020.sparql.anything.html.HTMLTriplifier.populate(HTMLTriplifier.java:151)
at com.github.spiceh2020.sparql.anything.html.HTMLTriplifier.populate(HTMLTriplifier.java:151)
at com.github.spiceh2020.sparql.anything.html.HTMLTriplifier.populate(HTMLTriplifier.java:151)
at com.github.spiceh2020.sparql.anything.html.HTMLTriplifier.populate(HTMLTriplifier.java:151)
at com.github.spiceh2020.sparql.anything.html.HTMLTriplifier.populate(HTMLTriplifier.java:151)
at com.github.spiceh2020.sparql.anything.html.HTMLTriplifier.populate(HTMLTriplifier.java:151)
at com.github.spiceh2020.sparql.anything.html.HTMLTriplifier.populate(HTMLTriplifier.java:151)
at com.github.spiceh2020.sparql.anything.html.HTMLTriplifier.triplify(HTMLTriplifier.java:106)
... 24 more
Maybe we should implement a strategy in which the resulting IRI is validated first and, if an exception occurs, the string is URL-encoded in some way. However, this should already be happening in the Triplifier, so the problem may be limited to the HTML Triplifier using its own URI-building code.
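A minimal sketch of that validate-then-encode strategy, to make the idea concrete. This is not the project's actual code: the class `SafeIri` and the method `toUri` are hypothetical names, and splitting the IRI into a namespace and a local name is an assumption about how the HTML Triplifier builds its URIs. It percent-encodes only the local part, and only when the plain concatenation fails to parse. It assumes Java 10 or later for `URLEncoder.encode(String, Charset)`.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of SPARQL Anything.
final class SafeIri {
    private SafeIri() {}

    /**
     * Builds a URI from a namespace and a local name.
     * The plain concatenation is tried first; if it is not a valid URI,
     * the local name is percent-encoded and the result is parsed again.
     */
    static URI toUri(String namespace, String localName) {
        try {
            // First attempt: use the string as-is.
            return new URI(namespace + localName);
        } catch (URISyntaxException e) {
            // Fallback: encode the local name. URLEncoder targets form data,
            // so it turns spaces into '+'; rewrite those as %20 for use in a URI.
            // A literal '+' in the input has already become %2B at this point.
            String encoded = URLEncoder
                    .encode(localName, StandardCharsets.UTF_8)
                    .replace("+", "%20");
            // URI.create throws IllegalArgumentException if this still fails,
            // for example when the namespace itself is malformed.
            return URI.create(namespace + encoded);
        }
    }
}
```

With the input from the trace above, `SafeIri.toUri("http://www.w3.org/1999/xhtml#", "\"")` would take the fallback path and encode the double quote as `%22` instead of throwing. Local names that are already valid are left untouched, so existing IRIs would not change.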
Maybe this is no longer the case? @enridaga, do you remember which webpage raised the error?
An exception occurs with details: