apache / jena

Apache Jena
https://jena.apache.org/
Apache License 2.0

ResultSetException: Datatype is rdf:langString but no language given #2555

Open averza opened 6 days ago

averza commented 6 days ago

Version

5.0.0

What happened?

I am running a query against the dbpedia.org SPARQL endpoint. It fails while iterating over the result set. The same query works fine in Jena 4.1.0. Perhaps the data is invalid on dbpedia.org, please investigate.

Endpoint: http://dbpedia.org/sparql
Query: SELECT DISTINCT * WHERE { <http://dbpedia.org/resource/OpenLink_Software> ?p ?o } LIMIT 100

Relevant output and stacktrace

Exception
=========
org.apache.jena.sparql.resultset.ResultSetException: Datatype is rdf:langString but no language given
    at org.apache.jena.riot.rowset.rw.rs_json.RowSetJSONStreaming.moveToNext(RowSetJSONStreaming.java:214)
    at org.apache.jena.riot.rowset.rw.rs_json.RowSetJSONStreaming.moveToNext(RowSetJSONStreaming.java:66)
    at org.apache.jena.atlas.iterator.IteratorSlotted.hasNext(IteratorSlotted.java:63)
    at org.apache.jena.riot.rowset.rw.rs_json.RowSetBuffered.nextFromDelegate(RowSetBuffered.java:165)
    at org.apache.jena.riot.rowset.rw.rs_json.RowSetBuffered.moveToNext(RowSetBuffered.java:152)
    at org.apache.jena.riot.rowset.rw.rs_json.RowSetBuffered.moveToNext(RowSetBuffered.java:43)
    at org.apache.jena.atlas.iterator.IteratorSlotted.hasNext(IteratorSlotted.java:63)
    at org.apache.jena.sparql.engine.ResultSetStream.hasNext(ResultSetStream.java:75)
    at MyTestProgram
Caused by: org.apache.jena.sparql.resultset.ResultSetException: Datatype is rdf:langString but no language given
    at org.apache.jena.riot.rowset.rw.rs_json.IteratorRsJSON.moveToNext(IteratorRsJSON.java:75)
    at org.apache.jena.atlas.iterator.IteratorSlotted.hasNext(IteratorSlotted.java:63)
    at org.apache.jena.riot.rowset.rw.rs_json.RowSetJSONStreaming.computeNextActual(RowSetJSONStreaming.java:225)
    at org.apache.jena.riot.rowset.rw.rs_json.RowSetJSONStreaming.moveToNext(RowSetJSONStreaming.java:210)
    ... 19 more
Caused by: org.apache.jena.shared.JenaException: Datatype is rdf:langString but no language given
    at org.apache.jena.graph.NodeFactory.createLiteral(NodeFactory.java:240)
    at org.apache.jena.graph.NodeFactory.createLiteral(NodeFactory.java:177)
    at org.apache.jena.sparql.util.NodeFactoryExtra.createLiteralNode(NodeFactoryExtra.java:100)
    at org.apache.jena.riot.rowset.rw.rs_json.RowSetJSONStreaming.parseOneTerm(RowSetJSONStreaming.java:404)
    at org.apache.jena.riot.rowset.rw.rs_json.RowSetJSONStreaming.parseBinding(RowSetJSONStreaming.java:358)
    at org.apache.jena.riot.rowset.rw.rs_json.RowSetJSONStreaming$RsJsonEltEncoderDft.newBindingElt(RowSetJSONStreaming.java:571)
    at org.apache.jena.riot.rowset.rw.rs_json.RowSetJSONStreaming$RsJsonEltEncoderDft.newBindingElt(RowSetJSONStreaming.java:541)
    at org.apache.jena.riot.rowset.rw.rs_json.IteratorRsJSON.computeNextActual(IteratorRsJSON.java:136)
    at org.apache.jena.riot.rowset.rw.rs_json.IteratorRsJSON.moveToNext(IteratorRsJSON.java:72)
    ... 22 more

Are you interested in making a pull request?

None

namedgraph commented 6 days ago

I tried this

curl --url-query 'query=SELECT DISTINCT * WHERE { <http://dbpedia.org/resource/OpenLink_Software> ?p ?o } LIMIT 100' 'http://dbpedia.org/sparql' -L | grep langString

which gives

   <binding name="o"><literal datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#langString">~50</literal></binding>

This looks suspect, since the rdf:langString datatype is normally implied by @xml:lang rather than written out as a datatype.
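The same check can be done in code instead of with grep. The following is a minimal sketch using only the JDK's XML parser: it lists literals in a SPARQL XML results document that declare rdf:langString but carry no xml:lang. The class and method names (`LangStringScan`, `literalsMissingLang`) are hypothetical and not part of Jena, and the sketch assumes the response declares the standard SPARQL results namespace.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper for diagnosing a response; not part of Jena.
class LangStringScan {
    static final String RDF_LANG_STRING =
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString";
    // Assumption: the server uses the standard SPARQL XML results namespace.
    static final String SPARQL_RESULTS_NS = "http://www.w3.org/2005/sparql-results#";
    static final String XML_NS = "http://www.w3.org/XML/1998/namespace";

    /** Returns the lexical forms of literals typed rdf:langString that have no xml:lang. */
    static List<String> literalsMissingLang(String resultsXml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            // The input comes from a remote server, so refuse DOCTYPE declarations.
            dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(resultsXml)));

            NodeList literals = doc.getElementsByTagNameNS(SPARQL_RESULTS_NS, "literal");
            List<String> offending = new ArrayList<>();
            for (int i = 0; i < literals.getLength(); i++) {
                Element lit = (Element) literals.item(i);
                boolean isLangString = RDF_LANG_STRING.equals(lit.getAttribute("datatype"));
                // getAttributeNS returns "" when the attribute is absent.
                boolean hasLang = !lit.getAttributeNS(XML_NS, "lang").isEmpty();
                if (isLangString && !hasLang) {
                    offending.add(lit.getTextContent());
                }
            }
            return offending;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse SPARQL XML results", e);
        }
    }
}
```

Passing the body of the curl response above to `LangStringScan.literalsMissingLang(...)` should return the lexical forms of the suspect literals, which makes it easier to attach concrete examples to a report.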


afs commented 6 days ago

Perhaps the data is invalid on dbpedia.org, please investigate.

Hi @averza -- could you report this issue to OpenLink please?

Even if the data is wrong, the server should not be sending results that do not conform to the expectations of the results formats. (The sparql-results+json output is also wrong.)

If Jena put in a lang tag, it would be altering the data in a way the application cannot detect, so this is not a Jena issue.

The actual status of "foo"^^rdf:langString is dubious. See the current working group's issue: https://github.com/w3c/rdf-turtle/issues/37

Jena can have a fix to cope, although that moves the problem of the expectation onto the application.

afs commented 6 days ago

The Jena 4 -> Jena 5 change is in NodeFactory.createLiteral (NodeFactory.java:240), so it affects all result formats, not just JSON.
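To illustrate why a check at literal construction affects every format, here is a plain-Java model of the rule. It is a sketch only and is not Jena's NodeFactory code: the class `LiteralModel` and its `create` method are hypothetical, and only the error message is taken from the stack trace above. Any reader (JSON, XML, or other) that builds its literals through a factory like this fails the same way on a language-less rdf:langString.

```java
import java.util.Objects;

// Plain-Java model of the rule; NOT Jena's implementation.
final class LiteralModel {
    static final String RDF_LANG_STRING =
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString";
    static final String XSD_STRING = "http://www.w3.org/2001/XMLSchema#string";

    final String lexical;
    final String lang;      // "" when there is no language tag
    final String datatype;  // always a datatype IRI

    private LiteralModel(String lexical, String lang, String datatype) {
        this.lexical = lexical;
        this.lang = lang;
        this.datatype = datatype;
    }

    /** Builds a literal, enforcing "rdf:langString if and only if a language tag is present". */
    static LiteralModel create(String lexical, String lang, String datatypeIri) {
        Objects.requireNonNull(lexical, "lexical");
        boolean hasLang = lang != null && !lang.isEmpty();
        if (hasLang) {
            // A language-tagged literal can only have the rdf:langString datatype.
            if (datatypeIri != null && !RDF_LANG_STRING.equals(datatypeIri)) {
                throw new IllegalArgumentException(
                        "Language tag given but datatype is not rdf:langString");
            }
            return new LiteralModel(lexical, lang, RDF_LANG_STRING);
        }
        if (RDF_LANG_STRING.equals(datatypeIri)) {
            // This is the case the DBpedia response triggers.
            throw new IllegalArgumentException(
                    "Datatype is rdf:langString but no language given");
        }
        // No language and no datatype means a plain xsd:string literal.
        return new LiteralModel(lexical, "", datatypeIri == null ? XSD_STRING : datatypeIri);
    }
}
```

In this model, `LiteralModel.create("~50", null, LiteralModel.RDF_LANG_STRING)` throws, while the same lexical form with a language tag or with no datatype is accepted.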