ruby-rdf / sparql-client

SPARQL client for Ruby.
http://rubygems.org/gems/sparql-client
The Unlicense

Different results between sparql-client and linkeddata gems #80

Closed workergnome closed 7 years ago

workergnome commented 7 years ago

tl;dr. summary:

Using the sparql-client gem on its own, queries whose results contain embedded quotes return different results than when the linkeddata gem is loaded.

Expected Results:

Actual Results:

In depth:

When I run the following code:

require 'linkeddata'
# require 'sparql/client'

sparql = SPARQL::Client.new("http://data.americanartcollaborative.org/sparql")
uri = RDF.URI("http://data.crystalbridges.org/object/2258")
label = RDF.URI("http://www.w3.org/2000/01/rdf-schema#label")

query = sparql.construct([uri, label, :o]).where([uri, label, :o])
query.each_statement {|s| puts s.inspect}

I get the results I expect:

#<RDF::Statement:0x3fdd71454f78(<http://data.crystalbridges.org/object/2258> <http://www.w3.org/2000/01/rdf-schema#label> "Bison-Dance of the Mandan Indians in front of their Medicine Lodge in Mih-Tuta-Hankush" .)>
#<RDF::Statement:0x3fdd71440a78(<http://data.crystalbridges.org/object/2258> <http://www.w3.org/2000/01/rdf-schema#label> "From \"Voyage dans l’intérieur de l’Amérique du Nord, executé pendant les années 1832, 1833 et 1834, par le prince Maximilien de Wied-Neuwied\" (Paris & Coblenz, 1839-1843)" .)>

but when I comment out the linkeddata gem and instead use just the sparql-client gem, the results with embedded quotes are truncated at the first embedded quote, and my results look like:

#<RDF::Statement:0x3fd7f58e2ee0(<http://data.crystalbridges.org/object/2258> <http://www.w3.org/2000/01/rdf-schema#label> "Bison-Dance of the Mandan Indians in front of their Medicine Lodge in Mih-Tuta-Hankush" .)>
#<RDF::Statement:0x3fd7f58df2a4(<http://data.crystalbridges.org/object/2258> <http://www.w3.org/2000/01/rdf-schema#label> "From " .)>

I'm using ruby 2.3.3p222 (2016-11-21 revision 56859) [x86_64-darwin16] with a Gemfile that includes the line gem "linkeddata", '~> 2.1.0'. The Gemfile.lock file says I'm using:

sparql-client (2.1.0)
linkeddata (2.1.0)

gkellogg commented 7 years ago

The difference you're seeing is that when you include "linkeddata", it causes the result to be read by the Turtle reader instead of the N-Triples reader. It looks like the N-Triples reader is not handling the embedded double-quote correctly. There are tests, but it looks like they hide this problem and will need to be improved.

In the meantime, please continue to use the Turtle reader, either with the "linkeddata" gem, or by requiring "rdf/turtle".
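A minimal sketch of the second workaround, assuming the rdf-turtle gem is installed alongside sparql-client. The `GEMS_LOADED` flag and the `print_labels` helper are hypothetical names used only for illustration; the query itself is the one from the original report.

```ruby
# Load the Turtle reader before the client so CONSTRUCT results can be
# parsed as Turtle instead of N-Triples. rdf-turtle is a separate gem.
begin
  require 'rdf/turtle'
  require 'sparql/client'
  GEMS_LOADED = true
rescue LoadError => e
  warn "Missing gem: #{e.message}"
  GEMS_LOADED = false
end

# Hypothetical helper: prints every rdfs:label statement for a subject.
def print_labels(endpoint, subject)
  uri    = RDF::URI(subject)
  label  = RDF::URI("http://www.w3.org/2000/01/rdf-schema#label")
  client = SPARQL::Client.new(endpoint)

  client.construct([uri, label, :o]).where([uri, label, :o]).each_statement do |s|
    puts s.inspect
  end
end
```

With both gems available, calling `print_labels("http://data.americanartcollaborative.org/sparql", "http://data.crystalbridges.org/object/2258")` should issue the same query as in the report, assuming the endpoint is still reachable.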

gkellogg commented 7 years ago

Closing, as bug is in RDF.rb: ruby-rdf/rdf#340.

workergnome commented 7 years ago

Thanks, @gkellogg, as always!

gkellogg commented 7 years ago

@workergnome It turns out that the problem was because UTF-8 data was sent with an ASCII encoding, and the code which attempts to correct this should not have been unescaping the string first, but simply forcing the encoding. I'm releasing RDF.rb 2.2.1 with the fix.
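To illustrate the principle behind that fix, here is a small sketch using only core Ruby. It is not the RDF.rb code; the sample line and variable names are made up for the example. It simulates UTF-8 bytes that arrive labeled as US-ASCII and shows that relabeling them with `String#force_encoding` leaves both the bytes and the escaped quotes untouched, so the parser can handle the escapes afterwards.

```ruby
# An N-Triples-style line containing escaped quotes (\") and non-ASCII text.
line = %q(<http://example.org/s> <http://example.org/p> "From \"Voyage dans l’intérieur\"" .)

# Simulate the server's mistake: same bytes, but tagged as US-ASCII.
mislabeled = line.dup.force_encoding(Encoding::US_ASCII)
puts mislabeled.valid_encoding?   # false: bytes above 127 are not valid US-ASCII

# Correct the label only. No bytes change and no escapes are interpreted.
fixed = mislabeled.dup.force_encoding(Encoding::UTF_8)
puts fixed.valid_encoding?        # true
puts fixed == line                # true
puts fixed.include?('\"')         # true: the escaped quotes are still in place
```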

workergnome commented 7 years ago

Thanks, and I'll let the people responsible for sending the data know, too.