openlink / virtuoso-opensource

Virtuoso is a high-performance and scalable Multi-Model RDBMS, Data Integration Middleware, Linked Data Deployment, and HTTP Application Server Platform
https://vos.openlinksw.com

Different SPARUL results via HTTP interface and Conductor #266

Open sebastianthelen opened 9 years ago

sebastianthelen commented 9 years ago

Hi,

we're running a quite complex SPARQL Insert/Delete query to harmonize our data with respect to owl:sameAs relations between resource identifiers. Basically we want to reorganize all triples so that they're attached to one unique URI instead of multiple semantically equivalent ones.
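
The intended effect of this harmonization can be illustrated on a toy in-memory graph. The sketch below is a deliberate simplification and not the query itself: it ignores RDF lists, OWL annotations and blank nodes. The `harmonize` function and the prefixed names in the example are hypothetical and serve only as an illustration.

```python
SAME_AS = "owl:sameAs"


def harmonize(triples):
    """Re-attach every triple to the canonical URI of its subject and object.

    Assumption for this sketch: in each (canonical, owl:sameAs, alias) triple
    the subject is the canonical identifier and the object is an alias.
    """
    # Map each alias to its canonical identifier.
    canonical = {o: s for s, p, o in triples if p == SAME_AS}

    result = set()
    for s, p, o in triples:
        if p == SAME_AS:
            # The sameAs links themselves are kept unchanged.
            result.add((s, p, o))
        else:
            # Replace aliases by their canonical identifier, if there is one.
            result.add((canonical.get(s, s), p, canonical.get(o, o)))
    return result


example = {
    ("cellar:1", SAME_AS, "celex:A"),
    ("cellar:1", SAME_AS, "oj:B"),
    ("celex:A", "cdm:year", "1997"),
    ("doc:X", "cdm:cites", "oj:B"),
}
harmonized = harmonize(example)
```

After the call, the `cdm:year` triple hangs off `cellar:1` and the `cdm:cites` triple points at `cellar:1`, while both `owl:sameAs` links are preserved.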

We noticed that results are incorrect when executing the query via the HTTP interface, i.e., curl. However, when performing the same query in the Conductor interface everything looks ok.

We're running:

Virtuoso Open Source Edition (Column Store) (multi threaded)
Version 7.1.1-dev.3211-pthreads as of Jul 21 2014
Compiled for Linux (x86_64-unknown-linux-gnu)
Copyright (C) 1998-2014 OpenLink Software

The query looks as follows:

PREFIX cdm: <http://publications.europa.eu/ontology/cdm#>
WITH <http://cdm/data2>
DELETE {
  ?z rdf:first ?head ; rdf:rest ?tail ; ?x ?y.
  ?w ?q ?id.
  ?id ?p ?value; ?p ?o.
  ?b owl:annotatedSource ?id; owl:annotatedProperty ?p; owl:annotatedTarget ?at1; owl:annotatedTarget ?at2; ?ann ?annotation.
}
INSERT {
  ?z rdf:first ?head ; rdf:rest ?tail ; ?x ?y.
  ?w ?q ?cellar_subject.
  ?cellar_subject ?p ?value; ?p ?cellar_object.
  ?b owl:annotatedSource ?cellar_subject; owl:annotatedProperty ?p; owl:annotatedTarget ?ct; owl:annotatedTarget ?at2; ?ann ?annotation.
}
WHERE {
  <http://publications.europa.eu/resource/cellar/6fe92ab0-c94c-48d5-9193-212f1e3a92dc> owl:sameAs ?id.
  ?id ^owl:sameAs ?cellar_subject.

  { ?id ?p ?o. ?o ^owl:sameAs ?cellar_object. filter ( ?p!=owl:sameAs). }
  UNION
  { ?id ?p ?value. filter isLiteral(?value) }
  UNION
  { ?id ?p ?value. FILTER not exists {?value ^owl:sameAs ?cellar_id} }

  OPTIONAL { ?w ?q ?id. filter (?q!=owl:sameAs). filter (!isBlank(?w)) }

  OPTIONAL {
    ?b owl:annotatedSource ?id; owl:annotatedProperty ?p; ?ann ?annotation.
    { ?b owl:annotatedTarget ?at1. ?at1 ^owl:sameAs ?ct1. }
    UNION
    { ?b owl:annotatedTarget ?at2. filter (isLiteral(?at2) || not exists {?at2 ^owl:sameAs ?ct2.}) }
    filter not exists { ?annotation ^owl:sameAs ?anyID }
  }

  OPTIONAL { ?value (rdf:rest|rdf:first)* ?z. ?z rdf:first ?head ; rdf:rest ?tail . ?z ?x ?y. }
}

Sample data can be obtained from: http://pastebin.com/DY6DYmXC

In order to reconstruct the issue perform the following steps:

Are you aware of any similar issues that might explain why those results differ?

Sebastian

sebastianthelen commented 9 years ago

Any ideas?

HughWilliams commented 9 years ago

What is the form of the curl query being executed? Please provide the complete curl command.

Also, through which Conductor interface did you successfully execute the query: the SPARQL UI, or the SQL UI (with the query prefixed by the "SPARQL" keyword)?

Have you tried executing the query using the /sparql-auth SPARQL endpoint?
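
For reference, a request to the /sparql-auth endpoint could be sketched as follows. This is a minimal sketch under several assumptions: the host and port (`localhost:8890`) are placeholders, the endpoint is assumed to use HTTP Digest authentication and to accept the statement in the `query` form field, and the helper names `build_update_request` and `build_digest_opener` are hypothetical. A strictly standards-following SPARQL 1.1 endpoint would expect updates in an `update` field instead.

```python
import urllib.parse
import urllib.request


def build_update_request(endpoint, sparql_text):
    """Build a form-encoded POST request carrying the SPARQL text."""
    body = urllib.parse.urlencode({"query": sparql_text}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_digest_opener(endpoint, user, password):
    """Build an opener that answers HTTP Digest challenges for the endpoint."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, endpoint, user, password)
    handler = urllib.request.HTTPDigestAuthHandler(password_mgr)
    return urllib.request.build_opener(handler)


# Placeholder endpoint and statement, for illustration only.
ENDPOINT = "http://localhost:8890/sparql-auth"
STATEMENT = "INSERT DATA { GRAPH <http://cdm/data2> { <urn:s> <urn:p> <urn:o> } }"

request = build_update_request(ENDPOINT, STATEMENT)
opener = build_digest_opener(ENDPOINT, "dba", "dba")
# To actually send it against a running server:
# with opener.open(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```

Running the same statement through this route and through the rdf_sink route would show whether the difference is tied to the DAV path rather than to HTTP access in general.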

sebastianthelen commented 9 years ago

Hi,

we ran the following curl query:

curl -i -d 'PREFIX cdm: <http://publications.europa.eu/ontology/cdm#>
WITH <http://cdm/data2>
DELETE {
  ?z rdf:first ?head ; rdf:rest ?tail ; ?x ?y.
  ?w ?q ?id.
  ?id ?p ?value; ?p ?o.
  ?b owl:annotatedSource ?id; owl:annotatedProperty ?p; owl:annotatedTarget ?at1; owl:annotatedTarget ?at2; ?ann ?annotation.
}
INSERT {
  ?z rdf:first ?head ; rdf:rest ?tail ; ?x ?y.
  ?w ?q ?cellar_subject.
  ?cellar_subject ?p ?value; ?p ?cellar_object.
  ?b owl:annotatedSource ?cellar_subject; owl:annotatedProperty ?p; owl:annotatedTarget ?ct; owl:annotatedTarget ?at2; ?ann ?annotation.
}
WHERE {
  <http://publications.europa.eu/resource/cellar/1757f2ec-06ff-4453-871a-02e19eb1c59b.0002> owl:sameAs ?id.
  ?id ^owl:sameAs ?cellar_subject.
  { ?id ?p ?o. ?o ^owl:sameAs ?cellar_object. filter ( ?p!=owl:sameAs). }
  UNION
  { ?id ?p ?value. filter isLiteral(?value) }
  UNION
  { ?id ?p ?value. FILTER not exists {?value ^owl:sameAs ?cellar_id} }
  OPTIONAL { ?w ?q ?id. filter (?q!=owl:sameAs). filter (!isBlank(?w)) }
  OPTIONAL {
    ?b owl:annotatedSource ?id; owl:annotatedProperty ?p; ?ann ?annotation.
    { ?b owl:annotatedTarget ?at1. ?at1 ^owl:sameAs ?ct1. }
    UNION
    { ?b owl:annotatedTarget ?at2. filter (isLiteral(?at2) || not exists {?at2 ^owl:sameAs ?ct2.}) }
    filter not exists { ?annotation ^owl:sameAs ?anyID }
  }
  OPTIONAL { ?value (rdf:rest|rdf:first)* ?z. ?z rdf:first ?head ; rdf:rest ?tail . ?z ?x ?y. }
} ' \
  -u "dba:dba" \
  -H "Content-Type: application/sparql-query" \
  http://abel:8890/DAV/home/dba/rdf_sink/myreq

In Conductor, the query was executed using the SPARQL UI.

No, we haven't used the /sparql-auth SPARQL endpoint.

HughWilliams commented 9 years ago

Thanks for the curl query, which I have now been able to run. Running the query via both methods gives me the same results:

SQL> SPARQL PREFIX cdm: <http://publications.europa.eu/ontology/cdm#> select * from <http://cdm/data2> where {<http://publications.europa.eu/resource/celex/11997E249> (owl:sameAs|^owl:sameAs)* ?id. ?id ?p ?o.};
id  p  o
VARCHAR  VARCHAR  VARCHAR

http://publications.europa.eu/resource/celex/11997E249  http://www.w3.org/1999/02/22-rdf-syntax-ns#type  http://publications.europa.eu/ontology/cdm#fragment_resource_legal
http://publications.europa.eu/resource/celex/11997E249  http://www.w3.org/1999/02/22-rdf-syntax-ns#type  http://publications.europa.eu/ontology/cdm#resource_legal
http://publications.europa.eu/resource/celex/11997E249  http://www.w3.org/1999/02/22-rdf-syntax-ns#type  http://publications.europa.eu/ontology/cdm#treaty
http://publications.europa.eu/resource/celex/11997E249  http://www.w3.org/1999/02/22-rdf-syntax-ns#type  http://publications.europa.eu/ontology/cdm#work
http://publications.europa.eu/resource/cellar/6fe92ab0-c94c-48d5-9193-212f1e3a92dc  http://www.w3.org/2002/07/owl#sameAs  http://publications.europa.eu/resource/celex/11997E249
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_has_type_act_concept_type_act  http://publications.europa.eu/resource/authority/fd_030/TRAITE
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/annotation#build_info  cdm:CDM_2.1.7 tdm:1523 xslt:3945 saxon:9.0.0.1J JVM:1.6.0_29 metaconvJar:1.2.0 builddate:18/07/2014 16:59:54
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#date_creation_legacy  1999-05-31
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#fragment_resource_legal_id_fragment  249
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#fragment_resource_legal_part_of_resource_legal  http://publications.europa.eu/resource/celex/11997E/TXT
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_comment_internal  VIG1O; #MAN1; PAG10278; #BAS1O; #
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_date_end-of-validity  9999-12-31
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_date_entry-into-force  1958-01-01
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_date_signature  1957-03-25
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_id_celex  11997E249
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_id_sector  1
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_in-force  1
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_is_about_subject-matter  http://publications.europa.eu/resource/authority/fd_070/INST
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_type  E
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_uses_originally_language  http://publications.europa.eu/resource/authority/language/DAN
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_uses_originally_language  http://publications.europa.eu/resource/authority/language/DEU
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_uses_originally_language  http://publications.europa.eu/resource/authority/language/ELL
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_uses_originally_language  http://publications.europa.eu/resource/authority/language/ENG
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_uses_originally_language  http://publications.europa.eu/resource/authority/language/FIN
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_uses_originally_language  http://publications.europa.eu/resource/authority/language/FRA
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_uses_originally_language  http://publications.europa.eu/resource/authority/language/GLE
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_uses_originally_language  http://publications.europa.eu/resource/authority/language/ITA
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_uses_originally_language  http://publications.europa.eu/resource/authority/language/NLD
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_uses_originally_language  http://publications.europa.eu/resource/authority/language/POR
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_uses_originally_language  http://publications.europa.eu/resource/authority/language/SPA
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_uses_originally_language  http://publications.europa.eu/resource/authority/language/SWE
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#resource_legal_year  1997-01-01
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#work_created_by_agent  http://publications.europa.eu/resource/authority/corporate-body/EUMS6
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#work_date_creation  2014-07-18
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#work_date_creation_legacy  1999-05-31
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#work_date_document  1957-03-25
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#work_id_document  celex:11997E249
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#work_id_obsolete_notice  218997
http://publications.europa.eu/resource/celex/11997E249  http://publications.europa.eu/ontology/cdm#work_part_of_collection_document  http://publications.europa.eu/resource/authority/document-collection/CELEX

39 Rows. -- 4 msec.
SQL>

Note, though, that I am running a much newer develop/7 binary than your build from Jul 2014:

SQL> status('');
REPORT
VARCHAR

OpenLink Virtuoso Server
Version 07.10.3211-pthreads for Linux as of Dec 11 2014
Started on: 2014-12-14 21:52 GMT+1

I would therefore suggest that you build from the latest develop/7 branch and try again.

indeyets commented 9 years ago

Sorry for going off-topic, but fenced code blocks would make the comments in this issue so much more readable. :-/

sebastianthelen commented 9 years ago

Sorry for not getting back to you earlier.

First of all, thanks a lot for your reply! We will check out the new release and see what happens.

Sebastian

jibe-b commented 9 years ago

Hi,

I have a similar problem, but with a simpler query. The following query does not return the same result when passed to the Virtuoso database through the web interface as it does when sent as an HTTP request to the DAV service:

insert { graph ?g2 { ?s a ?type }}
where {
graph ?g {ex:something a ?type ;
?s ex:sameAs ex:Something .}
BIND(URI(CONCAT(?g, '/deduced')) AS ?g2)
}

Through the web interface, triples are inserted into the corresponding graphs, as expected.

But when the query is sent as an HTTP request (using curl, to http://localhost:8890/DAV/home/dba/rdf_sink), there is no result, unless

FROM NAMED <http://graph>
FROM NAMED <http://graph/deduced>

is added for each graph that we consider.

How can I make it possible for a SPARQL query sent by HTTP request to the DAV service to have access to the index of graphs?

Note: both SPARQL and dba (used in the HTTP request) have all rights.
Note 2: I am using version 7.2.0_p1.3212-pthreads as of May 13 2015.

Best regards,

jibe

jibe-b commented 9 years ago

Another (simpler) query that returns the expected results via the web interface and none via an HTTP request:

SELECT ?something
WHERE { GRAPH <http://graph> {
?s <http://predicate> ?o .
?o <http://predicate2> ?something }
}
limit 10

jibe-b commented 9 years ago

This form fails, too:

SELECT ?something
FROM <http://graph>
WHERE {
?s <http://predicate> ?o .
?o <http://predicate2> ?something
}
limit 10

kidehen commented 9 years ago

On 5/29/15 9:36 AM, jibe-b wrote:

> Hi,
>
> I get a similar problem, but with a simpler query: When passing a SPARQL query to the virtuoso database, the following query does not return the same result when passed through the web interface and by an http request on the DAV service:
>
> insert { graph ?g2 { ?s a ?type }}
> where {
> graph ?g {ex:something a ?type ;
> ?s ex:sameAs ex:Something .}
> BIND(URI(CONCAT(?g, '/deduced')) AS ?g2)
> }
>
> Through the web interface, triples are inserted into the corresponding graphs, like expected.

The "virt:rdf_graph" property of the rdf_sink folder determines the named graph IRI used for data that enters the quad store via this route. There can only be one assigned named graph IRI serving as the value of this property.

> But when passed as a http request (using curl, on http://localhost:8890/DAV/home/dba/rdf_sink), there is no result, unless
>
> FROM NAMED <http://graph>
> FROM NAMED <http://graph/deduced>

Your query above leads to triples being placed in the named graphs listed.

> is added, for each graph that we consider.
>
> How can I make it possible for a SPARQL query passed by http request on the DAV service to have access to the index of graphs?

Have you tried using SELECT {select-variable-list} WHERE {GRAPH ?g {query-body} }? That will make the default graph for the solution a union of all named graphs in the quad store.

Example:

select distinct * where { graph ?g {?s ?p ?o} } limit 10

You should get the same solution irrespective of the SPARQL query execution mechanism.

> Note: both SPARQL and dba (used in the http request) have all rights.
>
> Best regards,
>
> jibe

Regards,

Kingsley Idehen
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this

kidehen commented 9 years ago

On 5/29/15 1:54 PM, jibe-b wrote:

> This form fails, too:
>
> SELECT ?something
> FROM <http://graph>
> WHERE {
> ?s <http://predicate> ?o .
> ?o <http://predicate2> ?something
> }
> limit 10

SELECT ?something
WHERE { GRAPH ?g {
?s <http://predicate> ?o .
?o <http://predicate2> ?something }
}
limit 10

Regards,

Kingsley Idehen

jibe-b commented 9 years ago

Thanks for the answers, but there are still queries that do not behave the same way via the web interface and via HTTP requests.

For example:

select distinct * where { graph ?g {?s ?p ?o} } limit 10

works fine, but:

select * where {graph <http://a> {?s ?p ?o}} limit 10

returns nothing via HTTP requests, but returns what I expect via the web interface.

I need to indicate the graph that I query:

select * from named <http://a> where {graph <http://a> {?s ?p ?o}} limit 10

With this form the result is: web interface: OK; HTTP requests: OK.

One solution, indeed, is to use FROM or FROM NAMED. But when I want to query several graphs and choose them on the basis of the triples they contain, the behaviour of the web interface is what I need, because it accepts queries like:

select * where { graph ?g {?s ?p ?o} graph ?g2 {?S ?P ?O} }limit 10

without my having to indicate any FROM or FROM NAMED graphs.
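
When the graphs are known in advance, the SPARQL 1.1 Protocol also allows the dataset to be named in the request itself through the `default-graph-uri` and `named-graph-uri` parameters, which avoids editing each query text. The sketch below only builds such a request URL. It assumes a standard query endpoint at `http://localhost:8890/sparql` (a placeholder), the helper name `build_query_url` is hypothetical, and whether the DAV rdf_sink route honours these parameters has not been established in this thread. It does not cover the case above where the graphs are chosen by their content.

```python
import urllib.parse


def build_query_url(endpoint, query, default_graphs=(), named_graphs=()):
    """Build a GET URL that names the dataset via SPARQL Protocol parameters."""
    params = [("query", query)]
    # Each graph IRI becomes one repeated parameter, as the protocol allows.
    params += [("default-graph-uri", g) for g in default_graphs]
    params += [("named-graph-uri", g) for g in named_graphs]
    return endpoint + "?" + urllib.parse.urlencode(params)


# Placeholder endpoint, for illustration only.
QUERY = "select * where {graph <http://a> {?s ?p ?o}} limit 10"
url = build_query_url(
    "http://localhost:8890/sparql",
    QUERY,
    named_graphs=["http://a"],
)
```

The resulting URL carries the same dataset information as the `from named <http://a>` clause, but outside the query text.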