3-Round-Stones / callimachus

Callimachus is a highly scalable platform for creating and running data-driven websites

Query to RDF File (Remote) #225

Closed normansuesstrunk closed 9 years ago

normansuesstrunk commented 9 years ago

I tried the following named query in callimachus 1.5.0 Beta:

PREFIX geovocab-geom2:  <http://geovocab.org/geometry#>
PREFIX lgdo:    <http://linkedgeodata.org/ontology/>
PREFIX lgdm:    <http://linkedgeodata.org/meta/>
PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
PREFIX lgd-addr:  <http://linkedgeodata.org/ontology/addr%3A>
PREFIX spy:     <http://aksw.org/sparqlify/>
PREFIX lu:      <http://id.sirf.net/def/lu#>
PREFIX dcterms:  <http://purl.org/dc/terms/>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX lgd:     <http://linkedgeodata.org/triplify/>
PREFIX ogc:     <http://www.opengis.net/ont/geosparql#>
PREFIX wgs:     <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX lgd-adress:  <http://linkedgeodata.org/ontology/addr/>
PREFIX geovocab-geom:  <http://geovocab.org/geometry>
PREFIX lgd-geom:  <http://linkedgeodata.org/geometry/>
PREFIX xsd:     <http://www.w3.org/2001/XMLSchema#>
PREFIX owl:     <http://www.w3.org/2002/07/owl#>
PREFIX geovocab-spatial:  <http://geovocab.org/spatial#>
PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos:    <http://www.w3.org/2004/02/skos/core#>
PREFIX lgd-contact:  <http://linkedgeodata.org/ontology/contact%3A>
PREFIX libStat: <http://linkeddata.fh-htwchur.ch/ontology/libStat/>

SELECT ?libStat ?osmLibNodeId ?numberOfBooks
FROM    <http://ckannet-storage.commondatastorage.googleapis.com/2015-08-14T12:07:12.200Z/library-statistics-data.ttl>
WHERE {
    ?libStat a libStat:LibStatistic
    ; libStat:osmLib ?osmLibNodeId
    ; libStat:numberOfBooks ?numberOfBooks .
}

The result is empty, but the query should return 2 rows.

I get no error message, so I'm not sure if this is supported at all. If this should work, what am I missing? I found nothing in the log files.

many thanks, Norman

catch-point commented 9 years ago

The FROM clause must refer to RDF files in the local store. Callimachus will not automatically download and index remote RDF files. Does that answer your question?

James

prototypo commented 9 years ago

I think Norman’s question suggests that he wants to change his query to use a SPARQL federated query with a SERVICE clause.

Norman, please see: http://www.w3.org/TR/2013/REC-sparql11-federated-query-20130321/#simpleService
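As a rough sketch of what such a federated query could look like for the data in question: the endpoint URL `http://example.org/sparql` below is a placeholder assumed for illustration, not a real service mentioned in this thread, and the approach only applies if the data is actually published behind a SPARQL endpoint.

```sparql
PREFIX libStat: <http://linkeddata.fh-htwchur.ch/ontology/libStat/>

SELECT ?libStat ?osmLibNodeId ?numberOfBooks
WHERE {
  # Placeholder endpoint (assumption): replace with a real SPARQL 1.1 endpoint.
  SERVICE <http://example.org/sparql> {
    ?libStat a libStat:LibStatistic ;
             libStat:osmLib ?osmLibNodeId ;
             libStat:numberOfBooks ?numberOfBooks .
  }
}
```

The pattern inside SERVICE is evaluated by the remote endpoint, so nothing has to be stored locally.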

Regards,

Dave

http://about.me/david_wood

normansuesstrunk commented 9 years ago

Many thanks for the response.

I wondered how to integrate "plain" remote RDF data into Callimachus. The requested RDF data is just stored in an RDF file on a web server, and it is not accessible through a SPARQL endpoint. So a federated query is not a solution in this particular case.

APIs like Jena or Rasqal (http://librdf.org/rasqal/) are able to handle the query above. I think they download the file into RAM, query the data with SPARQL, and then return the result.

I think this functionality is not specified by SPARQL itself; it's a specific feature that Jena and Rasqal implemented. As far as I can see, it is not supported by Callimachus.

I'm working at the University of Applied Science Chur (http://www.htwchur.ch/), and I briefly discussed with a colleague whether it would be an idea to implement that feature as a bachelor's thesis. If you're interested, please contact me at norman.suesstrunk@htwchur.ch

I cloned the code from Git and, having quite a bit of experience in Java, I had some thoughts about moving the project to Maven to manage the dependencies. Some links to the JARs are dead, so the Ant target won't run. If you're open to collaborating, I'll open a new issue.

Many thanks, Norman

prototypo commented 9 years ago

Hi Norman,

You should use SPARQL's GRAPH keyword (instead of SERVICE) if you are querying RDF in a flat file: http://www.w3.org/TR/2013/REC-sparql11-query-20130321/#queryDataset

Use the SERVICE clause for remote SPARQL endpoints, and GRAPH when you want to query RDF files at a known URL.

Here is an example:

PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT ?s {
    GRAPH <https://example.com/data/FinancialDates.rdf> {
        ?s a owl:Class .
        FILTER( REGEX( STR(?s), "^http://www.omg.org", "i") )
    }
} LIMIT 1000

OK?

Regards,

Dave

http://about.me/david_wood

catch-point commented 9 years ago

Don't forget to use the LOAD command first to load the remote data file into the local store.
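A minimal sketch of that step in SPARQL 1.1 Update, reusing the example URL from Dave's query. Naming the graph after the file URL is an assumption made here so that the GRAPH clause in the query matches; any graph IRI would do.

```sparql
# Fetch the remote RDF file and store its triples in a named graph.
# Using the file URL as the graph name is a convention assumed for this sketch.
LOAD <https://example.com/data/FinancialDates.rdf>
INTO GRAPH <https://example.com/data/FinancialDates.rdf>
```

This is an update operation, so it has to be sent as a SPARQL Update request rather than as a query.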

James

normansuesstrunk commented 9 years ago

Many thanks for the response.

Using the SPARQL GRAPH feature

My understanding of the GRAPH feature in SPARQL is that it is meant for querying multiple graphs in the same query, so it is not a feature for including remote RDF files:

The use of GRAPH changes the active graph for matching graph patterns within that part of the query. Outside the use of GRAPH, matching is done using the default graph. 

Here is my query with the GRAPH Keyword:

PREFIX geovocab-geom2:  <http://geovocab.org/geometry#>
PREFIX lgdo:    <http://linkedgeodata.org/ontology/>
PREFIX lgdm:    <http://linkedgeodata.org/meta/>
PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
PREFIX lgd-addr:  <http://linkedgeodata.org/ontology/addr%3A>
PREFIX spy:     <http://aksw.org/sparqlify/>
PREFIX lu:      <http://id.sirf.net/def/lu#>
PREFIX dcterms:  <http://purl.org/dc/terms/>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
PREFIX lgd:     <http://linkedgeodata.org/triplify/>
PREFIX ogc:     <http://www.opengis.net/ont/geosparql#>
PREFIX wgs:     <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX lgd-adress:  <http://linkedgeodata.org/ontology/addr/>
PREFIX geovocab-geom:  <http://geovocab.org/geometry>
PREFIX lgd-geom:  <http://linkedgeodata.org/geometry/>
PREFIX xsd:     <http://www.w3.org/2001/XMLSchema#>
PREFIX owl:     <http://www.w3.org/2002/07/owl#>
PREFIX geovocab-spatial:  <http://geovocab.org/spatial#>
PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX skos:    <http://www.w3.org/2004/02/skos/core#>
PREFIX lgd-contact:  <http://linkedgeodata.org/ontology/contact%3A>
PREFIX libStat: <http://linkeddata.fh-htwchur.ch/ontology/libStat/>

SELECT ?libStat ?osmLibNodeId ?numberOfBooks {
    GRAPH   <http://ckannet-storage.commondatastorage.googleapis.com/2015-08-17T14:29:58.955Z/library-statistics-data.rdf> {
        ?libStat a libStat:LibStatistic
        ; libStat:osmLib ?osmLibNodeId
        ; libStat:numberOfBooks ?numberOfBooks .
    }
}

The URI after the GRAPH keyword defines the name of the graph. The graph itself must be loaded first, as mentioned by James. But again, this is not what I want here. I want to query remote data from a file; I do not want to load it into my local Callimachus RDF store. If I did that, I wouldn't need the GRAPH feature, as I would just query one graph, the default graph stored in the local Callimachus RDF store.
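If the concern is only that the remote data should not stay in the local store, one possible workaround (an assumption on my side, not something confirmed in this thread) would be to drop the named graph again once the query has run. A sketch in SPARQL 1.1 Update, assuming the graph was named after the file URL when it was loaded:

```sparql
# Remove the previously loaded named graph from the store.
# SILENT suppresses the error if the graph does not exist.
DROP SILENT GRAPH <http://ckannet-storage.commondatastorage.googleapis.com/2015-08-17T14:29:58.955Z/library-statistics-data.rdf>
```

This still stores the data temporarily, so it is not the same as querying the remote file directly.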

I'm still not sure why referencing the file in the FROM clause does not work. Running the following query on the SPARQL endpoint at http://sparql.org/sparql.html, I get results:

PREFIX vCard: <http://www.w3.org/2001/vcard-rdf/3.0#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT  ?name ?url ?title
FROM <http://dig.csail.mit.edu/2008/webdav/timbl/foaf.rdf>
WHERE { 
  OPTIONAL { ?X foaf:name ?name . FILTER isLiteral(?name) . }
  OPTIONAL { ?X foaf:homepage ?url . FILTER isURI(?url) . }
  OPTIONAL { ?X foaf:title ?title . FILTER isLiteral(?title) . }
}

This is exactly what I want: querying a remote RDF file. If I implement this query in Callimachus as a named query, I get no results.

I still think that this should work in Callimachus, as this is specified in the SPARQL language.

EDIT: No, I was wrong. There is a good explanation on Stack Overflow:

http://stackoverflow.com/questions/30433069/sesame-workbench-querying-an-online-data-set

So, essentially, this is not specified in the SPARQL language. If it does work on a SPARQL endpoint, that is specific to the underlying implementation. In Sesame (which Callimachus uses "under the hood"), querying remote RDF files this way is not supported.

catch-point commented 9 years ago

Downloading remote files to query is not required functionality for a SPARQL endpoint and is not enabled in Callimachus. However, you can set up multiple RDF data sources, and some of them may have this feature enabled. Callimachus Enterprise includes an interface to set up any Sesame repository using a config file, which could be used for this. Callimachus can also use an external SPARQL 1.1 endpoint that could have this feature enabled.

Regards, James

normansuesstrunk commented 9 years ago

Hi James,

I completely agree with all of this - many thanks.

I'll close the issue.