SPARQL-Anything / sparql.anything

SPARQL Anything is a system for Semantic Web re-engineering that allows users to ... query anything with SPARQL.
https://sparql-anything.cc/
Apache License 2.0

Relational database #6

Open enridaga opened 3 years ago

enridaga commented 3 years ago

Support relational databases, for example, by developing a connector using JDBC.

enridaga commented 3 years ago

Developing R2RML mappings to Facade-X should, in principle, allow building a component that does query rewriting rather than transforming the whole table prior to query execution (as we currently do with CSVs ...)
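To make the contrast concrete, the "transform the whole table first" approach amounts to something like the Java sketch below. It is only an illustration under assumptions: the class and method names are hypothetical, the namespaces are the Facade-X ones as commonly documented, column names are assumed to be IRI-safe, and every value is emitted as a plain string literal.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: materializes an in-memory table as Facade-X-style N-Triples.
final class FacadeXRows {
    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String FX = "http://sparql.xyz/facade-x/ns/";    // assumed Facade-X namespace
    static final String XYZ = "http://sparql.xyz/facade-x/data/"; // assumed data namespace

    static List<String> toNTriples(List<String> columns, List<List<String>> rows) {
        List<String> out = new ArrayList<>();
        // One root container for the whole table.
        out.add("_:root <" + RDF + "type> <" + FX + "root> .");
        for (int i = 0; i < rows.size(); i++) {
            String row = "_:row" + (i + 1);
            // Rows hang off the root via container membership properties rdf:_1, rdf:_2, ...
            out.add("_:root <" + RDF + "_" + (i + 1) + "> " + row + " .");
            List<String> cells = rows.get(i);
            for (int j = 0; j < columns.size(); j++) {
                String value = cells.get(j);
                if (value == null) {
                    continue; // SQL NULL: emit no triple
                }
                out.add(row + " <" + XYZ + columns.get(j) + "> \"" + escape(value) + "\" .");
            }
        }
        return out;
    }

    // Minimal N-Triples string escaping (backslash, quote, line breaks).
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"")
                .replace("\n", "\\n").replace("\r", "\\r");
    }
}
```

Every row is turned into triples before the query runs, which is exactly the cost that query rewriting would avoid on large tables.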

enridaga commented 3 years ago

However, D2RQ does not seem to be active (the GitHub project is archived), and neither do the other R2RML Java projects mentioned here. I wonder where to find a robust and maintained Java implementation ...

akuckartz commented 3 years ago

Maybe you can join forces with https://ontop-vkg.org/ ?

akuckartz commented 3 years ago

Ontop is a Virtual Knowledge Graph system. It exposes the content of arbitrary relational databases as knowledge graphs. These graphs are virtual, which means that data remains in the data sources instead of being moved to another database.

Ontop translates SPARQL queries expressed over the knowledge graphs into SQL queries executed by the relational data sources. It relies on R2RML mappings and can take advantage of lightweight ontologies. https://ontop-vkg.org/guide/

enridaga commented 3 years ago

Definitely something to try! The open question is whether mappings in R2RML can be defined at the meta level (for example, expressing things such as "for each table/column") without needing to actually encode the schema elements in the mappings. If this is possible, we could design mappings to Facade-X once and for all and give users access to any RDB on the fly.

Aklakan commented 3 years ago

Hi, just some input which might be of interest here:

In SANSA we created an integration of Ontop and Sparqlify with Apache Spark. Disclaimer: I am the developer of the SPARQL-to-SQL rewriter Sparqlify.

For this purpose we created this Jena-based R2RML layer, which is just the R2RML tooling without the query rewriting (though it includes a simple ARQ-based materializing R2RML processor which passes all R2RML test cases). For the Ontop integration, the Jena model gets wrapped with commons-rdf, from where Ontop picks it up.

In any case, for the interlinking tool LIMES I once made a proposal (and prototype) which might be relevant here as well:

One could exploit nested service clauses to syntactically provide RDF-based mapping information:

SERVICE <x-sparql-anything:r2rml:ontop:jdbc:connection-string> {
  SERVICE <mapping:inline> { # R2RML content goes here
    [ a                      rr:TriplesMap ;
      rr:predicateObjectMap  [  ... ] ]
  }
  SERVICE <query> { # query goes here
    { SELECT (COUNT(*) AS ?count) { ?s ?p ?o } }
  }
}

Of course, the mapping could also be provided externally, using <mapping:http://somesource>.

W3C also defines a default mapping for relational databases, the Direct Mapping (a companion recommendation to R2RML), which I suppose is pretty much the recipe for creating default R2RML mappings. Hence, if no explicit mapping is provided, this is the one that can be generated by default. Internally, the extended SPARQL processor could cache the generated mapping under the connection string and use it whenever no other mapping is requested.
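A default mapping of this kind could be generated roughly as in the Java sketch below. It is not taken from any of the projects mentioned: the class name, method name and base IRI are hypothetical, and it makes simplifying assumptions (a single-column primary key, no foreign keys, identifiers that need neither quoting nor percent-encoding). The `rr:` terms themselves are standard R2RML vocabulary.

```java
import java.util.List;

// Hypothetical generator for a Direct-Mapping-style R2RML TriplesMap (Turtle syntax).
final class DirectMappingSketch {
    static String triplesMap(String base, String table, String pk, List<String> columns) {
        StringBuilder sb = new StringBuilder();
        sb.append("@prefix rr: <http://www.w3.org/ns/r2rml#> .\n\n");
        sb.append("<#").append(table).append("Map> a rr:TriplesMap ;\n");
        sb.append("  rr:logicalTable [ rr:tableName \"").append(table).append("\" ] ;\n");
        // Subject IRI pattern: <base><table>/<pk>=<value>, typed with <base><table>.
        sb.append("  rr:subjectMap [ rr:template \"").append(base).append(table)
          .append("/").append(pk).append("={").append(pk).append("}\" ;\n");
        sb.append("                  rr:class <").append(base).append(table).append("> ]");
        // One predicate-object map per column: <base><table>#<column>.
        for (String c : columns) {
            sb.append(" ;\n  rr:predicateObjectMap [ rr:predicate <")
              .append(base).append(table).append("#").append(c).append("> ;\n");
            sb.append("                          rr:objectMap [ rr:column \"")
              .append(c).append("\" ] ]");
        }
        sb.append(" .\n");
        return sb.toString();
    }
}
```

For example, `DirectMappingSketch.triplesMap("http://example.org/", "PEOPLE", "ID", List.of("ID", "NAME"))` yields one TriplesMap for the (made-up) table PEOPLE. The result could be cached per connection string, as suggested above.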

justin2004 commented 3 years ago

> The open question is whether mappings in R2RML can be defined at the meta level (for example, expressing things such as "for each table/column") without needing to actually encode the schema elements in the mappings.

i think R2RML needs rr:tableName but usually there is a table of tables (and a table of columns), right?

e.g. in postgres:

select * from pg_catalog.pg_tables;

if there is a common JDBC way to get at the table of tables then we could macro expand to produce the R2RML as needed. if there is not a common JDBC way to get at the table of tables then we could just make a big switch statement with a case for each flavor of RDB.
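For what it's worth, JDBC does appear to offer a vendor-neutral route: `java.sql.DatabaseMetaData` exposes `getTables` and `getColumns`. Below is a minimal Java sketch of reading the "table of tables" that way. The class and method names are hypothetical, and how completely a given driver fills in this metadata is something that would need checking per database.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: lists tables and their columns through standard JDBC metadata.
final class SchemaReader {
    /** Returns table name -> column names, in the order the driver reports them. */
    static Map<String, List<String>> readSchema(Connection conn) throws SQLException {
        DatabaseMetaData meta = conn.getMetaData();
        Map<String, List<String>> schema = new LinkedHashMap<>();
        // null catalog/schema means "do not filter"; "%" matches every table name.
        try (ResultSet tables = meta.getTables(null, null, "%", new String[] {"TABLE"})) {
            while (tables.next()) {
                schema.put(tables.getString("TABLE_NAME"), new ArrayList<>());
            }
        }
        for (Map.Entry<String, List<String>> e : schema.entrySet()) {
            // Simplification: the table name is passed as a LIKE-style pattern, so names
            // containing "_" or "%" may match more than one table; same-named tables in
            // different schemas are not distinguished either.
            try (ResultSet cols = meta.getColumns(null, null, e.getKey(), "%")) {
                while (cols.next()) {
                    e.getValue().add(cols.getString("COLUMN_NAME"));
                }
            }
        }
        return schema;
    }
}
```

The resulting map could then drive the macro expansion into R2RML, with a per-database switch kept only as a fallback for drivers whose metadata turns out to be unreliable.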

enridaga commented 1 year ago

Development started on branch jdbc https://github.com/SPARQL-Anything/sparql.anything/tree/jdbc