add `rml:baseIRI` property to ontology, shapes and spec

andimou commented 2 years ago

Is the description/definition of the base IRI sufficient as it comes from the R2RML spec, or do we need to further clarify certain aspects of it?

dachafra commented 2 years ago

IMO we should explicitly say in the spec that the base IRI for the RDF triples is not the same as the one for the mapping document. My proposal would redefine the base IRI as follows:

- The base IRI for the generated RDF triples should be provided in the config file or CLI of the engine.
- The base IRI of the mapping document refers to the actual rules.

Would this force users to always provide a base IRI when executing an engine?

dachafra commented 2 years ago

We have a proposal here for the RML-core.

bblfish commented 2 years ago

I agree this needs to be documented in the spec, as the existence of two base URLs (one for the mapping document, one for the output) is a necessary part of the setup. I think there is a lightweight solution that should also be explained.

If the base URL of the mapping document is not given (via @base or a command line argument), then it is the URL of the document itself (just file:// if found on the file system, or taken from the location of the document on the web). This is the usual relative URL interpretation, but programmers may not be aware of it.

If the base URL of the output is not given by a command line argument (or specified in some way by an ontology extension), then the mapper should produce output (which should just be a relative URL if the -b attribute is not specified).

bblfish commented 2 years ago

In rmlmapper-java I noticed that the following gives blank nodes

<#allDataMap> a rr:TriplesMap;
    rr:subjectMap [
        rdfs:seeAlso <https://github.com/kg-construct/rml-questions/discussions/25>;
        rr:termType rr:BlankNode;
        rr:template "node{ }";
        rr:class sosa:Observation;
    ] .

But the following gives URLs (when called with -b <url> command line argument, otherwise it gives a blank node)

<#allDataMap> a rr:TriplesMap;
    rr:subjectMap [
        rdfs:seeAlso <https://github.com/kg-construct/rml-questions/discussions/25>;
        rr:template "node{ }";
        rr:class sosa:Observation;
    ] .

Note though that the second is a subgraph of the first. An RDF Graph should imply all its subgraphs, yet here we have the removal of rr:BlankNode changing the output from one that produces blank nodes to one that produces URLs. In a way it should be the other way around as URLs require more information than blank nodes.

But instead I would suggest using two different exclusive relations rr:template and rr:blankNodeTemplate perhaps.

pmaria commented 2 years ago

@bblfish

produce output (which should just be a relative URL if the -b attribute is not specified)

As far as I know, there is no such thing as a relative URI without a base. As per R2RML:

An R2RML processor also has access to an execution environment consisting of:

A SQL connection to the input database,

a base IRI used in resolving relative IRIs produced by the R2RML mapping.

A processor must have access to a base IRI. Usually if not provided, a processor implementation will have a default.
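
To illustrate why a base is always needed, here is a minimal sketch of relative IRI resolution using Python's standard `urllib.parse.urljoin`. The `resolve` helper and the `DEFAULT_BASE` value are hypothetical, standing in for whatever default a given processor implementation chooses; they are not taken from any specific engine.

```python
from urllib.parse import urljoin

# Hypothetical fallback; real processors choose their own default base IRI.
DEFAULT_BASE = "http://example.com/base/"

def resolve(generated, base=None):
    """Resolve a generated (possibly relative) IRI against a base IRI."""
    return urljoin(base or DEFAULT_BASE, generated)

# No base supplied: the (assumed) default is used.
print(resolve("node1"))
# Base supplied explicitly, e.g. via a -b style option.
print(resolve("node1", "http://data.example.org/"))
# An absolute IRI is left unchanged regardless of the base.
print(resolve("http://other.org/x", "http://data.example.org/"))
```

Resolution takes two inputs, so a relative IRI with no base at all has nothing to resolve against.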

pmaria commented 2 years ago

@bblfish

In rmlmapper-java I noticed that the following gives blank nodes
<#allDataMap> a rr:TriplesMap;
    rr:subjectMap [
        rdfs:seeAlso <https://github.com/kg-construct/rml-questions/discussions/25>;
        rr:termType rr:BlankNode;
        rr:template "node{ }";
        rr:class sosa:Observation;
    ] .
But the following gives URLs (when called with -b <url> command line argument, otherwise it gives a blank node)
<#allDataMap> a rr:TriplesMap;
    rr:subjectMap [
        rdfs:seeAlso <https://github.com/kg-construct/rml-questions/discussions/25>;
        rr:template "node{ }";
        rr:class sosa:Observation;
    ] .
Note though that the second is a subgraph of the first. An RDF Graph should imply all its subgraphs, yet here we have the removal of rr:BlankNode changing the output from one that produces blank nodes to one that produces URLs. In a way it should be the other way around as URLs require more information than blank nodes.

R2RML defines default values for rr:termType depending on the type of term map. So the term type is actually implied when not specified.

But instead I would suggest using two different exclusive relations rr:template and rr:blankNodeTemplate perhaps.

I would prefer not to go this route, as this would introduce more language as well as break backwards compatibility with R2RML and RML in its current form.
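
The R2RML defaulting rule for rr:termType mentioned above can be sketched roughly as follows. This is a simplified reading of the rule (an object map that is column-based or has a language or datatype defaults to a literal; everything else defaults to an IRI). The function name and its parameters are hypothetical, and the sketch says nothing about how rmlmapper-java behaves when -b is omitted.

```python
def default_term_type(kind, column_based=False, has_language=False, has_datatype=False):
    """Implied term type when rr:termType is absent (simplified R2RML rule).

    kind: "subject", "predicate", "object" or "graph" (labels assumed here).
    """
    if kind == "object" and (column_based or has_language or has_datatype):
        return "rr:Literal"
    return "rr:IRI"

# A template-valued subject map without rr:termType is implied to produce IRIs.
print(default_term_type("subject"))
# A column-valued object map without rr:termType is implied to produce literals.
print(default_term_type("object", column_based=True))
```

Under this reading, removing `rr:termType rr:BlankNode` from the subject map does not leave the term type unspecified; it switches it to the implied `rr:IRI`.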

bblfish commented 2 years ago

R2RML defines default values for rr:termType depending on the type of term map. So the term type is actually implied when not specified.

I am just wondering if default meanings of relations make for valid RDF. The RDF Semantics spec states that a graph implies its subgraphs. So one needs to clarify what is going on here.

pmaria commented 2 years ago

Agreed, this has to be clarified.

andimou commented 9 months ago

A proposal for this is as follows:

<#TriplesMap>
    rml:baseIri "http://example.com/" .

The base IRI of the Triples Map is used in resolving relative IRIs produced by the RML mapping.

Such a base IRI would then allow us to write something along these lines:

<#TriplesMap>
   rml:baseIri "http://example.com/" ;
   rml:subjectMap [
      rml:template "{id}";
      rml:class :Person ].

For a CSV like the following:

id | name
1  | Bob

That would create the following triple:

<http://example.com/1> a :Person.

If we want to go a step further, we could allow such a base IRI to be defined not only on Triples Map level but also on Term Map level.

That should not be confused with the base IRI of the document.

@base <http://mygraph.org/> .

<TriplesMap>
   rml:baseIri "http://example.com/" ;
   rml:subjectMap [
      rml:template "{id}";
      rml:class :Person ].

The document's base is used to resolve the relative IRIs of the mapping document itself, so in this case the Triples Map is identified as <http://mygraph.org/TriplesMap>.
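
The distinction between the two bases can be sketched with Python's standard `urllib.parse.urljoin`. The `expand` helper is a hypothetical, deliberately naive stand-in for template expansion (plain `{column}` substitution, no escaping), and the variable names are illustrative only.

```python
from urllib.parse import urljoin

document_base = "http://mygraph.org/"  # from @base in the mapping document
output_base = "http://example.com/"    # from the proposed rml:baseIri

def expand(template, row):
    """Naive "{column}" substitution; not a full template implementation."""
    for column, value in row.items():
        template = template.replace("{" + column + "}", str(value))
    return template

row = {"id": 1, "name": "Bob"}

# The mapping's own relative IRI <TriplesMap> resolves against the document base.
triples_map_iri = urljoin(document_base, "TriplesMap")
# The generated subject resolves against the base IRI declared in the mapping.
subject_iri = urljoin(output_base, expand("{id}", row))

print(triples_map_iri)
print(subject_iri)
```

The two results live in different namespaces: one identifies the mapping rule, the other identifies the generated resource.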

chrdebru commented 7 months ago

Question. The proposal above seems fine, but does that mean we eliminate the base IRI given as input (i.e., specified in the config file)?

If yes, then we need to specify this (but we would need to declare that for at least every triples map). If no, then we need to specify the behavior of an input base IRI and the ones declared at triples map and term map levels. We might end up with something as complicated as the graph map declarations on subject map and predicate-object map level.

I would prefer to see the second as it would render the mapping less verbose, but then specify that each rml:baseIRI assertion overrides the previous one.
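
One possible reading of "each rml:baseIRI assertion overrides the previous one" is that the most specific declaration wins. The sketch below assumes that order (term map over triples map over input base IRI); the function and parameter names are hypothetical and nothing here is fixed by the spec.

```python
def effective_base(input_base=None, triples_map_base=None, term_map_base=None):
    """Return the most specific declared base IRI, or None if none is declared."""
    for candidate in (term_map_base, triples_map_base, input_base):
        if candidate is not None:
            return candidate
    return None

# Only the input (config/CLI) base is given.
print(effective_base(input_base="http://cli.example/"))
# A triples map level declaration overrides the input base.
print(effective_base(input_base="http://cli.example/",
                     triples_map_base="http://example.com/"))
# A term map level declaration overrides both.
print(effective_base(input_base="http://cli.example/",
                     triples_map_base="http://example.com/",
                     term_map_base="http://term.example/"))
```

Keeping the input base as the last fallback is what makes the mapping less verbose: a declaration is only needed where the base differs.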

dachafra commented 6 months ago

Question. The proposal above seems fine, but does that mean we eliminate the base IRI given as input (i.e., specified in the config file)?

I would say yes, better to have it in the mapping than in the config file

If yes, then we need to specify this (but we would need to declare that for at least every triples map).

I do not see any problem with this.

dachafra commented 1 month ago

A proposal for this is as follows:
<#TriplesMap>
    rml:baseIRI <http://example.com/> .
The base IRI of the Triples Map is used in resolving relative IRIs produced by the RML mapping.

Such a base IRI would then allow us to write something along these lines:
<#TriplesMap>
   rml:baseIRI <http://example.com/> ;
   rml:subjectMap [
      rml:template "{id}";
      rml:class :Person ]. 
For a CSV like the following:

id | name
1  | Bob

That would create the following triple:

<http://example.com/1> a :Person.

If we want to go a step further, we could allow such a base IRI to be defined not only on Triples Map level but also on Term Map level.

That should not be confused with the base IRI of the document.
@base <http://mygraph.org/> .

<TriplesMap>
   rml:baseIRI <http://example.com/> ;
   rml:subjectMap [
      rml:template "{id}";
      rml:class :Person ]. 
The document's base is used to resolve the relative IRIs of the mapping document itself, so in this case the Triples Map is identified as <http://mygraph.org/TriplesMap>.

This is the proposed solution for this issue. Currently, the spec still specifies the necessity of CLI access to the base IRI (see)

Action points:

[x] Update ontology to add rml:baseIRI property @anaigmo
[x] Update SHACL shapes to add rml:baseIRI constraints @DylanVanAssche
[ ] Update spec to include the description of the property. @pmaria @andimou It seems the first version of the baseIRI description for the spec appears here: https://github.com/kg-construct/rml-core/pull/128/commits/0bdf8b946bedd202663b091ebcf5caf7631f3aeb. Could you take a look and see if any modification is needed?

kg-construct / rml-core

add `rml:baseIRI` property to ontology, shapes and spec #30