RMLio / rml-implementation-report

Implementation report for RML tools
MIT License

What's the relation to R2RML tests? #10

Open VladimirAlexiev opened 5 years ago

VladimirAlexiev commented 5 years ago

RML is an extension of R2RML, so I would expect that RML processors are also exercised against the R2RML test suite.

What is your policy about that?

VladimirAlexiev commented 5 years ago

@andimou replied: RML test cases are derived from #R2RML test cases but generalized for heterogeneous data. Because they were transferred from R2RML, they don't yet cover all heterogeneity challenges, but we're working on it! See more details here:

Thanks for the info!!

andimou commented 5 years ago

it should be possible to add the 4 RML processors to https://www.w3.org/TR/rdb2rdf-implementations/#R2RML-Processors, right?

hmmm yes and no :) Because the R2RML test cases were transferred to RML, the mapping rules, including the Logical Source description, are written with the RML vocabulary. So for relational databases an RML document (not an R2RML one) is expected, in which the database description is explicit, to distinguish it from other data sources. In R2RML, by contrast, the database is not mentioned at all, only a specific table.

I guess that all tools that support the corresponding RML test cases for relational databases could easily work with pure R2RML descriptions but, to the extent that I can say, they do not support pure R2RML, even though it would make sense :)

andimou commented 5 years ago

Then regarding

R2RMLTC0014b, R2RMLTC0014c, R2RMLTC0014d are missing

We deliberately did not include test cases with inverse expressions. We explained in the paper publication why:

Inverse Expressions. 3 of the R2RML test cases are designed to test the use of inverse expressions. However, inverse expressions are only used to optimize the knowledge graph generation and no differences are observed in the generated knowledge graph. Thus, whether inverse expressions are used by a processor or not cannot be verified by such test cases. Thus, we do not include them for RML.

but let us know if you disagree!

VladimirAlexiev commented 5 years ago

they do not support pure R2RML

Defining the database inside the RML doc is a strong point. Under the Open World assumption, the presence of non-R2RML triples doesn't make those documents non-R2RML.

I think the test driver could easily append a fixed ttl that defines the Logical Source. My point is that if the 4 RML tools pass formal R2RML conformance, this will be a strong point for their wider adoption.
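A minimal sketch of what such a test driver step might look like, assuming the mapping is held as a Turtle string. The function name, the triples map name, the table, and the JDBC values are all placeholders invented for illustration; only the three namespace IRIs are the published ones. Whether a given RML processor accepts the combined document has not been checked here.

```python
# Fixed Turtle fragment that attaches a Logical Source to an existing triples map.
# Turtle allows @prefix directives between statements, so appending is syntactically fine.
LOGICAL_SOURCE_TTL = """
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .

<{triples_map}> rml:logicalSource [
    rml:source <#DB_source> ;
    rr:sqlVersion rr:SQL2008 ;
    rr:tableName "{table}"
] .

<#DB_source> a d2rq:Database ;
    d2rq:jdbcDSN "{jdbc_dsn}" ;
    d2rq:jdbcDriver "{jdbc_driver}" .
"""


def append_logical_source(
    r2rml_ttl: str,
    triples_map: str = "TriplesMap1",                    # hypothetical triples map name
    table: str = "Student",                              # hypothetical table name
    jdbc_dsn: str = "jdbc:mysql://localhost:3306/test",  # hypothetical connection string
    jdbc_driver: str = "com.mysql.cj.jdbc.Driver",       # hypothetical driver class
) -> str:
    """Return the original R2RML mapping with the fixed fragment appended."""
    fragment = LOGICAL_SOURCE_TTL.format(
        triples_map=triples_map,
        table=table,
        jdbc_dsn=jdbc_dsn,
        jdbc_driver=jdbc_driver,
    )
    # The original triples are left untouched; the fragment only adds new ones.
    return r2rml_ttl.rstrip() + "\n" + fragment
```

The original R2RML triples stay in place, so the result is still an R2RML document in the Open World sense, with extra RML triples on top.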

Re R2RMLTC0014b, R2RMLTC0014c, R2RMLTC0014d: I think that rr:inverseExpression plays no role in R2RML ETL but only in NO-ETL approaches (SQL->SPARQL translators).

RMLTC0020b deals with some anomalous or unexpected IRIs generated from a column "Name":

Why does it use rml:reference even for RDBMS? Eg RMLTC0020a uses rr:template.

andimou commented 5 years ago

I think the scope of RML is ETL but not NO-ETL so from that point of view that's ok.

I don't think that the scope of RML is ETL only, I think it's a coincidence or just easier to deal with the ETL case :)

But how would you see it being applied to RML (without limiting the scope to SQL-->SPARQL translation)?

So I guess it (RMLTC0020b) is just an addition to the R2RML suite?

I think this existed in R2RML too: https://www.w3.org/2001/sw/rdb2rdf/test-cases/#R2RMLTC0020b

Why does it use rml:reference even for RDBMS? Eg RMLTC0020a uses rr:template

Well, if the template is built with a single reference to the input, e.g. "{Name}",

either using <> rr:template "{Name}"

or <> rml:reference "Name", rr:termType rr:IRI

is the same in the end, no?
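One way to probe this question is to sketch the two paths side by side. The following is a rough Python approximation based on my reading of the R2RML term generation rules: template values are percent-encoded ("IRI-safe") before the IRI is built, while a column or reference value is used as-is and must already form a valid IRI, possibly after prepending the base IRI. The function names, the base IRI, and the simplified validity check are assumptions for illustration, and an RML processor may behave differently.

```python
import re
from urllib.parse import quote, urlsplit

BASE_IRI = "http://example.com/base/"  # assumed base IRI, for illustration only


def to_iri(value: str, base: str = BASE_IRI) -> str:
    """Keep values that already have a scheme; otherwise prepend the base IRI."""
    candidate = value if urlsplit(value).scheme else base + value
    # Simplified validity check: reject whitespace and a few forbidden characters.
    if re.search(r'[\s<>"{}|\\^`]', candidate):
        raise ValueError(f"data error: not a valid IRI: {candidate!r}")
    return candidate


def from_template(template: str, row: dict, base: str = BASE_IRI) -> str:
    """Template path: each {column} value is percent-encoded before substitution.

    quote(..., safe="") is an ASCII-only approximation of the IRI-safe encoding.
    """
    expanded = re.sub(
        r"\{(\w+)\}",
        lambda m: quote(str(row[m.group(1)]), safe=""),
        template,
    )
    return to_iri(expanded, base)


def from_reference(column: str, row: dict, base: str = BASE_IRI) -> str:
    """Reference path: the raw value is used without percent-encoding."""
    return to_iri(str(row[column]), base)
```

Under these assumptions, a value such as "Juan Daniel" is percent-encoded by `from_template("{Name}", row)` but makes `from_reference("Name", row)` raise, and a value that is already an absolute IRI is kept verbatim only on the reference path. So the two forms would agree only for values that need no encoding.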

VladimirAlexiev commented 4 years ago

@andimou