RMLio / rmlmapper-java

The RMLMapper executes RML rules to generate high quality Linked Data from multiple originally (semi-)structured data sources
http://rml.io
MIT License

JoinCondition not working properly when working with references #186

Open johndoe888 opened 2 years ago

johndoe888 commented 2 years ago

In my XML, shown below, there are references from /A1/B2/C1REF/@pid to /A1/B1/C1/@id:

<?xml version="1.0" encoding="UTF-8"?>

<A1 name="A1">
  <B1 name="B1">
    <C1 id="C11" name="C11"/>
    <C1 id="C12" name="C12"/>
  </B1>
  <B2 name="B21">
    <C1REF pid="C11"/>
    <C1REF pid="C12"/>
  </B2>
  <B2 name="B22">
    <C1REF pid="C12"/>
  </B2>
</A1>

In my mapping, the references are resolved with a join condition (see the last triples map, <#B2>):

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ex: <http://example.com/ns#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix transit: <http://vocab.org/transit/terms/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.

@base <http://example.com/data/>.

<#A1> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "./test.xml" ;
    rml:iterator "/A1";
    rml:referenceFormulation ql:XPath;
  ];

  rr:subjectMap [
    rr:template "http://example.com/data/{/A1/@name}";
    rr:class ex:A1
  ];

  rr:predicateObjectMap [
    rr:predicate ex:contains;
    rr:objectMap [
      rr:parentTriplesMap <#B1>;
    ];
  ].

<#B1> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "./test.xml" ;
    rml:iterator "/A1/B1";
    rml:referenceFormulation ql:XPath;
  ];

  rr:subjectMap [
    rr:template "http://example.com/data/{/A1/@name}/{@name}";
    rr:class ex:B1
  ];

  rr:predicateObjectMap [
    rr:predicate ex:contains;
    rr:objectMap [
      rr:parentTriplesMap <#C1>;
    ];
  ].

<#C1> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "./test.xml" ;
    rml:iterator "/A1/B1/C1";
    rml:referenceFormulation ql:XPath;
  ];

  rr:subjectMap [
    rr:template "http://example.com/data/{/A1/@name}/{../@name}/{@name}";
    rr:class ex:C1
  ].

<#B2> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "./test.xml" ;
    rml:iterator "/A1/B2";
    rml:referenceFormulation ql:XPath;
  ];

  rr:subjectMap [
    rr:template "http://example.com/data/{/A1/@name}/{@name}";
    rr:class ex:B2
  ];

  rr:predicateObjectMap [
    rr:predicate ex:contains;
    rr:objectMap [
      rr:parentTriplesMap <#C1>;
      rr:joinCondition [
        rr:parent "@id";
        rr:child "C1REF/@pid";
      ];
    ];
  ].

This does not work if there are multiple references beneath a <B2> tag, because the equality check executed in the background looks like the following (log snippet). It compares [C11] and [C12] with the whole list [C11, C12], which of course never matches.

    Line 1075: 15:26:04.741 [main] DEBUG b.u.i.k.functions.agent.AgentImpl   .execute(34) - Executing function 'http://example.com/idlab/function/equal' with arguments '('http://users.ugent.be/~bjdmeest/function/grel.ttl#valueParameter' -> '[C11]')('http://users.ugent.be/~bjdmeest/function/grel.ttl#valueParameter2' -> '[C11, C12]')'
    Line 1084: 15:26:04.741 [main] DEBUG b.u.i.k.functions.agent.AgentImpl   .execute(34) - Executing function 'http://example.com/idlab/function/equal' with arguments '('http://users.ugent.be/~bjdmeest/function/grel.ttl#valueParameter' -> '[C12]')('http://users.ugent.be/~bjdmeest/function/grel.ttl#valueParameter2' -> '[C11, C12]')'
    Line 1087: 15:26:04.741 [main] DEBUG b.u.i.k.functions.agent.AgentImpl   .execute(34) - Executing function 'http://example.com/idlab/function/equal' with arguments '('http://users.ugent.be/~bjdmeest/function/grel.ttl#valueParameter' -> '[C11]')('http://users.ugent.be/~bjdmeest/function/grel.ttl#valueParameter2' -> '[C12]')'

Is this a bug? If it is not, how can I work around this behavior?
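
One possible workaround, sketched here as an untested assumption and not as confirmed RMLMapper behavior, is to iterate over each C1REF element instead of over B2, so that the child side of the join yields a single value per iteration. The triples map name <#B2Ref> is hypothetical; the prefixes and the <#C1> map are the ones from the mapping above, and the sketch assumes that the relative XPath "../@name" resolves to the enclosing B2 element in the same way it resolves to B1 in <#C1>.

```turtle
# Hypothetical replacement for the predicateObjectMap in <#B2>:
# one iteration per C1REF, so rr:child yields a single @pid value.
<#B2Ref> a rr:TriplesMap;
  rml:logicalSource [
    rml:source "./test.xml" ;
    rml:iterator "/A1/B2/C1REF";
    rml:referenceFormulation ql:XPath;
  ];

  # Same subject IRI as <#B2>, built from the parent B2 element.
  rr:subjectMap [
    rr:template "http://example.com/data/{/A1/@name}/{../@name}";
  ];

  rr:predicateObjectMap [
    rr:predicate ex:contains;
    rr:objectMap [
      rr:parentTriplesMap <#C1>;
      rr:joinCondition [
        rr:parent "@id";
        rr:child "@pid";
      ];
    ];
  ].
```

With this layout <#B2> would keep only its subject map and rr:class, and the join would compare one @pid with one @id at a time, for example [C11] with [C11], rather than a single value with the list [C11, C12].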