carml / carml

A pretty sweet RML engine, for RDF.

LogicalSource For Root Node Ignored When Mapping XML With Multiple Logical Sources #163

Closed by tobiasschweizer 2 years ago

tobiasschweizer commented 2 years ago

Hi there

I have an XML file that I map using two logical sources with different iterators: one from the root element and one from a nested element.

test.xml: test.xml.txt

Mapping:

PREFIX rr: <http://www.w3.org/ns/r2rml#>
PREFIX rml: <http://semweb.mmlab.be/ns/rml#>
PREFIX ql: <http://semweb.mmlab.be/ns/ql#>
PREFIX carml: <http://carml.taxonic.com/carml/>
PREFIX dct: <http://purl.org/dc/terms/>
PREFIX premis: <http://id.loc.gov/vocabulary/preservation/>
PREFIX schema: <http://schema.org/>
@base <http://example.com/ns#>.

<#projectLogicalSource> a rml:BaseSource ;
    rml:source "test.xml";
    rml:referenceFormulation ql:XPath ;
    rml:iterator "/project" .

<#organizationLogicalSource> a rml:BaseSource ;
    rml:source "test.xml";
    rml:referenceFormulation ql:XPath ;
    rml:iterator "/project/relations/associations/organization" .

<#project> a rr:TriplesMap ;
    rml:logicalSource <#projectLogicalSource> ;

    rr:subjectMap [
        rr:template "https://data.connectome.ch/project/{id}" ;
        rr:class schema:ResearchProject ;
    ] ;

    rr:predicateObjectMap [
        rr:predicate schema:name ;
        rr:objectMap [
            rml:reference "title" ;
        ];
    ] .

<#organization> a rr:TriplesMap ;
    rml:logicalSource <#organizationLogicalSource> ;

    rr:subjectMap [
        rr:template "https://data.connectome.ch/organization/{id}" ;
        rr:class schema:Organization ;
    ] ;

    rr:predicateObjectMap [
        rr:predicate schema:name ;
        rr:objectMap [
            rml:reference "legalName" ;
        ];
    ] .

I tried the same mapping with CARML (0.4.1, using the JAR version) and RML Mapper.

CARML output:

<https://data.connectome.ch/organization/999654744> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
<https://data.connectome.ch/organization/999654744> <http://schema.org/name> "REGION HOVEDSTADEN" .
<https://data.connectome.ch/organization/999978433> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
<https://data.connectome.ch/organization/999978433> <http://schema.org/name> "LUDWIG-MAXIMILIANS-UNIVERSITAET MUENCHEN" .
<https://data.connectome.ch/organization/999844670> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
<https://data.connectome.ch/organization/999844670> <http://schema.org/name> "OULUN YLIOPISTO" .
<https://data.connectome.ch/organization/999993468> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
<https://data.connectome.ch/organization/999993468> <http://schema.org/name> "IMPERIAL COLLEGE OF SCIENCE TECHNOLOGY AND MEDICINE" .
<https://data.connectome.ch/organization/963412440> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
<https://data.connectome.ch/organization/963412440> <http://schema.org/name> "ABBOTT LABORATORIES SA" .
<https://data.connectome.ch/organization/999994729> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
<https://data.connectome.ch/organization/999994729> <http://schema.org/name> "HELMHOLTZ ZENTRUM MUENCHEN DEUTSCHES FORSCHUNGSZENTRUM FUER GESUNDHEIT UND UMWELT GMBH\n                " .
<https://data.connectome.ch/organization/999882015> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
<https://data.connectome.ch/organization/999882015> <http://schema.org/name> "UNIVERSIDAD DE GRANADA" .
<https://data.connectome.ch/organization/998303340> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
<https://data.connectome.ch/organization/998303340> <http://schema.org/name> "SAMFUNDET FOLKHALSAN I SVENSKA FINLAND RF" .
<https://data.connectome.ch/organization/999749610> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
<https://data.connectome.ch/organization/999749610> <http://schema.org/name> "BRUNEL UNIVERSITY LONDON" .
<https://data.connectome.ch/organization/999988424> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
<https://data.connectome.ch/organization/999988424> <http://schema.org/name> "ERASMUS UNIVERSITAIR MEDISCH CENTRUM ROTTERDAM" .
<https://data.connectome.ch/organization/999975620> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
<https://data.connectome.ch/organization/999975620> <http://schema.org/name> "UNIVERSITY COLLEGE LONDON" .
<https://data.connectome.ch/organization/998732274> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
<https://data.connectome.ch/organization/998732274> <http://schema.org/name> "ACADEMISCH MEDISCH CENTRUM BIJ DE UNIVERSITEIT VAN AMSTERDAM" .
<https://data.connectome.ch/organization/999744954> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
<https://data.connectome.ch/organization/999744954> <http://schema.org/name> "BETA TECHNOLOGY LTD" .
<https://data.connectome.ch/organization/968960452> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
<https://data.connectome.ch/organization/968960452> <http://schema.org/name> "LABORATORIOS ORDESA SL" .

RML Mapper output:

<https://data.connectome.ch/organization/999654744> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<https://data.connectome.ch/organization/999654744> <http://schema.org/name> "REGION HOVEDSTADEN".
<https://data.connectome.ch/organization/999978433> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<https://data.connectome.ch/organization/999978433> <http://schema.org/name> "LUDWIG-MAXIMILIANS-UNIVERSITAET MUENCHEN".
<https://data.connectome.ch/organization/999844670> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<https://data.connectome.ch/organization/999844670> <http://schema.org/name> "OULUN YLIOPISTO".
<https://data.connectome.ch/organization/999993468> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<https://data.connectome.ch/organization/999993468> <http://schema.org/name> "IMPERIAL COLLEGE OF SCIENCE TECHNOLOGY AND MEDICINE".
<https://data.connectome.ch/organization/963412440> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<https://data.connectome.ch/organization/963412440> <http://schema.org/name> "ABBOTT LABORATORIES SA".
<https://data.connectome.ch/organization/999994729> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<https://data.connectome.ch/organization/999994729> <http://schema.org/name> "HELMHOLTZ ZENTRUM MUENCHEN DEUTSCHES FORSCHUNGSZENTRUM FUER GESUNDHEIT UND UMWELT GMBH\n                ".
<https://data.connectome.ch/organization/999882015> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<https://data.connectome.ch/organization/999882015> <http://schema.org/name> "UNIVERSIDAD DE GRANADA".
<https://data.connectome.ch/organization/998303340> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<https://data.connectome.ch/organization/998303340> <http://schema.org/name> "SAMFUNDET FOLKHALSAN I SVENSKA FINLAND RF".
<https://data.connectome.ch/organization/999749610> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<https://data.connectome.ch/organization/999749610> <http://schema.org/name> "BRUNEL UNIVERSITY LONDON".
<https://data.connectome.ch/organization/999988424> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<https://data.connectome.ch/organization/999988424> <http://schema.org/name> "ERASMUS UNIVERSITAIR MEDISCH CENTRUM ROTTERDAM".
<https://data.connectome.ch/organization/999975620> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<https://data.connectome.ch/organization/999975620> <http://schema.org/name> "UNIVERSITY COLLEGE LONDON".
<https://data.connectome.ch/organization/998732274> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<https://data.connectome.ch/organization/998732274> <http://schema.org/name> "ACADEMISCH MEDISCH CENTRUM BIJ DE UNIVERSITEIT VAN AMSTERDAM".
<https://data.connectome.ch/organization/999744954> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<https://data.connectome.ch/organization/999744954> <http://schema.org/name> "BETA TECHNOLOGY LTD".
<https://data.connectome.ch/organization/968960452> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization>.
<https://data.connectome.ch/organization/968960452> <http://schema.org/name> "LABORATORIOS ORDESA SL".
<https://data.connectome.ch/project/633595> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/ResearchProject>.
<https://data.connectome.ch/project/633595> <http://schema.org/name> "Understanding the dynamic determinants of glucose homeostasis and social capability to promote Healthy and\n        active aging\n    ".

The difference is that RML Mapper generates two additional statements, both for <#project>.

With CARML, no RDF statements are written for <#project> unless I remove the <#organization> triples map. Am I missing something obvious? The two logical sources are not connected by a join or anything like that, so I am a bit confused.
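As a sanity check that the two iterators select independent sets of nodes, here is a minimal Python sketch. The XML is a hypothetical stand-in reconstructed from the iterators and references in the mapping (the real test.xml is only attached, not shown), the project title is a placeholder, and `select` is an illustrative helper, not part of CARML or RML Mapper.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily reduced stand-in for test.xml (assumed structure).
SAMPLE_XML = """
<project>
    <id>633595</id>
    <title>Placeholder title</title>
    <relations>
        <associations>
            <organization>
                <id>999654744</id>
                <legalName>REGION HOVEDSTADEN</legalName>
            </organization>
            <organization>
                <id>999978433</id>
                <legalName>LUDWIG-MAXIMILIANS-UNIVERSITAET MUENCHEN</legalName>
            </organization>
        </associations>
    </relations>
</project>
"""


def select(root, iterator):
    """Return the elements selected by a simple absolute path such as /a/b/c.

    ElementTree only evaluates paths relative to an element, so the first
    step is checked against the root tag and the rest is passed to findall().
    """
    steps = iterator.strip("/").split("/")
    if steps[0] != root.tag:
        return []
    if len(steps) == 1:
        return [root]
    return root.findall("/".join(steps[1:]))


root = ET.fromstring(SAMPLE_XML.strip())

# One record per iterator match, exactly as each triples map would see them.
projects = select(root, "/project")
organizations = select(root, "/project/relations/associations/organization")

project_ids = [p.findtext("id") for p in projects]
organization_ids = [o.findtext("id") for o in organizations]

print(project_ids)
print(organization_ids)
```

Each iterator yields its own records regardless of the other, which is why I would expect both triples maps to produce statements.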

Thanks for any support on that.

pmaria commented 2 years ago

@tobiasschweizer I found the issue. The XPath implementation was terminating prematurely: evaluation was marked as complete before the closing tag needed to match the root expression had been reached. Thanks for noticing this one.
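To illustrate this class of bug, here is a simplified Python sketch. It is not the actual CARML code: `stream_matches`, its `stop_before_root_end` flag, and the sample document are all hypothetical. The point it shows is that in a streaming evaluation an element is only complete at its closing tag, so the root element is always the last match, and stopping just before its closing tag silently drops it.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical minimal document: a root with two nested organizations.
SAMPLE_XML = (
    b"<project><id>633595</id><relations><associations>"
    b"<organization><id>999654744</id></organization>"
    b"<organization><id>999978433</id></organization>"
    b"</associations></relations></project>"
)

PROJECT = "/project"
ORGANIZATION = "/project/relations/associations/organization"


def stream_matches(xml_bytes, paths, stop_before_root_end):
    """Stream the document and return the wanted paths in match order.

    An element counts as matched on its "end" event, when its content is
    complete. With stop_before_root_end=True the loop is left when only the
    root is still open, simulating premature termination.
    """
    wanted = set(paths)
    stack, matched = [], []
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in events:
        if event == "start":
            stack.append(elem.tag)
            continue
        if stop_before_root_end and len(stack) == 1:
            break  # simulated bug: the root's closing tag is never processed
        path = "/" + "/".join(stack)
        stack.pop()
        if path in wanted:
            matched.append(path)
    return matched


complete = stream_matches(SAMPLE_XML, [PROJECT, ORGANIZATION], False)
truncated = stream_matches(SAMPLE_XML, [PROJECT, ORGANIZATION], True)

print(complete)
print(truncated)
```

In the truncated run only the nested organization matches survive, which mirrors the missing <#project> statements reported above.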

tobiasschweizer commented 2 years ago

I really did not want to interrupt your holiday, sorry ...

Great that you found the bug and fixed it, thanks a lot! Let me know if I can help with integrating the fix into CARML Jar, but let's do that once you're back :-)

pmaria commented 2 years ago

no worries :)

tobiasschweizer commented 2 years ago

I've locally updated my carml-jar with carml 0.4.2 and it works now as expected! :-)