isl / x3ml

X3ML Engine supports the data transformation which is part of the data provisioning and aggregation process.
Apache License 2.0

attributes with namespace are not matched #143

robcast closed 3 years ago

robcast commented 3 years ago

It seems that it is not possible to match XML attributes that have a namespace (other than the default).

I have a METS XML file containing

<mets:file ID="FILE_0001_DERIVATE" MIMETYPE="image/jpeg">
  <mets:FLocat LOCTYPE="URL" xlink:href="https://example.com/something/derivate/0001.jpg"/>
</mets:file>

I cannot access the xlink:href attribute using XPath expressions such as @xlink:href or attribute::xlink:href. After trying for many hours I found that it works if I explicitly ignore the namespace using attribute::*[local-name()='href'].
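A rough illustration of why the local-name() workaround matches: it compares only the part of the attribute name after the prefix and ignores the namespace URI entirely. The sketch below uses Python's standard library rather than the X3ML engine, and the namespace declarations on the sample element as well as the helper name `attributes_by_local_name` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Sample input modelled on the snippet above.
# The xmlns declarations are assumed; the original snippet does not show them.
DOC = """<mets:file xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink"
           ID="FILE_0001_DERIVATE" MIMETYPE="image/jpeg">
  <mets:FLocat LOCTYPE="URL" xlink:href="https://example.com/something/derivate/0001.jpg"/>
</mets:file>"""


def attributes_by_local_name(element, local_name):
    """Return the values of all attributes with this local name, in any namespace."""
    return [
        value
        for key, value in element.attrib.items()
        # ElementTree stores namespaced attribute keys as "{uri}local".
        if key.rpartition("}")[2] == local_name
    ]


root = ET.fromstring(DOC)
flocat = root[0]  # the mets:FLocat child element
hrefs = attributes_by_local_name(flocat, "href")
print(hrefs)
```

Because the namespace URI never enters the comparison, this kind of match also succeeds when the prefix is bound to a different URI than expected, which can hide a namespace problem instead of exposing it.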

ymark commented 3 years ago

Nope, you can map attributes with or without a namespace without any problems. I've created a small mapping based on your example; see below.

XML input

<root xmlns:xlink="http://www.w3.org/1999/xlink/" xmlns:mets="http://www.loc.gov/METS/v2/">
    <mets:file ID="FILE_0001_DERIVATE" MIMETYPE="image/jpeg">
        <mets:FLocat LOCTYPE="URL" xlink:href="https://example.com/something/derivate/0001.jpg"/>
    </mets:file>
</root>

X3ML Mappings

<x3ml version="1.0" source_type="xpath">
    <namespaces>
        <namespace prefix="onto" uri="http://www.example.com/ontology/"/>
        <namespace prefix="rdf" uri="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>
        <namespace prefix="rdfs" uri="http://www.w3.org/2000/01/rdf-schema#"/>
        <namespace prefix="mets" uri="http://www.loc.gov/METS/v2/"/>
        <namespace prefix="xlink" uri="http://www.w3.org/1999/xlink/"/>
    </namespaces>
    <mappings>
        <mapping>
            <domain>
                <source_node>/root/mets:file</source_node>
                <target_node>
                    <entity>
                        <type>onto:File_Resource</type>
                        <instance_generator name="UUID"/>
                        <label_generator name="Literal">
                            <arg name="text" type="xpath">@ID</arg>
                        </label_generator>
                    </entity>
                </target_node>
            </domain>
            <link>
                <path>
                    <source_relation><relation>mets:FLocat</relation></source_relation>
                    <target_relation>
                        <relationship>onto:has_location</relationship>
                    </target_relation>
                </path>
                <range>
                    <source_node>mets:FLocat</source_node>
                    <target_node>
                        <entity>
                            <type>onto:File_Location</type>
                            <instance_generator name="URIorUUID">
                                <arg name="text">@xlink:href</arg>
                            </instance_generator>
                        </entity>
                    </target_node>
                </range>
            </link>
        </mapping>
    </mappings>
</x3ml>

These generate the following output in RDF/XML

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:onto="http://www.example.com/ontology/"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <onto:File_Resource rdf:about="urn:uuid:a63c653b-2767-4eee-8bfe-394075bd3c1e">
    <onto:has_location>
      <onto:File_Location rdf:about="https://example.com/something/derivate/0001.jpg"/>
    </onto:has_location>
    <rdfs:label>FILE_0001_DERIVATE</rdfs:label>
  </onto:File_Resource>
</rdf:RDF>

Are you sure you have declared all the XML namespaces you want to use (in your case, xlink) in the namespaces section of the X3ML mapping?

robcast commented 3 years ago

Thanks @ymark, I found the problem on my side!

I tried your example and it works. Then I tried to use my own XML file and it didn't work. In the end I found a tiny difference between the xlink namespace declaration in my XML file

xmlns:xlink="http://www.w3.org/1999/xlink"

and my X3ML mapping

<namespace prefix="xlink" uri="http://www.w3.org/1999/xlink/"/>

When I remove the slash at the end of the namespace URI in my mapping, it starts to work.
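The underlying rule is that XML namespace names are compared as exact strings, so a URI with a trailing slash and one without are two different namespaces even though the prefix is the same. A minimal sketch of that behaviour, using Python's standard library rather than the X3ML engine (the helper name `get_href` and the reduced sample document are assumptions for illustration):

```python
import xml.etree.ElementTree as ET

# The document binds the xlink prefix to the URI WITHOUT a trailing slash.
DOC = """<root xmlns:xlink="http://www.w3.org/1999/xlink">
  <FLocat LOCTYPE="URL" xlink:href="https://example.com/something/derivate/0001.jpg"/>
</root>"""

flocat = ET.fromstring(DOC).find("FLocat")


def get_href(element, xlink_uri):
    """Look up the href attribute in the namespace identified by xlink_uri."""
    # ElementTree addresses namespaced attributes as "{namespace-uri}local-name".
    return element.get("{%s}href" % xlink_uri)


# URI as declared in the mapping (with slash) versus in the XML file (without).
with_slash = get_href(flocat, "http://www.w3.org/1999/xlink/")
without_slash = get_href(flocat, "http://www.w3.org/1999/xlink")

print("with trailing slash:   ", with_slash)
print("without trailing slash:", without_slash)
```

Only the lookup whose URI is character-for-character identical to the declaration in the document finds the attribute; the prefix itself plays no part in the match.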