isl / x3ml

X3ML Engine supports the data transformation which is part of the data provisioning and aggregation process.
Apache License 2.0
New generator that re-uses an existing URI, otherwise it will create one #129

Closed ymark closed 6 years ago

ymark commented 6 years ago

Design and develop a new generator that works similarly to URIorUUID. More specifically, it will reuse an existing URI from the input; if no such URI exists, it will construct one using a specific pattern.

For example for the given input:

<ROOT>
    <COIN>
        <ID>http://coin_1</ID>
    </COIN>
    <COIN>
        <ID>2</ID>
    </COIN>
</ROOT>

The new generator will be able to re-use or construct a URI using the XPath expression //ROOT/COIN/ID/text().

The generated URIs could be:

ymark commented 6 years ago

The new generator has been implemented. The class implementing the new functionality is gr.forth.UriExistingOrNew.

The generator has the following mandatory arguments:

The uri argument retrieves the value of an existing URI, while the text# and uri_separator# arguments are used to construct a new one. The latter are combined using the following scheme: namespace_prefix+text1+uri_separator1+text2+uri_separator2+...

Important: The number of text arguments must equal the number of uri_separator arguments.

A new URI is generated whenever the input does not provide an existing valid URI.
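The re-use-or-construct rule described above can be sketched in a few lines of Python. This is only an illustration, not the code of gr.forth.UriExistingOrNew: the function names are hypothetical, and the validity check (an absolute URI with a scheme and an authority part) is an assumption, since the thread does not say how the generator decides that a URI is valid.

```python
from urllib.parse import urlparse


def looks_like_uri(value):
    # Assumption: "valid" means an absolute URI with a scheme and an authority,
    # e.g. "http://ID-100". The real generator may apply a different check.
    parts = urlparse(value.strip())
    return bool(parts.scheme) and bool(parts.netloc)


def uri_existing_or_new(uri, namespace, texts, separators):
    # The thread requires as many text arguments as uri_separator arguments.
    if len(texts) != len(separators):
        raise ValueError("text and uri_separator arguments must be paired")
    # Re-use the value found in the input when it already is a URI.
    if uri and looks_like_uri(uri):
        return uri.strip()
    # Otherwise build: namespace + text1 + separator1 + text2 + separator2 + ...
    return namespace + "".join(t + s for t, s in zip(texts, separators))


reused = uri_existing_or_new("http://ID-100", "http://example/",
                             ["http://ID-100", "C1"], ["/", ""])
built = uri_existing_or_new("200", "http://example/", ["200", "C2"], ["/", ""])
```

Here reused keeps the input value unchanged, while built is assembled from the namespace and the text/separator pairs.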

The namespace to be used with all the new URIs can be declared in the generation-policy file, within the declaration of the generator (under the attribute prefix).

The following example demonstrates the use of the new generator.

Input File:

<?xml version="1.0" encoding="UTF-8"?>
<dataroot>
    <COIN>
        <ID>http://ID-100</ID>
        <COUNTRY_ID>C1</COUNTRY_ID>
    </COIN> 
    <COIN>
        <ID>200</ID>
        <COUNTRY_ID>C2</COUNTRY_ID>
    </COIN>
    <COIN>
        <ID>300</ID>
        <COUNTRY_ID>C3</COUNTRY_ID>
    </COIN>
</dataroot>

Generator-Policy File:

<?xml version="1.0" encoding="UTF-8"?>
<generator_policy>
    <generator name="UriExistingOrNew" prefix="ex">
        <custom generatorClass="gr.forth.UriExistingOrNew">
            <set-arg name="uri" type="xpath"/>
            <set-arg name="text1"/>
            <set-arg name="uri_separator1" type="constant"/>
            <set-arg name="text2"/>
            <set-arg name="uri_separator2" type="constant"/>
        </custom>
    </generator>
</generator_policy>

X3ML Mappings:

<?xml version="1.0" encoding="UTF-8"?>
<x3ml version="1.0" source_type="xpath">
    <namespaces>
        <namespace prefix="crm" uri="http://www.cidoc-crm.org/cidoc-crm/"/>
        <namespace prefix="rdf" uri="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>
        <namespace prefix="rdfs" uri="http://www.w3.org/2000/01/rdf-schema#"/>
        <namespace prefix="ex" uri="http://example/"/>
    </namespaces>
    <mappings>
        <mapping>
            <domain>
                <source_node>//COIN</source_node>
                <target_node>
                    <entity>
                        <type>crm:E22_Man-Made_Object</type>
                        <instance_generator name="UriExistingOrNew">
                            <arg name="uri" type="xpath">ID/text()</arg>
                            <arg name="text1" type="xpath">ID/text()</arg>
                            <arg name="uri_separator1" type="constant">/</arg>
                            <arg name="text2" type="xpath">COUNTRY_ID/text()</arg>
                            <arg name="uri_separator2" type="constant"></arg>
                        </instance_generator>
                    </entity>
                </target_node>
            </domain>
        </mapping>
    </mappings>
</x3ml>

RDF Output:

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:ex="http://example/"
    xmlns:crm="http://www.cidoc-crm.org/cidoc-crm/"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <crm:E22_Man-Made_Object rdf:about="http://example/300/C3"/>
  <crm:E22_Man-Made_Object rdf:about="http://ID-100"/>
  <crm:E22_Man-Made_Object rdf:about="http://example/200/C2"/>
</rdf:RDF>
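As a rough cross-check of the example, the following Python sketch applies the same argument bindings (uri and text1 from ID, "/" as uri_separator1, text2 from COUNTRY_ID, an empty uri_separator2) to the input file. It does not call the X3ML Engine; the URI validity test and the helper name are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

INPUT = """<dataroot>
    <COIN><ID>http://ID-100</ID><COUNTRY_ID>C1</COUNTRY_ID></COIN>
    <COIN><ID>200</ID><COUNTRY_ID>C2</COUNTRY_ID></COIN>
    <COIN><ID>300</ID><COUNTRY_ID>C3</COUNTRY_ID></COIN>
</dataroot>"""

# Namespace bound to the prefix "ex" in the X3ML mappings above.
NAMESPACE = "http://example/"


def coin_uri(coin):
    # Hypothetical helper mirroring the instance_generator arguments.
    id_text = coin.findtext("ID", default="").strip()
    country = coin.findtext("COUNTRY_ID", default="").strip()
    parsed = urlparse(id_text)
    # Assumption: a value with both scheme and authority counts as an existing URI.
    if parsed.scheme and parsed.netloc:
        return id_text
    # namespace + text1 + uri_separator1 + text2 + uri_separator2
    return NAMESPACE + id_text + "/" + country + ""


uris = [coin_uri(coin) for coin in ET.fromstring(INPUT).iter("COIN")]
for uri in uris:
    print(uri)
```

The sketch visits the COIN elements in document order, so the URIs appear in a different order than in the RDF output, where statement order carries no meaning.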

The new functionality will be made available from version 1.9.1 onwards.