isl / x3ml

X3ML Engine supports the data transformation which is part of the data provisioning and aggregation process.
Apache License 2.0

New Custom generator that hashes multiple parts #140

Closed: ymark closed this issue 4 years ago

ymark commented 4 years ago

Implement a new custom generator that produces hashes or random UUIDs from the contents of its arguments. The embedded hashing functionality (shorten="true", described in https://github.com/isl/x3ml/blob/master/docs/x3ml-language.md#hashed-uris-with-templates) already produces hashed URIs; the difference is that this generator supports hashing or random UUID generation for several arguments of the same generator.

ymark commented 4 years ago

The new generator has been implemented. The implementing class is gr.forth.MultiHashingGenerator.

It recognizes two special argument-name suffixes: the value of an argument whose name ends with _HASHED_CONTENTS is hashed, and a random UUID is generated for an argument whose name ends with _RANDOM_UUID. Arguments without either suffix are used as given, as the example further down shows.

An indicative definition of the custom generator follows; its arguments are explained below.

<generator name="MultiHashingGenerator" prefix="pref">
    <custom generatorClass="gr.forth.MultiHashingGenerator">
        <set-arg name="term"/>
        <set-arg name="term_HASHED_CONTENTS"/>
        <set-arg name="term_other"/>
        <set-arg name="term_RANDOM_UUID"/>
    </custom>
</generator>

The definition above declares four arguments. They are evaluated in the order in which they appear in the definition file (e.g. the generator policy file), and their values are always joined with the slash character ('/').

Given the following input:

<PERSON>
    Yannis
</PERSON>

and the following part of the mappings file:

<namespace prefix="pref" uri="http://www.example.com/"/>
...
<source_node>//PERSON</source_node>
<target_node>
    <entity>
        <type>crm:E21_Person</type>
        <instance_generator name="MultiHashingGenerator">
            <arg name="term" type="constant">person</arg>
            <arg name="term_HASHED_CONTENTS" type="xpath">text()</arg>
            <arg name="term_other" type="constant">name</arg>
            <arg name="term_RANDOM_UUID" type="constant"></arg>
        </instance_generator>
    </entity>
</target_node>

the generator produces a URI of the form:

http://www.example.com/person/F899139D-F5E1-3593-9643-1415E770C6DD/name/DED33742-B286-3519-81CD-603BCC78EE05
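To make the assembly rule concrete, here is a minimal Java sketch of how such a URI could be put together. It is not the code of gr.forth.MultiHashingGenerator: the class name MultiHashingSketch and the method buildUri are hypothetical, and so is the choice of hash. The sample segments above have the shape of version-3 (name-based) UUIDs, so the sketch assumes java.util.UUID.nameUUIDFromBytes for hashed arguments and an upper-cased UUID.randomUUID() for random ones; the real generator may do something different, and whitespace handling of the XPath result is ignored here.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.UUID;

// Hypothetical illustration only; not the actual gr.forth.MultiHashingGenerator.
class MultiHashingSketch {
    static final String HASHED_SUFFIX = "_HASHED_CONTENTS";
    static final String RANDOM_SUFFIX = "_RANDOM_UUID";

    // Joins the argument values with '/' in iteration order, so the map
    // must preserve insertion order (e.g. LinkedHashMap).
    static String buildUri(String namespace, Map<String, String> args) {
        StringJoiner path = new StringJoiner("/");
        for (Map.Entry<String, String> arg : args.entrySet()) {
            String name = arg.getKey();
            String value = arg.getValue();
            if (name.endsWith(HASHED_SUFFIX)) {
                // Assumption: a name-based (version 3) UUID serves as the hash.
                byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
                path.add(UUID.nameUUIDFromBytes(bytes).toString().toUpperCase());
            } else if (name.endsWith(RANDOM_SUFFIX)) {
                // The argument value is ignored; a fresh UUID is used instead.
                path.add(UUID.randomUUID().toString().toUpperCase());
            } else {
                // No special suffix: the value is used as given.
                path.add(value);
            }
        }
        return namespace + path.toString();
    }

    public static void main(String[] unused) {
        Map<String, String> args = new LinkedHashMap<>();
        args.put("term", "person");
        args.put("term_HASHED_CONTENTS", "Yannis");
        args.put("term_other", "name");
        args.put("term_RANDOM_UUID", "");
        System.out.println(buildUri("http://www.example.com/", args));
    }
}
```

Under these assumptions the first three path segments are identical on every run for the same input, while the last segment changes each time.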

More examples can be found in the test resources: https://github.com/isl/x3ml/tree/master/src/test/resources/generators

The new custom generator will be bundled with X3ML Engine starting with release 1.9.4.