RMLio / rmlmapper-java

The RMLMapper executes RML rules to generate high quality Linked Data from multiple originally (semi-)structured data sources
http://rml.io
MIT License

Hashing each object value #232

Open namedgraph opened 7 months ago

namedgraph commented 7 months ago

Hi. It's me again. I'm trying to combine the data structure from #230 and apply the hashing from #231.

[
  { "id": "parent_1", "children": [ "child_1A", "child_1B" ] },
  { "id": "parent_2", "children": [ "child_2A" ] }
]

I've used your suggested mapping but I want to hash the children URIs (subjects work, so only focusing on objects here). This is the mapping I'm testing in Matey:

prefixes:
  ex: http://example.org/

sources:
  parents:
    access: data.json
    referenceFormulation: jsonpath
    iterator: "$[*]"
  children:
    access: data.json
    referenceFormulation: jsonpath
    iterator: "$[*].children[*]"
mappings:
  Parent:
    sources: parents
    s: ex:$(id)
    po:
      - [ ex:label, $(id) ]
      - p: ex:child
        o:
          type: iri
          function: grel:array_join
          parameters:
            - [ grel:p_array_a, "ex:" ]
            - parameter: grel:p_array_a
              value:
                function: grel:string_md5
                parameters:
                  - [ grel:valueParameter, "$(children[*])" ]
  Child:
    sources: children
    s: ex:$(@)
    po:
      - [ ex:label, $(@) ]

The output I get:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex: <http://example.org/> .

ex:parent_1 ex:label "parent_1" ;
    ex:child ex:16cf9a694c5ec6adab37c68c462a4470 .

ex:parent_2 ex:label "parent_2" ;
    ex:child ex:bc1e1df4ec88d9826ec785c4ff545f9e .

ex:child_1A ex:label "child_1A" .

ex:child_1B ex:label "child_1B" .

ex:child_2A ex:label "child_2A" .

Note that the cardinality of ex:child is incorrect: each parent now gets only a single ex:child value, whereas without hashing the output was:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ex: <http://example.org/> .

ex:parent_1 ex:label "parent_1" ;
    ex:child ex:child_1A, ex:child_1B .

ex:parent_2 ex:label "parent_2" ;
    ex:child ex:child_2A .

ex:child_1A ex:label "child_1A" .

ex:child_1B ex:label "child_1B" .

ex:child_2A ex:label "child_2A" .

Attempting to execute the same kind of pattern with rmlmapper-java seems to fail:

12:03:07.577 [main] ERROR b.u.r.f.DynamicMultipleRecordsFunctionExecutor.execute(87) - Function 'http://users.ugent.be/~bjdmeest/function/grel.ttl#string_md5' failed to execute with Cannot invoke "Object.toString()" because "s" is null

namedgraph commented 7 months ago

@bjdmeest any advice?

bjdmeest commented 7 months ago

You found a bug in RMLMapper-JAVA! A nested function that takes an array as input apparently outputs only the first transformed element, not the entire array. I'm not sure how your final error with RMLMapper-JAVA came to be, but I can reproduce the first output locally. So we've got a nice little test case, thanks for that. We'll add fixing it to our planning!

Disclaimer: as we're a research group and aren't getting paid for this kind of maintenance, we can't make any promises as to when this will be fixed, but as always, we do our best 💪
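
For reference, the difference between the expected and the observed behaviour can be sketched outside RMLMapper. The Python below is only an illustration, not RMLMapper code: the helper names (md5_hex, child_iris_expected, child_iris_first_only) are hypothetical, and it assumes that grel:string_md5 hashes the UTF-8 bytes of the string and returns a lowercase hex digest, which the thread does not confirm. It hashes every child of each parent, then mimics the described bug by keeping only the first transformed element.

```python
import hashlib

# Sample records copied from the data.json shown in the issue.
records = [
    {"id": "parent_1", "children": ["child_1A", "child_1B"]},
    {"id": "parent_2", "children": ["child_2A"]},
]

# Namespace bound to the ex: prefix in the mapping.
EX = "http://example.org/"


def md5_hex(value: str) -> str:
    """Hex MD5 digest of a string (assumption: UTF-8 encoding, lowercase hex)."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def child_iris_expected(record: dict) -> list[str]:
    """Expected behaviour: one hashed IRI per element of the children array."""
    return [EX + md5_hex(child) for child in record["children"]]


def child_iris_first_only(record: dict) -> list[str]:
    """Mimics the described bug: only the first transformed element survives."""
    return child_iris_expected(record)[:1]


for record in records:
    expected = child_iris_expected(record)
    observed = child_iris_first_only(record)
    # Print how many ex:child objects each variant yields per parent.
    print(record["id"], len(expected), len(observed))
```

Under these assumptions, parent_1 should yield two ex:child objects but yields one when only the first element is kept, while parent_2 is unaffected because it has a single child. That matches the cardinality difference reported above.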

namedgraph commented 7 months ago

Thanks for looking into it.