RMLio / yarrrml-parser

A YARRRML parser library and CLI in Javascript
MIT License

str_split with output type iri only returns first element when a condition is used #146

Open SvenLieber opened 2 years ago

SvenLieber commented 2 years ago

Issue type: Bug

Description

Steps

The following YARRRML mapping does not produce http://example.org/nonFiction:

prefixes:
  ex: "http://example.org/"
  schema: "http://schema.org/"
  idlab-fn: "http://example.com/idlab/function/"
  grel: "http://users.ugent.be/~bjdmeest/function/grel.ttl#"

mappings:

  classification:
    sources:
      - access: "split-test-data.csv"
        referenceFormulation: csv
        delimiter: ','
    s: ex:book_$(id)
    po:
     - [a, schema:Book]
     - [schema:name, $(name)]
     - p: schema:about
       o:
         function: grel:string_split
         parameters:
           - [grel:valueParameter, $(genre)]
           - [grel:p_string_sep, ';']
         type: iri
       condition:
         function: idlab-fn:notEqual
         parameters:
           - [grel:valueParameter, $(genre)]
           - [grel:valueParameter2, ""]

Test data to reproduce

id,name,genre
1,My first book,http://example.org/history;http://example.org/nonFiction
2,My second book,http://example.org/history
3,My third book,

Result, in which http://example.org/nonFiction is missing for ex:book_1

@prefix ex: <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix schema: <http://schema.org/> .

ex:book_1 a schema:Book;
  schema:about ex:history;
  schema:name "My first book" .

ex:book_2 a schema:Book;
  schema:about ex:history;
  schema:name "My second book" .

ex:book_3 a schema:Book;
  schema:name "My third book" .

If the source data contains no empty values, no condition is needed, and the same mapping with split but without the condition produces all values.
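To make the intended outcome concrete, here is a minimal Python sketch of the values I would expect for schema:about per row. It is not how the parser or the RML processor works internally; the helper name `expected_about` is made up for illustration, and it simply mirrors the notEqual condition and the split on ';' from the mapping above.

```python
import csv
import io

# Test data copied from this issue.
DATA = """id,name,genre
1,My first book,http://example.org/history;http://example.org/nonFiction
2,My second book,http://example.org/history
3,My third book,
"""


def expected_about(genre, sep=";"):
    """Hypothetical helper: IRIs the mapping is intended to emit for one genre cell."""
    if genre == "":
        # Mirrors the idlab-fn:notEqual condition: an empty cell yields nothing.
        return []
    # Mirrors grel:string_split: every element should become its own object.
    return genre.split(sep)


expected = {
    row["id"]: expected_about(row["genre"])
    for row in csv.DictReader(io.StringIO(DATA))
}

for book_id, iris in expected.items():
    print(book_id, iris)
```

With this reading, book 1 should get two schema:about values, book 2 one, and book 3 none.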

The RML produced looks like the following:

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix fnml: <http://semweb.mmlab.be/ns/fnml#>.
@prefix fno: <https://w3id.org/function/ontology#>.
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#>.
@prefix void: <http://rdfs.org/ns/void#>.
@prefix dc: <http://purl.org/dc/terms/>.
@prefix foaf: <http://xmlns.com/foaf/0.1/>.
@prefix rml: <http://semweb.mmlab.be/ns/rml#>.
@prefix ql: <http://semweb.mmlab.be/ns/ql#>.
@prefix : <http://mapping.example.com/>.
@prefix ex: <http://example.org/>.
@prefix schema: <http://schema.org/>.
@prefix idlab-fn: <http://example.com/idlab/function/>.
@prefix grel: <http://users.ugent.be/~bjdmeest/function/grel.ttl#>.

:rules_000 a void:Dataset;
    void:exampleResource :map_classification_000.
:map_classification_000 rml:logicalSource :source_000.
:source_000 a rml:LogicalSource;
    rml:source "split-test-data.csv";
    rml:referenceFormulation ql:CSV.
:map_classification_000 a rr:TriplesMap;
    rdfs:label "classification".
:s_000 a rr:SubjectMap.
:map_classification_000 rr:subjectMap :s_000.
:s_000 rr:template "http://example.org/book_{id}".
:pom_000 a rr:PredicateObjectMap.
:map_classification_000 rr:predicateObjectMap :pom_000.
:pm_000 a rr:PredicateMap.
:pom_000 rr:predicateMap :pm_000.
:pm_000 rr:constant rdf:type.
:pom_000 rr:objectMap :om_000.
:om_000 a rr:ObjectMap;
    rr:constant "http://schema.org/Book";
    rr:termType rr:IRI.
:pom_001 a rr:PredicateObjectMap.
:map_classification_000 rr:predicateObjectMap :pom_001.
:pm_001 a rr:PredicateMap.
:pom_001 rr:predicateMap :pm_001.
:pm_001 rr:constant schema:name.
:pom_001 rr:objectMap :om_001.
:om_001 a rr:ObjectMap;
    rml:reference "name";
    rr:termType rr:Literal.
:pom_002 a rr:PredicateObjectMap.
:map_classification_000 rr:predicateObjectMap :pom_002.
:pm_002 a rr:PredicateMap.
:pom_002 rr:predicateMap :pm_002.
:pm_002 rr:constant schema:about.
:pom_002 rr:objectMap :om_002.
:om_002 a fnml:FunctionTermMap;
    rr:termType rr:IRI;
    fnml:functionValue :fn_000.
:fn_000 rml:logicalSource :source_000;
    rr:predicateObjectMap :pomexec_000.
:pomexec_000 rr:predicateMap :pmexec_000.
:pmexec_000 rr:constant fno:executes.
:pomexec_000 rr:objectMap :omexec_000.
:omexec_000 rr:constant "http://example.com/idlab/function/trueCondition";
    rr:termType rr:IRI.
:fn_000 rr:predicateObjectMap :pom_003.
:pom_003 a rr:PredicateObjectMap;
    rr:predicateMap :pm_003.
:pm_003 a rr:PredicateMap;
    rr:constant idlab-fn:strBoolean.
:pom_003 rr:objectMap :om_003.
:om_003 a rr:ObjectMap, fnml:FunctionTermMap;
    fnml:functionValue :fn_001.
:fn_001 rml:logicalSource :source_000;
    rr:predicateObjectMap :pomexec_001.
:pomexec_001 rr:predicateMap :pmexec_001.
:pmexec_001 rr:constant fno:executes.
:pomexec_001 rr:objectMap :omexec_001.
:omexec_001 rr:constant "http://example.com/idlab/function/notEqual";
    rr:termType rr:IRI.
:fn_001 rr:predicateObjectMap :pom_004.
:pom_004 a rr:PredicateObjectMap;
    rr:predicateMap :pm_004.
:pm_004 a rr:PredicateMap;
    rr:constant grel:valueParameter.
:pom_004 rr:objectMap :om_004.
:om_004 a rr:ObjectMap;
    rml:reference "genre";
    rr:termType rr:Literal.
:fn_001 rr:predicateObjectMap :pom_005.
:pom_005 a rr:PredicateObjectMap;
    rr:predicateMap :pm_005.
:pm_005 a rr:PredicateMap;
    rr:constant grel:valueParameter2.
:pom_005 rr:objectMap :om_005.
:om_005 a rr:ObjectMap;
    rr:constant "";
    rr:termType rr:Literal.
:fn_000 rr:predicateObjectMap :pom_006.
:pom_006 a rr:PredicateObjectMap;
    rr:predicateMap :pm_006.
:pm_006 a rr:PredicateMap;
    rr:constant idlab-fn:str.
:pom_006 rr:objectMap :om_006.
:om_006 a rr:ObjectMap, fnml:FunctionTermMap;
    rr:termType rr:IRI;
    fnml:functionValue :fn_002.
:fn_002 rml:logicalSource :source_000;
    rr:predicateObjectMap :pomexec_002.
:pomexec_002 rr:predicateMap :pmexec_002.
:pmexec_002 rr:constant fno:executes.
:pomexec_002 rr:objectMap :omexec_002.
:omexec_002 rr:constant "http://users.ugent.be/~bjdmeest/function/grel.ttl#string_split";
    rr:termType rr:IRI.
:fn_002 rr:predicateObjectMap :pom_007.
:pom_007 a rr:PredicateObjectMap;
    rr:predicateMap :pm_007.
:pm_007 a rr:PredicateMap;
    rr:constant grel:valueParameter.
:pom_007 rr:objectMap :om_007.
:om_007 a rr:ObjectMap;
    rml:reference "genre";
    rr:termType rr:Literal.
:fn_002 rr:predicateObjectMap :pom_008.
:pom_008 a rr:PredicateObjectMap;
    rr:predicateMap :pm_008.
:pm_008 a rr:PredicateMap;
    rr:constant grel:p_string_sep.
:pom_008 rr:objectMap :om_008.
:om_008 a rr:ObjectMap;
    rr:constant ";";
    rr:termType rr:Literal.

Environment

pheyvaer commented 2 years ago

@SvenLieber I will have a look at it. Is this urgent?

SvenLieber commented 2 years ago

Hi @pheyvaer, thanks! I would say medium priority, but given my late answer, probably even low. For now I use a workaround in such cases: some pre-processing and an additional mapping that avoids str_split.

Instead of one mapping with a single input such as

id,prop1,prop2
myID,val1,val2.1;val2.2;val2.3

to create the triples myID prop1 val1, myID prop2 val2.1, myID prop2 val2.2 and myID prop2 val2.3,

I have one mapping for all "regular" columns (e.g. myID prop1 val1) and create a separate input like this:

id,prop2
myID,val2.1
myID,val2.2
myID,val2.3

with a separate mapping where I do not have to use the str_split function.
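The pre-processing step can be sketched roughly as follows. This is only an illustration of the idea, not my exact script: the function name `explode_column` is hypothetical, and it assumes ';' as the separator and that empty values should be dropped.

```python
import csv
import io


def explode_column(rows, key, column, sep=";"):
    """Turn one row with a multi-valued cell into one row per value.

    Hypothetical helper; empty values are skipped, so no condition
    is needed in the mapping afterwards.
    """
    out = []
    for row in rows:
        for value in row[column].split(sep):
            if value:
                out.append({key: row[key], column: value})
    return out


# The single input from the example above.
SOURCE = "id,prop1,prop2\nmyID,val1,val2.1;val2.2;val2.3\n"

rows = list(csv.DictReader(io.StringIO(SOURCE)))
exploded = explode_column(rows, "id", "prop2")

# Write the separate input that the second mapping reads.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "prop2"], lineterminator="\n")
writer.writeheader()
writer.writerows(exploded)
result = buffer.getvalue()
print(result)
```

The second mapping can then use a plain reference to prop2 for the object, so neither the split function nor the condition is involved.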