RMLio / yarrrml-parser

A YARRRML parser library and CLI in Javascript
MIT License

support yaml input for sequence in sequence (block style), _compact version_ #163

Open mvanbrab opened 2 years ago

mvanbrab commented 2 years ago

Issue type: :unicorn: Feature

Description

Currently, the YARRRML parser supports a sequence in a sequence (block style). See the files ok*.yarrrml.yml in the attachment. Excerpt:

      -
        - voc:v1
        - $(v1)
      -
        - voc:v2
        - $(v2)

However, the compact version of it fails. See the files nok*.yarrrml.yml in the attachment. Excerpt:

      - - voc:v1
        - $(v1)
      - - voc:v2
        - $(v2)

It would be nice if this compact form were supported as well.

Why it is useful

In a project, we generate the YARRRML file with a Python script. The well-known Python library PyYAML writes a sequence in a sequence in block style, using the compact version...
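To illustrate the point, here is a minimal sketch (assuming PyYAML is installed; the variable names are made up for this example) showing that both notations load to the same nested list, so the compact form is plain valid YAML rather than something PyYAML-specific:

```python
import yaml  # PyYAML, third-party

# Expanded block style, as in the ok*.yarrrml.yml excerpt.
expanded = """\
-
  - voc:v1
  - $(v1)
-
  - voc:v2
  - $(v2)
"""

# Compact block style, as in the nok*.yarrrml.yml excerpt.
compact = """\
- - voc:v1
  - $(v1)
- - voc:v2
  - $(v2)
"""

pairs_expanded = yaml.safe_load(expanded)
pairs_compact = yaml.safe_load(compact)

# Both notations describe the same data structure.
assert pairs_expanded == pairs_compact

# Dumping the nested list in block style shows what PyYAML writes.
dumped = yaml.safe_dump(pairs_compact, default_flow_style=False)
print(dumped)
```

Since a YAML loader sees identical data either way, the difference in parser behaviour presumably comes from how the input is handled, not from the YAML itself.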

(For information: we currently work around this by using a less well-known alternative Python library, ruamel.yaml.)
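For anyone who cannot switch libraries, another possible workaround is to post-process the generated text before handing it to the parser. The sketch below uses only the Python standard library; `expand_compact_sequences` is a hypothetical helper name, and the approach is purely textual, so it assumes the file contains no block scalars or other content in which a line legitimately starts with `- - `:

```python
import re

# A block-sequence entry whose value starts another entry on the same line,
# e.g. "      - - voc:v1".
_COMPACT_ENTRY = re.compile(r"^(?P<indent>[ ]*)- (?P<rest>- .*)$", re.MULTILINE)


def expand_compact_sequences(text: str) -> str:
    """Rewrite '- - item' as a lone '-' followed by an indented '- item' line."""
    previous = None
    # Repeat until stable so deeper nesting ('- - - item') is expanded too.
    while previous != text:
        previous = text
        text = _COMPACT_ENTRY.sub(
            lambda m: f"{m['indent']}-\n{m['indent']}  {m['rest']}", text
        )
    return text


compact = (
    "      - - voc:v1\n"
    "        - $(v1)\n"
    "      - - voc:v2\n"
    "        - $(v2)\n"
)
print(expand_compact_sequences(compact))
```

The nested entry is moved to the column where it already started, so the continuation lines that follow keep their alignment and need no changes.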

Existing features it breaks

None, as far as I know.

Attached archive file

File 1.zip contains evidence and test cases.

pheyvaer commented 2 years ago

Do you get an error when using the compact form?

mvanbrab commented 2 years ago

Yes, and the error depends on where the compact form appears in the input. Sometimes the YARRRML parser outputs a message, sometimes it's the RML mapper (if it receives invalid RML from the YARRRML parser). The attachment I provided covers the cases that I experienced, and the log files show the errors.

1) If the sequence in sequence is in a po section (nok1.yarrrml.yml), the YARRRML parser skips the input with a warning and generates an RML file (rml/nok1.rml.ttl) that leaves nothing for the RML mapper to do, so the mapper outputs an empty file (data/nok1.ttl). File nok1.log:

===== 1. yarrrml-parser nok1.yarrrml.yml --> ./rml/nok1.rml.ttl
mapping "nok1": po with predicate(s) "undefined" does not have an object defined. Skipping.
mapping "nok1": po with predicate(s) "undefined" does not have an object defined. Skipping.
===== 2. rmlmapper ./rml/nok1.rml.ttl --> ./data/nok1.ttl

2) If the sequence in sequence is in a function's parameters (nok2.yarrrml.yml), the YARRRML parser does not write a message but generates an RML file (rml/nok2.rml.ttl) with invalid contents. The RML mapper reports an error on it and does not output a result file (data/nok2.ttl is missing). File nok2.log:

===== 1. yarrrml-parser nok2.yarrrml.yml --> ./rml/nok2.rml.ttl
===== 2. rmlmapper ./rml/nok2.rml.ttl --> ./data/nok2.ttl
13:42:07.709 [main] ERROR be.ugent.rml.cli.Main               .main(205) - Unable to parse mapping rules as Turtle. Does the file exist and is it valid Turtle?
org.eclipse.rdf4j.rio.RDFParseException: Not a valid (absolute) IRI: /undefined [line 49]
    at org.eclipse.rdf4j.rio.helpers.RDFParserHelper.reportFatalError(RDFParserHelper.java:366)
    at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.reportFatalError(AbstractRDFParser.java:750)
    at org.eclipse.rdf4j.rio.turtle.TurtleParser.reportFatalError(TurtleParser.java:1313)
    at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.createURI(AbstractRDFParser.java:407)
    at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.resolveURI(AbstractRDFParser.java:385)
    at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseURI(TurtleParser.java:943)
    at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseValue(TurtleParser.java:575)
    at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObject(TurtleParser.java:461)
    at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseObjectList(TurtleParser.java:389)
    at org.eclipse.rdf4j.rio.turtle.TurtleParser.parsePredicateObjectList(TurtleParser.java:384)
    at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseTriples(TurtleParser.java:349)
    at org.eclipse.rdf4j.rio.turtle.TurtleParser.parseStatement(TurtleParser.java:216)
    at org.eclipse.rdf4j.rio.turtle.TurtleParser.parse(TurtleParser.java:178)
    at org.eclipse.rdf4j.rio.turtle.TurtleParser.parse(TurtleParser.java:130)
    at be.ugent.rml.store.RDF4JStore.read(RDF4JStore.java:112)
    at be.ugent.rml.cli.Main.main(Main.java:202)
    at be.ugent.rml.cli.Main.main(Main.java:43)
Caused by: java.lang.IllegalArgumentException: Not a valid (absolute) IRI: /undefined
    at org.eclipse.rdf4j.model.impl.SimpleIRI.setIRIString(SimpleIRI.java:74)
    at org.eclipse.rdf4j.model.impl.SimpleIRI.<init>(SimpleIRI.java:63)
    at org.eclipse.rdf4j.model.impl.AbstractValueFactory.createIRI(AbstractValueFactory.java:86)
    at org.eclipse.rdf4j.rio.helpers.AbstractRDFParser.createURI(AbstractRDFParser.java:405)
    ... 13 common frames omitted

More information:

      echo "===== 2. rmlmapper ${FILE_RML} --> ${FILE_DATA}"
      # Add -v[v[v]] option for debugging, in case of trouble...
      java -jar $RML_MAPPER -m ${FILE_RML} -o ${FILE_DATA} -s turtle

- and the versions (all brand new):
  - YARRRML parser v1.3.5
  - RML Mapper v5.0.0