herminiogg / ShExML

A heterogeneous data mapping language based on Shape Expressions
http://shexml.herminiogarcia.com
MIT License

White space in JSON field #98

Closed luigi-asprino closed 2 years ago

luigi-asprino commented 2 years ago

Hi,

I'm using ShExML to transform a JSON file into RDF.

An example of the input is the following:


[
  {
    "Inventario": "1110/B",
    "Autore": "Ignoto",
    "Ambito culturale": "ASs",
    "Datazione": "inizio XVI secolo",
    "Titolo-soggetto": "Abramo con tre angeli",
    "Materiali": "bronzo",
    "Immagine": "http://93.62.170.226/foto/1110_B.jpg",
    "lsreferenceby": "http://www.palazzomadamatorino.it/it/node/24055"
  }
]

As you can see, there is a field, "Ambito culturale", whose name contains a white space. I'm trying to generate the RDF dataset with the following transformation:

PREFIX ex: <http://example.com/>
SOURCE src <file:///Users/lgu/workspace/spice/CogComplexityAndPerformaceEvaluation/experiment/data/test.json>

ITERATOR artwork <jsonpath: $[*]> {
    FIELD id <Inventario>
    FIELD autore <Autore>
    FIELD datazione <Datazione>
    FIELD ac <Ambito culturale>
}

EXPRESSION aw <src.artwork>

:Artworks _:[aw.id] {
    ex:Inventario [aw.id] ;
    ex:Datazione [aw.datazione] ;
    ex:Autore [aw.autore] ;
    ex:Ambito_Culturale [aw.ac] ;
}

but I get this error


line 8:19 extraneous input 'culturale' expecting GREATER_SYMBOL_QUERY

I suppose that the field declaration in the iterator is not correct because of the white space. Should the white spaces be escaped or replaced with a special character or percent encoding? I couldn't find an equivalent example in the documentation.

herminiogg commented 2 years ago

Hi,

Thank you for your report!

I am afraid this is an error in the ShExML grammar, which does not allow white spaces in JSONPath expressions. I will try to fix it in the next release.

herminiogg commented 2 years ago

Hello,

I have just released ShExML v0.2.6, which solves, among other things, this specific issue.

Please find it here: https://github.com/herminiogg/ShExML/releases/tag/v0.2.6

Best regards, Herminio García

luigi-asprino commented 2 years ago

great!!! thank you!

herminiogg commented 2 years ago

Hello @luigi-asprino,

Yesterday I forgot to tell you about the details of the fix. Namely, it is not possible to use dot-based queries for attributes with white spaces, as the underlying JSONPath library does not accept them. Instead, you should use the bracket notation (e.g., ['Ambito culturale']). You can see it in action in this test file: https://github.com/herminiogg/ShExML/blob/master/src/test/scala-2.12/es/weso/shexml/FilmsAlt.scala

Best regards, Herminio García
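
For later readers, here is a minimal sketch of how the iterator from the original report might look with the bracket notation described above. It assumes ShExML v0.2.6 or later, and that the bracket expression is written directly inside the FIELD angle brackets. That placement is inferred from the example in the comment above and has not been checked against the linked test file. Only the `ac` field changes; the rest of the mapping is as in the original report.

```
ITERATOR artwork <jsonpath: $[*]> {
    FIELD id <Inventario>
    FIELD autore <Autore>
    FIELD datazione <Datazione>
    FIELD ac <['Ambito culturale']>
}
```

Fields whose names contain no white space can presumably keep the plain form, as `id`, `autore` and `datazione` do here.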