herminiogg / ShExML

A heterogeneous data mapping language based on Shape Expressions
http://shexml.herminiogarcia.com
MIT License

ShExML parser accepts (and silently ignores) statements when ";" separators are missing #145

Open andrawaag opened 1 year ago

andrawaag commented 1 year ago

I gave the parser an incorrect ShExML mapping in which I forgot a ";" at the end of a statement. Triples were still generated, whereas I would expect an error message. The final two statements lacked the ";", and the triples for them were not generated.

This was the case in the following section.

:subjectproperties _:[gbmSubjects.id] {
    schema:gender obo:[gbmSubjects.gender MATCHING genders] ;
    schema:birthdate [gbmSubjects.birthdate] ;
    foaf:age [gbmSubjects.Age]
    obo:RO_0000086 @:dead
}

This led to the following results:

andra@Micelio-MacBook-Pro GBM % java -jar ShExML-assembly-0.3.3.jar --mapping=GBM1.shex
14:51:44.241 INFO [main] [com.herminiogarcia.shexml.MappingLauncher] - Launching mapping
14:51:44.242 INFO [main] [com.herminiogarcia.shexml.MappingLauncher] - Applying lexer to tokenize input mapping rules
14:51:44.256 INFO [main] [com.herminiogarcia.shexml.MappingLauncher] - Parsing tokens from the lexer
14:51:44.263 INFO [main] [com.herminiogarcia.shexml.MappingLauncher] - Creating the AST for the input mapping rules
14:51:44.307 INFO [main] [com.herminiogarcia.shexml.MappingLauncher] - Building var table
14:51:44.493 INFO [main] [com.herminiogarcia.shexml.visitor.RDFGeneratorVisitor] - Generating shape :gbmPatient results with 5 predicate-object statements
14:51:44.552 INFO [main] [com.herminiogarcia.shexml.visitor.RDFGeneratorVisitor] - Generating shape :subjectproperties results with 4 predicate-object statements
14:51:44.553 INFO [main] [com.herminiogarcia.shexml.visitor.RDFGeneratorVisitor] - Generating shape :dead results with 2 predicate-object statements
14:51:44.555 INFO [main] [com.herminiogarcia.shexml.visitor.RDFGeneratorVisitor] - Expanded 2 predicate-object statements in 20 results
14:51:44.566 INFO [main] [com.herminiogarcia.shexml.visitor.RDFGeneratorVisitor] - Found 19 subjects from data
14:51:44.569 INFO [main] [com.herminiogarcia.shexml.visitor.RDFGeneratorVisitor] - Expanded 4 predicate-object statements in 76 results
14:51:44.577 INFO [main] [com.herminiogarcia.shexml.visitor.RDFGeneratorVisitor] - Found 19 subjects from data
14:51:44.577 INFO [main] [com.herminiogarcia.shexml.visitor.RDFGeneratorVisitor] - Expanded 5 predicate-object statements in 77 results
14:51:44.586 INFO [main] [com.herminiogarcia.shexml.visitor.RDFGeneratorVisitor] - Found 19 subjects from data
14:51:44.588 INFO [main] [com.herminiogarcia.shexml.visitor.RDFGeneratorVisitor] - The mapping rules produced 209 triples
14:51:44.588 INFO [main] [com.herminiogarcia.shexml.MappingLauncher] - Converting output to Turtle
@prefix :      <https://www.example.org/gbm1/> .
@prefix schema: <https://schema.org/> .
@prefix dct:   <http://purl.org/dc/terms/> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix sio:   <http://semanticscience.org/resource/> .
@prefix obo:   <http://purl.obolibrary.org/obo/> .
@prefix foaf:  <http://xmlns.com/foaf/0.1/> .

:6      a               obo:NCIT_C16960 ;
        dct:identifier  6 ;
        sio:SIO_000066  "KAG" , "Ge 893*" ;
        sio:SIO_000223  _:b0 .

:10     a               obo:NCIT_C16960 ;
        dct:identifier  10 ;
        sio:SIO_000066  "OAG" , "Ge 941*" ;
        sio:SIO_000223  _:b1 .

After adding the correct ';', the results were as expected.

herminiogg commented 11 months ago

If I remember correctly, this is a tricky issue with grammars: the last ";" can be omitted, and that is perfectly legal. When only the last one is omitted, it works fine and all the triples are generated. However, it is true that if you omit more of them you can end up with fewer triples, or none at all. As I said, this is rather difficult to control with the grammar.

Either way, this could be solved with a semantic analysis that checks the AST. Unfortunately, semantic analysis is something I have neglected since the very beginning, as it tends to be too time-consuming, so I used the time to add further features to the engine. I will nevertheless study adding it in the future, as it is also related to your other issue, #147.

Thanks for reporting this!
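
Until such a check exists in the engine, a rough pre-check outside ShExML could catch the case reported above. The following is only a minimal sketch in Python, not part of the ShExML engine and not a real semantic analysis of the AST: the function name `find_missing_semicolons` is hypothetical, and it assumes one predicate-object statement per line, a shape header ending in "{", and a closing "}" on its own line. Blocks whose header starts with the ITERATOR keyword are skipped, since their contents are not predicate-object statements.

```python
from typing import List, Optional, Tuple

# The shape from the report, with the ";" missing after the foaf:age statement.
BROKEN = "\n".join([
    ":subjectproperties _:[gbmSubjects.id] {",
    "    schema:gender obo:[gbmSubjects.gender MATCHING genders] ;",
    "    schema:birthdate [gbmSubjects.birthdate] ;",
    "    foaf:age [gbmSubjects.Age]",
    "    obo:RO_0000086 @:dead",
    "}",
])


def find_missing_semicolons(mapping: str) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every statement inside a shape body
    that is followed by another statement but does not end with ';'.

    Heuristic only: assumes one statement per line (hypothetical helper).
    """
    problems: List[Tuple[int, str]] = []
    inside_shape = False
    previous: Optional[Tuple[int, str]] = None  # last statement seen in this shape

    for number, raw in enumerate(mapping.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        if not inside_shape:
            # Open a shape body; ignore ITERATOR blocks, which hold no statements.
            if line.endswith("{") and not line.startswith("ITERATOR"):
                inside_shape = True
                previous = None
            continue
        if line == "}":
            # The last statement may legally omit its ';', so it is not checked.
            inside_shape = False
            continue
        if previous is not None and not previous[1].endswith(";"):
            problems.append(previous)
        previous = (number, line)

    return problems


if __name__ == "__main__":
    for number, text in find_missing_semicolons(BROKEN):
        print(f"line {number}: missing ';' after: {text}")
    # prints: line 4: missing ';' after: foaf:age [gbmSubjects.Age]
```

The last statement of a shape is deliberately left unchecked, because omitting its ";" is legal. Once the ";" is added after the foaf:age statement, the function returns an empty list for this shape.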