lutaml / expressir

Ruby parser for the ISO EXPRESS language

Failure to parse tail remarks `--` with a prefixed whitespace and without a space after the double dashes #132

Closed ronaldtse closed 11 months ago

ronaldtse commented 11 months ago

This is a valid tail remark:

SCHEMA a_schema;--remark

Another valid one:

 --valid remark
SCHEMA--valid remark

This schema fails validation because of two lines, although it is supposed to pass:

geometric_model_schema.exp.zip

ronaldtse commented 11 months ago
require 'expressir'
require 'expressir/express/parser'
Expressir::Express::Parser.from_file('geometric_model_schema.exp')
# => raises an error
ronaldtse commented 11 months ago

Ping @maxirmx to see if you have time.

ronaldtse commented 11 months ago

This is currently causing iso-10303-srl's parsing of this schema to fail.

maxirmx commented 11 months ago

Expressir supports remarks, including tail remarks, and there is a test that covers this functionality.

The schema geometric_model_schema.exp contains multiple remarks structured like this:

--"At least one item in the items set shall be a manifold_solid_brep entity or a mapped_item (see also WR10)." 

Because of the quotes, these are treated as empty remarks with an invalid remark_ref (a reference to a non-existent object).

In most cases this does not cause a crash, but if such a malformed remark is placed inside an entity definition, expressir crashes.

In geometric_model_schema.exp there is a single remark that matches all conditions and cannot be processed:

--"At least one item in the items set shall be a manifold_solid_brep entity or a mapped_item (see also WR10)." 

Parsing works if I move this remark outside of the entity definition.

maxirmx commented 11 months ago

I guess the root cause is precedence. The definition of remarks is at the tail of the grammar file (so it has the lowest precedence), and that does not seem to be correct.

ronaldtse commented 11 months ago

@maxirmx indeed the “mis-reference” of the remark tag is a problem, but the remark tag technically does not need to cause a crash?

It is important though to report that a remark tag points to an invalid object.

In any case, empty remarks should not fail.

And in any case, the space between the two hyphens and the first double quote should not need to be present.
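
The behaviour asked for here (report an unresolved remark tag, but do not fail) could look roughly like the sketch below. This is an illustration only, not expressir code: `attach_remark`, the `targets` hash, and the `warnings` array are hypothetical names, and the sketch assumes remarks are stored as plain strings keyed by the id they refer to.

```ruby
# Hypothetical sketch, not expressir code: attach a tagged remark to its
# target, or record a warning when the tag does not resolve, instead of raising.
#
# targets  - Hash mapping an item id (String) to an Array of remark strings
# tag      - String taken from the remark tag, or nil for an untagged remark
# body     - String with the remark text (may be empty)
# warnings - Array that collects human-readable warnings
def attach_remark(targets, tag, body, warnings)
  return :ignored if tag.nil? # untagged remark: nothing to attach

  target = targets[tag]
  if target.nil?
    # Report the invalid reference, but keep parsing.
    warnings << "remark tag #{tag.inspect} does not reference a known item"
    return :unresolved
  end

  target << body unless body.empty? # an empty remark is valid, just adds nothing
  :attached
end
```

With this shape, a remark whose quoted text is a whole sentence ends up in `warnings` rather than aborting the parse.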

maxirmx commented 11 months ago

This is the existing implementation (part 1)

This is a remark with the text body, tagged with "ref":

--"ref"  body

This is an untagged remark with the text "ref" body, because the tag definition assumes no space between -- and "ref":

-- "ref"  body

IMHO this matches the grammar
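
The two cases above can be sketched with a small regular expression. This is a simplified illustration, not the expressir grammar: `TAIL_REMARK` and `split_tail_remark` are hypothetical names, the sketch assumes a tag is any double-quoted string that directly follows `--`, and it does not handle `--` appearing inside a string literal.

```ruby
# Hypothetical helper, not part of expressir: split a tail remark into its
# optional tag and its text.
#
#   --            the tail remark marker
#   (?:"([^"\n]*)")?  an optional tag, only if the quote directly follows --
#   (.*)\z        everything else up to the end of the line
TAIL_REMARK = /--(?:"([^"\n]*)")?(.*)\z/

def split_tail_remark(line)
  match = TAIL_REMARK.match(line.chomp)
  return nil unless match # the line contains no tail remark

  { tag: match[1], body: match[2].strip }
end

tagged   = split_tail_remark('--"ref"  body')   # tag is "ref", text is "body"
untagged = split_tail_remark('-- "ref"  body')  # no tag, the quotes stay in the text
```

Under the same assumption, a remark such as `--"Some whole sentence."` yields the sentence as the tag and an empty text, which is the situation described earlier in this thread.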

maxirmx commented 11 months ago

This is the existing implementation (part 2)

maxirmx commented 11 months ago

Addressed by #133