lutaml / expressir

Ruby parser for the ISO EXPRESS language

Remark items within `SCHEMA` ... `END_SCHEMA` cause parsing differences in Expressir #108

Open manuelfuenmayor opened 2 years ago

manuelfuenmayor commented 2 years ago

In relation to https://github.com/metanorma/iso-10303-detached-docs/issues/82

Taking this EXPRESS syntax as example:

SCHEMA method_definition_schema;

REFERENCE FROM action_schema
    (action_method,
     action_method_relationship,
     action_relationship);

REFERENCE FROM document_schema
    (document,
     document_usage_constraint);

REFERENCE FROM effectivity_schema
    (effectivity);

REFERENCE FROM measure_schema
    (count_measure);

REFERENCE FROM support_resource_schema
    (label,
     text);

REFERENCE FROM process_property_schema
    (product_definition_process,
     property_process);

(*"method_definition_schema"
The subject of the *method_definition_schema* is the specification of the instructions required to perform a process. This part of ISO 10303 is applicable to all types of process definitions that can be represented in a discrete manner. This clause provides:

* composition structure of a process, based on a series of actions or potential actions;
* control structure for defining the order of execution of processes;
* method for identifying a document that defines a process;
* method for identification of process effectivity;
* structure for defining conditions that may alter order of completion of the process

NOTE: See <<iso10303-41>> for further information related to *action_method* and *action_method_relationship*.

The *method_definition_schema* represents the data in a process plan, but not the process and data that are required to develop the process plan. The *method_definition_schema* may be used in many contexts for process representation. A context is defined by an application resource or an application protocol.
*)

...

END_SCHEMA; -- method_definition_schema

The remark tagged with (*"method_definition_schema" is not recognized by the parser as the schema's own remark, because it sits "inside" the SCHEMA ... END_SCHEMA block. If I move this content after END_SCHEMA; -- method_definition_schema, it is recognized.

This doesn't occur with other remark tags, though.

ronaldtse commented 1 year ago

Thank you @manuelfuenmayor, I've just run into this!

The problem is that the position of a remark relative to END_SCHEMA affects where remarks tagged with the schema's own name end up in the parsed model:

one_schema_before.exp

SCHEMA one_schema;

(*"one_schema"
Documentation for Schema 1
*)

END_SCHEMA;

one_schema_after.exp

SCHEMA one_schema;

END_SCHEMA;

(*"one_schema"
Documentation for Schema 1
*)

Parsing differences

irb(main):015> Expressir::Express::Parser.from_file('one_schema_before.exp')
=> 
#<Expressir::Model::Repository:0x000000010964f550
 @schemas=
  [#<Expressir::Model::Declarations::Schema:0x0000000109654a78
    @children_by_id=nil,
    @constants=[],
    @entities=[],
    @file="one_schema_before.exp",
    @functions=[],
    @id="one_schema",
    @interfaces=[],
    @parent=#<Expressir::Model::Repository:0x000000010964f550 ...>,
    @procedures=[],
    @remark_items=
     [#<Expressir::Model::Declarations::RemarkItem:0x000000010964f668
       @id="one_schema",
       @parent=#<Expressir::Model::Declarations::Schema:0x0000000109654a78 ...>,
       @remarks=["Documentation for Schema 1"]>],
    @remarks=[],
    @rules=[],
    @source=nil,
    @subtype_constraints=[],
    @types=[],
    @version=nil>]>
irb(main):016> Expressir::Express::Parser.from_file('one_schema_after.exp')
=> 
#<Expressir::Model::Repository:0x0000000104c7be38
 @children_by_id=
  {"one_schema"=>
    #<Expressir::Model::Declarations::Schema:0x0000000104c80438
     @constants=[],
     @entities=[],
     @file="one_schema_after.exp",
     @functions=[],
     @id="one_schema",
     @interfaces=[],
     @parent=#<Expressir::Model::Repository:0x0000000104c7be38 ...>,
     @procedures=[],
     @remark_items=[],
     @remarks=["Documentation for Schema 1"],
     @rules=[],
     @source=nil,
     @subtype_constraints=[],
     @types=[],
     @version=nil>},
 @schemas=
  [#<Expressir::Model::Declarations::Schema:0x0000000104c80438
    @constants=[],
    @entities=[],
    @file="one_schema_after.exp",
    @functions=[],
    @id="one_schema",
    @interfaces=[],
    @parent=#<Expressir::Model::Repository:0x0000000104c7be38 ...>,
    @procedures=[],
    @remark_items=[],
    @remarks=["Documentation for Schema 1"],
    @rules=[],
    @source=nil,
    @subtype_constraints=[],
    @types=[],
    @version=nil>]>
irb(main):017> 

Problems

Problem 1: location of remark item affects where the top-level remark is located

When the schema-level remark item is placed outside the SCHEMA ... END_SCHEMA; block, the remark is accessed with:

(This is the desired behavior)

repo.schemas[0].remarks

When the schema-level remark item is placed within the SCHEMA ... END_SCHEMA; block, the remark has to be accessed via the schema's remark items instead:

repo.schemas[0].remark_items.find { |item| item.id == "one_schema" }.remarks
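
Until the two placements behave the same way, a caller could read schema-level remarks from both locations. The following is only a sketch: `SchemaStub` and `RemarkItemStub` are hypothetical stand-ins that mirror the fields visible in the irb dumps above, not Expressir's real classes, and `schema_level_remarks` is an illustrative helper name.

```ruby
# Hypothetical stand-ins reduced to the fields shown in the irb dumps;
# these are NOT Expressir's actual model classes.
RemarkItemStub = Struct.new(:id, :remarks)
SchemaStub = Struct.new(:id, :remarks, :remark_items)

# Collect schema-level remarks whether they were attached directly to the
# schema or stored as a remark item that carries the schema's own id.
def schema_level_remarks(schema)
  own_items = schema.remark_items.select { |item| item.id == schema.id }
  schema.remarks + own_items.flat_map(&:remarks)
end

# Remark placed inside SCHEMA ... END_SCHEMA (one_schema_before.exp).
inside_block = SchemaStub.new(
  "one_schema", [],
  [RemarkItemStub.new("one_schema", ["Documentation for Schema 1"])]
)

# Remark placed after END_SCHEMA (one_schema_after.exp).
outside_block = SchemaStub.new("one_schema", ["Documentation for Schema 1"], [])

p schema_level_remarks(inside_block)  # => ["Documentation for Schema 1"]
p schema_level_remarks(outside_block) # => ["Documentation for Schema 1"]
```

With real parser output, the same method body would apply as long as the schema object exposes `id`, `remarks`, and `remark_items` as the dumps suggest.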

Problem 2: missing children_by_id when a remark item is located within the SCHEMA block

When the schema-level remark item is placed outside the SCHEMA ... END_SCHEMA; block, we can do:

(This is the desired behavior)

# The pattern is `repo.children_by_id[schema_name]`
repo.children_by_id["one_schema"]

When the schema-level remark item is placed within the SCHEMA ... END_SCHEMA; block, the schema can only be found via:

repo.schemas.select { |x| x.id == "one_schema" }
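
A lookup that tolerates both situations might try the index first and fall back to scanning the schema list. This is a sketch under assumptions: `RepoStub` and `SchemaEntry` are hypothetical stand-ins, and it assumes `children_by_id` may be `nil` or may lack the key, which is how the "before" dump reads.

```ruby
# Hypothetical stand-ins for the repository and schema objects; only the
# fields needed for the lookup are modelled.
SchemaEntry = Struct.new(:id)
RepoStub = Struct.new(:schemas, :children_by_id)

# Look a schema up by id: use the children_by_id index when it is present
# and contains the key, otherwise fall back to a linear scan of schemas.
def find_schema(repo, schema_id)
  index = repo.children_by_id
  return index[schema_id] if index && index.key?(schema_id)

  repo.schemas.find { |schema| schema.id == schema_id }
end

schema = SchemaEntry.new("one_schema")

# Index populated (remark placed after END_SCHEMA).
indexed_repo = RepoStub.new([schema], { "one_schema" => schema })
# Index missing (remark placed inside the SCHEMA block).
unindexed_repo = RepoStub.new([schema], nil)

p find_schema(indexed_repo, "one_schema").id   # => "one_schema"
p find_schema(unindexed_repo, "one_schema").id # => "one_schema"
```

Using `find` rather than `select` returns the schema itself instead of a one-element array, which matches what `children_by_id["one_schema"]` returns.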

maxirmx commented 11 months ago

@ronaldtse @manuelfuenmayor

This is by design. I do not know whether it is wrong or right, but the tag works as a path relative to the current scope.

Below are the equivalent tag definitions:

one_schema_before.exp

SCHEMA one_schema;

(*""
Documentation for Schema 1
*)

END_SCHEMA;

one_schema_after.exp

SCHEMA one_schema;

END_SCHEMA;

(*"one_schema"
Documentation for Schema 1
*)

In the first case the remark is within the "one_schema" scope, so an empty tag attaches the remark to the schema itself. In the second case the remark is within the scope of the parent of "one_schema", so the tag must be "one_schema".

It works very similarly to XPath.
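
The scope-relative behaviour described above can be illustrated with a toy resolver. This is not Expressir's implementation: `ScopeNode` and `resolve_tag` are hypothetical names, and the sketch assumes that a tag is a dot-separated path whose segments each select a child of the scope reached so far.

```ruby
# Hypothetical scope node: an id plus child scopes keyed by id.
ScopeNode = Struct.new(:id, :children)

# Resolve a remark tag relative to the scope in which the remark appears.
# An empty tag yields the scope itself; otherwise every dot-separated
# segment selects a child of the node reached so far (nil if absent).
def resolve_tag(scope, tag)
  tag.split(".").reduce(scope) do |node, segment|
    node ? node.children[segment] : nil
  end
end

schema_scope = ScopeNode.new("one_schema", {})
root_scope = ScopeNode.new(nil, { "one_schema" => schema_scope })

# Inside SCHEMA ... END_SCHEMA the current scope is the schema, so the
# empty tag refers to the schema itself.
p resolve_tag(schema_scope, "").equal?(schema_scope)         # => true

# After END_SCHEMA the current scope is the parent, so the schema has to
# be named explicitly.
p resolve_tag(root_scope, "one_schema").equal?(schema_scope) # => true

# Naming the schema from inside its own scope looks for a child called
# "one_schema", which does not exist in this toy model.
p resolve_tag(schema_scope, "one_schema")                    # => nil
```

The last call mirrors the situation reported in this issue: a tag that repeats the schema's name inside the schema block does not point back at the schema.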