lutaml / expressir

Ruby parser for the ISO EXPRESS language

Formalize and develop parser for EXPRESS MIM mapping #88

Open ronaldtse opened 3 years ago

ronaldtse commented 3 years ago

MIM documents contain "reference path" mappings that relate to the EXPRESS schemas and their items.

Specifically:

Reference path: This section contains:

  • the reference path to its supertypes in the common resources, for each MIM element created within this part of ISO 10303;
  • the specification of the relationships between MIM elements, when the mapping of an application element requires to relate instances of several MIM entity data types. In such a case, each line in the reference path documents the role of a MIM element relative to the referring MIM element or to the next referred MIM element.

For example in ISO 10303-1001 (in the SMRL), it contains:

presentation_style_assignment
presentation_style_assignment.styles[i] ->
presentation_style_select

This is to be understood according to these rules:

symbol  definition
->      the attribute, whose name precedes the -> symbol, references the entity or select type whose name follows the -> symbol;
[i]     the attribute, whose name precedes the [i] symbol, is an aggregate; any element of that aggregate is referred to;

So the element presentation_style_assignment.styles is an aggregate, and every item in presentation_style_assignment.styles[i] refers to the select type presentation_style_select.
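As a rough illustration of the reading above, here is a minimal Ruby sketch that splits one such step into its parts. It assumes the attribute line and the target line have already been joined into one string; `Step`, `STEP_PATTERN` and `parse_step` are hypothetical names and not part of Expressir.

```ruby
# Hypothetical value object for one reference-path step (not an Expressir class).
Step = Struct.new(:entity, :attribute, :aggregate, :target)

# Matches text of the form "entity.attribute[i] -> target"; "[i]" is optional.
STEP_PATTERN = /\A(\w+)\.(\w+)(\[i\])?\s*->\s*(\w+)\z/

# Returns a Step, or nil when the text is not a simple "->" step.
def parse_step(text)
  match = STEP_PATTERN.match(text.strip)
  return nil unless match

  Step.new(match[1], match[2], !match[3].nil?, match[4])
end

step = parse_step("presentation_style_assignment.styles[i] -> presentation_style_select")
puts step.attribute
puts step.aggregate
puts step.target
```

This only covers the two symbols explained so far; the other symbols in the notation would need a fuller grammar.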

The challenges are:

  1. This mapping specification is not machine-readable. It is only read by humans.
  2. We don't really know whether the EXPRESS objects referred to (e.g. presentation_style_select) exist, or whether they have changed in the actual EXPRESS schemas.

What we need to do:

  1. Formalize the language, based on the specification described in "5.1" below.
  2. Create a parser for this language and link its nodes to Expressir, in order to validate them against the actual EXPRESS schemas referred to.
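The validation step could look something like the sketch below. It is only an assumption about the shape of the check: `KNOWN_ITEMS` is a hand-written stand-in for whatever Expressir would provide after parsing the schemas, and `check_step` is a hypothetical helper.

```ruby
# Illustrative stand-in for parsed EXPRESS schemas: item name => attribute names.
# In a real tool this data would come from Expressir; the hash is an assumption.
KNOWN_ITEMS = {
  "presentation_style_assignment" => ["styles"],
  "presentation_style_select" => []
}.freeze

# Returns a list of problems found for one "entity.attribute -> target" step.
# An empty list means every name in the step was found.
def check_step(entity, attribute, target, known = KNOWN_ITEMS)
  problems = []
  if !known.key?(entity)
    problems << "unknown entity: #{entity}"
  elsif !known[entity].include?(attribute)
    problems << "unknown attribute: #{entity}.#{attribute}"
  end
  problems << "unknown target: #{target}" unless known.key?(target)
  problems
end

puts check_step("presentation_style_assignment", "styles", "presentation_style_select").inspect
puts check_step("presentation_style_assignment", "colour", "missing_select").inspect
```

Returning a list of problems rather than raising on the first one would let a report cover a whole mapping document in one pass.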

The relevant clause of the specification is reproduced below.

5.1 Mapping specification

In the following, "Application element" designates any entity data type defined in Clause 4, any of its explicit attributes and any subtype constraint. "MIM element" designates any entity data type defined in Clause 5.2 or imported with a USE FROM statement, from another EXPRESS schema, any of its attributes and any subtype constraint defined in Clause 5.2 or imported with a USE FROM statement.

This clause contains the mapping specification that defines how each application element of this part of ISO 10303 (see Clause 4) maps to one or more MIM elements (see Clause 5.2).

The mapping for each application element is specified in a separate subclause below. The mapping specification of an attribute of an ARM entity is a subclause of the clause that contains the mapping specification of this entity. Each mapping specification subclause contains up to five elements.

Title: The clause title contains:

MIM element: This section contains, depending on the considered application element:

When the mapping of an application element involves more than one MIM element, each of these MIM elements is presented on a separate line in the mapping specification, enclosed between parentheses or brackets.

Source: This section contains:

This section is omitted when the keywords PATH or IDENTICAL MAPPING or NO MAPPING EXTENSION PROVIDED are used in the MIM element section.

Rules: This section contains the name of one or more global rules that apply to the population of the MIM entity data types listed in the MIM element section or in the reference path. When no rule applies, this section is omitted.

A reference to a global rule may be followed by a reference to the subclause in which the rule is defined.

Constraint: This section contains the name of one or more subtype constraints that apply to the population of the MIM entity data types listed in the MIM element section or in the reference path. When no subtype constraint applies, this section is omitted.

A reference to a subtype constraint may be followed by a reference to the subclause in which the subtype constraint is defined.

Reference path: This section contains:

For the expression of reference paths and of the constraints between MIM elements, the following notational conventions apply:

symbol  definition
[]      enclosed section constrains multiple MIM elements or sections of the reference path are required to satisfy an information requirement;
()      enclosed section constrains multiple MIM elements or sections of the reference path are identified as alternatives within the mapping to satisfy an information requirement;
{}      enclosed section constrains the reference path to satisfy an information requirement;
<>      enclosed section constrains at one or more required reference path;
||      enclosed section constrains the supertype entity;
->      the attribute, whose name precedes the -> symbol, references the entity or select type whose name follows the -> symbol;
<-      the entity or select type, whose name precedes the <- symbol, is referenced by the entity attribute whose name follows the <- symbol;
[i]     the attribute, whose name precedes the [i] symbol, is an aggregate; any element of that aggregate is referred to;
[n]     the attribute, whose name precedes the [n] symbol, is an ordered aggregate; member n of that aggregate is referred to;
=>      the entity, whose name precedes the => symbol, is a supertype of the entity whose name follows the => symbol;
<=      the entity, whose name precedes the <= symbol, is a subtype of the entity whose name follows the <= symbol;
=       the string, select, or enumeration type is constrained to a choice or value;
\       the reference path expression continues on the next line;
*       one or more instances of the relationship entity data type may be assembled in a relationship tree structure. The path between the relationship entity and the related entities, is enclosed with braces;
--      the text following is a comment or introduces a clause reference;
*>      the select or enumeration type, whose name precedes the *> symbol, is extended into the select or enumeration type whose name follows the *> symbol;
<*      the select or enumeration type, whose name precedes the <* symbol, is an extension of the select or enumeration type whose name follows the <* symbol;
!{}     section enclosed by {} indicates a negative constraint placed on the mapping.
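
A first step towards formalizing this notation could be a tokenizer for the symbols in the table. The following is a minimal sketch using Ruby's standard `StringScanner`, not a definitive grammar. The token kinds, the treatment of [n] as a run of digits, and the single-quoted string literal are assumptions made for illustration; `tokenize` is a hypothetical helper.

```ruby
require "strscan"

# Multi-character symbols come first so that "->" is not read as "-" then ">".
SYMBOLS = ["!{", "*>", "<*", "->", "<-", "=>", "<=",
           "[", "]", "(", ")", "{", "}", "<", ">", "|", "=", "*", "\\"].freeze
SYMBOL_PATTERN = Regexp.union(SYMBOLS)

# Turns reference-path text into [kind, text] pairs.
def tokenize(text)
  scanner = StringScanner.new(text)
  tokens = []
  until scanner.eos?
    if scanner.skip(/\s+/)
      next # whitespace, including newlines, only separates tokens here
    elsif scanner.scan(/--[^\n]*/)
      tokens << [:comment, scanner.matched]
    elsif scanner.scan(/\[(?:i|\d+)\]/)
      tokens << [:index, scanner.matched] # "[i]", or "[n]" assumed to be digits
    elsif scanner.scan(/'[^']*'/)
      tokens << [:string, scanner.matched] # assumed form of a constrained value
    elsif scanner.scan(/\w+(?:\.\w+)*/)
      tokens << [:name, scanner.matched]
    elsif scanner.scan(SYMBOL_PATTERN)
      tokens << [:symbol, scanner.matched]
    else
      raise ArgumentError, "unexpected character at offset #{scanner.pos}"
    end
  end
  tokens
end

p tokenize("presentation_style_assignment.styles[i] ->\npresentation_style_select")
```

Pairing the enclosing symbols ([], (), {}, <>, ||, !{}) is left to a later parsing stage; the tokenizer only emits them one at a time.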
ronaldtse commented 3 years ago

Ping @TRThurman the above is based on our last discussion at WG 12, please let me know if this is accurate. Thanks!

TRThurman commented 3 years ago

An optional element is missing from the list above: /MAPPING_OF/, which is a predefined template.

See following: The definition and use of mapping templates are not supported in the present version of the application modules. However, use of predefined templates /MAPPING_OF/, /SUBTYPE/, and /SUPERTYPE/ is supported.
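
A parser would need to recognise these template references. The sketch below is only an assumption about how that might start: it handles the bare /NAME/ form exactly as written in this comment and does not attempt any template arguments. `PREDEFINED_TEMPLATES` and `template_references` are hypothetical names.

```ruby
# The three predefined template names mentioned in this thread.
PREDEFINED_TEMPLATES = %w[MAPPING_OF SUBTYPE SUPERTYPE].freeze

# Finds "/NAME/" references in the text and flags whether each one is predefined.
def template_references(text)
  text.scan(%r{/([A-Z_]+)/}).flatten.map do |name|
    { name: name, predefined: PREDEFINED_TEMPLATES.include?(name) }
  end
end

p template_references("/SUBTYPE/ then /CUSTOM_TEMPLATE/")
```

Anything flagged as not predefined could be reported, since user-defined templates are not supported.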

TRThurman commented 3 years ago

The newline character is part of the syntax. However, the use of newline to separate this declaration is not enforced by the XSL scripts.

This:
presentation_style_assignment
presentation_style_assignment.styles[i] ->
presentation_style_select

might also be encoded as: presentation_style_assignment presentation_style_assignment.styles[i] -> presentation_style_select
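
If a parser is meant to be as lenient as the XSL scripts, one option is to treat newlines like any other whitespace. A minimal sketch, assuming that leniency is wanted; `lenient_tokens` is a hypothetical helper:

```ruby
MULTI_LINE = <<~PATH
  presentation_style_assignment
  presentation_style_assignment.styles[i] ->
  presentation_style_select
PATH

SINGLE_LINE = "presentation_style_assignment " \
              "presentation_style_assignment.styles[i] -> presentation_style_select"

# Splits on any run of whitespace, so line breaks carry no meaning.
def lenient_tokens(text)
  text.split
end

puts lenient_tokens(MULTI_LINE) == lenient_tokens(SINGLE_LINE)
```

A strict mode could instead keep the newlines as tokens and warn when a declaration is not separated by one.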

TRThurman commented 3 years ago

MIM elements are usually repeated in the reference path when an attribute is specified. This usually appears ...
presentation_style_assignment presentation_style_assignment.styles[i] -> presentation_style_select

This might appear:
presentation_style_assignment.styles[i] -> presentation_style_select

TRThurman commented 3 years ago

When PATH appears, any source entry shall be ignored. (A warning could be issued...)
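
That rule could be sketched as below. The hash layout for a mapping subclause, the helper name `effective_source`, and the sample source value are all assumptions for illustration only.

```ruby
# Returns the source entry to use, or nil when it must be ignored.
# `mapping` is a hypothetical hash such as { mim_element: "PATH", source: "..." }.
# Any warning text is appended to `warnings`.
def effective_source(mapping, warnings = [])
  if mapping[:mim_element] == "PATH" && mapping[:source]
    warnings << "source '#{mapping[:source]}' ignored because the MIM element is PATH"
    return nil
  end
  mapping[:source]
end

warnings = []
p effective_source({ mim_element: "PATH", source: "ISO 10303-41" }, warnings)
puts warnings
```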

TRThurman commented 3 years ago

The formal specification for mapping syntax and semantics is SC4N2661. If there are inconsistencies between that document and the above extract of clause 5.1, they need to be brought to the attention of WG12. SC4N2661 Guidelines for the development of mapping specifications.doc.zip

TRThurman commented 3 years ago

I could not find this document: STONIS, Alfonsas; Mapping syntax extensions, ISO TC 184/SC4/QC N203, 2001-06-11.

TRThurman commented 3 years ago

I found this document: qcn163.htm.pdf

ronaldtse commented 3 years ago

Thank you @TRThurman !

ronaldtse commented 3 years ago

@TRThurman specifications from SC4N2661 and qcn163.htm.pdf should probably be published as a standard in order to formalise the mapping language.