buildingSMART / NextGen-IFC

61 stars 4 forks source link

Not use EXPRESS ARRAYs #21

Open pipauwel opened 4 years ago

pipauwel commented 4 years ago

There are very few ARRAYs defined in IFC. SETs and LISTs are much more common. Is it an option to replace all ARRAYs with LISTs?

This seems an easy fix, and it makes translation to other languages easier.

In particular:

TYPE IfcComplexNumber = ARRAY [1:2] OF REAL;
END_TYPE;
hlg commented 4 years ago

No, it is not possible to simply replace ARRAYs with LISTs, because LISTs in EXPRESS can not have unset values, while ARRAYs can. Instead, IfcComplexNumber might be modelled like this:

ENTITY IfcComplexNumber;
    RealPart: OPTIONAL REAL;
    ImaginaryPart: OPTIONAL REAL;
END_ENTITY;

A similar construction would work for IfcMaterialLayerWithOffsets.OffsetValues with optional lower and upper offset values.

Alternatively, if there is a sensible default, like 0 for the REALS in IfcComplexNumber, then you could use a fixed length or length-constraint LIST. But for IfcMaterialLayerWithOffsets.OffsetValues that would not be as simple.

Longer and variable length arrays such as BSpline control points need a completely different consideration.

aothms commented 4 years ago

To my understanding ARRAYS are not by default with optional elements.

The spec says (emphasis mine)

An array data type definition may optionally specify that an array value cannot contain duplicate elements. It may also specify that an array value need not contain an element at every index position.

The formal def is:

165 array_type = ARRAY bound_spec OF [ OPTIONAL ] [ UNIQUE ] base_type .

So in my understanding only ARRAY [1:2] OF OPTIONAL REAL; would result in optional elements.

hlg commented 4 years ago

Upon rereading the spec, yes, you are totally right. I also misinterpreted the bounds to denote minimum and maximum cardinality instead of indizes. Then yes, for non-optional arrays, a fixed size list could be a possible replacement such as LIST[2] OF REAL in this case.

It still remains to check whether there are potential issues with indexes in any rules or functions, since ARRAY[1:2] would assume indizes 1 or 2.

pipauwel commented 4 years ago

I am aware that they are all four slightly different, bags, sets, lists, and sets. Fact is that many other languages don´t go in that diversity in expressiveness. So, in practice, they are anyway mapped to different constructs, often to lists. That seems to be also the common case in the IFC spec in general.

So, in an effort to remove the more exotic choices in the spec, and reduce parsing complexity, I would argue for trying to follow your suggestion:

a fixed size list could be a possible replacement such as LIST[2] OF REAL in this case.

There are a number of arrays used in rules and DERIVE constructs, I think -not many-. I would argue for an effort (project, one day activity, branch in github, or anything) to try to get these arrays out of the spec.

TLiebich commented 4 years ago

lets be specific here - the three diffences of arrays, compared to lists, are 1) the bounds are not min/max (in terms of members), but min/max in terms of indices 2) there are sparse arrays when OPTIONAL is used 3) there are arrays prohibiting duplicates when UNIQUE is used

using LIST instead does cover 1) [minor task, check if functions need to be changed] and 3) LIST can also be UNIQUE. This leaves only 2) open, as lists cannot be "sparse".

Now lets check IFC (latest official IFC4.2). We have:

TYPE IfcComplexNumber = ARRAY [1:2] OF REAL;
END_TYPE;

ENTITY IfcMaterialLayerWithOffsets
 SUBTYPE OF (IfcMaterialLayer);
    OffsetDirection : IfcLayerSetDirectionEnum;
    OffsetValues : ARRAY [1:2] OF IfcLengthMeasure;
END_ENTITY;

ENTITY IfcMaterialProfileWithOffsets
 SUBTYPE OF (IfcMaterialProfile);
    OffsetValues : ARRAY [1:2] OF IfcLengthMeasure;
END_ENTITY;

it could easily changed to (with only minor backward compatibility problems) into:

TYPE IfcComplexNumber = LIST [2:2] OF REAL;
END_TYPE;

ENTITY IfcMaterialLayerWithOffsets
 SUBTYPE OF (IfcMaterialLayer);
    OffsetDirection : IfcLayerSetDirectionEnum;
    LowerOffsetValue : OPTIONAL IfcLengthMeasure;
    UpperOffsetValue : OPTIONAL IfcLengthMeasure;
 WHERE
        MinOneOffset: EXISTS(LowerOffsetValue) OR EXISTS(UpperOffsetValue);
END_ENTITY;

ENTITY IfcMaterialProfileWithOffsets
 SUBTYPE OF (IfcMaterialProfile);
    LowerOffsetValue : OPTIONAL IfcLengthMeasure;
    UpperOffsetValue : OPTIONAL IfcLengthMeasure;
 WHERE
        MinOneOffset: EXISTS(LowerOffsetValue) OR EXISTS(UpperOffsetValue);
END_ENTITY;

alternative is to only change ARRAY to LIST also here (minor impact, less explicit)

ENTITY IfcMaterialLayerWithOffsets
 SUBTYPE OF (IfcMaterialLayer);
    OffsetDirection : IfcLayerSetDirectionEnum;
    OffsetValues : LIST [2:2] OF IfcLengthMeasure;
END_ENTITY;

ENTITY IfcMaterialProfileWithOffsets
 SUBTYPE OF (IfcMaterialProfile);
    OffsetValues : LIST [2:2] OF IfcLengthMeasure;
END_ENTITY;

plus some usage in derived (not directly exchanged) attributes and functions for b-spline curves and surfaces. This might need to be checked in detail, but would not be a show stopper.

TLiebich commented 4 years ago

check list

what do we win

what do we loose

schema impact

instance model impact

backwards compatible

automatic migration possible

TLiebich commented 4 years ago

I created a new label "validated proposal" meaning, that this issue has been thoughougly checked and is seen as a valid candidate to be included into upcoming version of IFC (sort of "pre-authorized" pull request).