buildingSMART / NextGen-IFC

Normalize (combine) equal type instances to remove redundant information #30

Open HerbertDobernig opened 4 years ago

HerbertDobernig commented 4 years ago

Description of the proposal: Establish an implementation requirement to avoid redundant information in IFC containers (such as IFC files) by normalizing equal type instances.

Example: Some current implementations write a separate window-type definition for each window instance: n windows of the same type T => n equal window-type definitions of type T

Proposed implementation requirement: n windows of same type T => one window-type definition of type T

Describe how it contributes to the objectives set in https://github.com/buildingSMART/NextGen-IFC/wiki/Towards-a-technology-independent-IFC: Reduces processing time and document (file) size in serialized data. Encourages normalization in the databases of CDEs and authoring software, thereby improving efficiency.

What do we win: smaller document (file) size; reduced processing time

What do we lose: redundancy that results in oversized documents (files)

Schema impact: none

Instance model impact: ?

Backwards compatible: YES

Automatic migration possible: NA

Additional implications:

- Note that not all points need to be satisfied! Backwards compatibility and file size are not concerns.

stefkeB commented 4 years ago

While I agree with this proposal, we have to be aware that this makes a bigger and more strict distinction between instances or occurrences and types (shared between the occurrences). There are properties and property sets that you may need at type level and others at the instance level, sometimes depending on the situation.

What this proposal implies is that you don't accept two types with equal information (values/attributes) if they have the same name.

Can the instance and the type use the same property set? And even have the same properties? IIRC, instance level values override type level values.

This is of course nothing that didn't exist before, but with this requirement, it pays to have a common agreed implementation between vendors.
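To make the override question above concrete, here is a minimal sketch of how a reader might resolve the effective properties of an occurrence, assuming (as recalled above) that instance-level values override type-level values. The function name `effective_properties` and the plain-dict data layout are hypothetical and not taken from any IFC toolkit; the property names are only sample data.

```python
def effective_properties(type_psets, occurrence_psets):
    """Merge property sets; an occurrence value wins over the type value.

    Both arguments map a property-set name to a dict of property values.
    This layout is an assumption made for illustration only.
    """
    # Start from a copy of the type-level values so the inputs stay untouched.
    merged = {name: dict(props) for name, props in type_psets.items()}
    for name, props in occurrence_psets.items():
        # Same property set on both levels: override property by property.
        merged.setdefault(name, {}).update(props)
    return merged


# Sample data: the type carries shared values, the occurrence overrides one.
type_psets = {"Pset_WindowCommon": {"IsExternal": True, "FireRating": "EI30"}}
occurrence_psets = {"Pset_WindowCommon": {"FireRating": "EI60"}}

result = effective_properties(type_psets, occurrence_psets)
```

With this rule the same property set can appear on both levels, which is why an agreement on where each property belongs matters once types are shared.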

klacol commented 4 years ago

> This is of course nothing that didn't exist before, but with this requirement, it pays to have a common agreed implementation between vendors.

Yes, a data template is needed here that defines which property sets can be expected at type level and which at instance level.

EAzari commented 4 years ago

@HerbertDobernig I like it and as you mentioned will reduce sizes and duplicate data/information

https://forums.buildingsmart.org/t/logistic-phases-and-time-in-ifc-definitions/2432/2?u=red_code

@klacol I think you first have to choose an "architecture" and then develop the schema based on it. Which architecture? Domain-driven design (DDD)? Data-Oriented Design (DOD/A)? ...

I saw you talk about metadata and schema, and I think it's a good sign

HerbertDobernig commented 4 years ago

ADDENDUM

I described the issue incompletely, so here is an example to round off the description:

Real buildings mostly have numerous windows of only a few designs. In the IFC model I expect to find one window design description (Representation, IfcWindowType, window design Properties) for each window design, referenced by all the windows of that design.

reasonable: window design information is stored once and is referenced by n windows of the same window design

unfavorable (but obviously implemented that way by a common authoring-software): window design information is repeated n times thereby establishing a 1-to-1 relationship between each window and its own copy of the design information. => IFC files become bloated (from 36kByte to 384kByte in a test-case with 64 windows) without providing additional information.

Proposed IFC implementation guideline: element design information := [Representation, IfcType, and design Properties of the element] that several elements have in common shall be present only once in the IFC file (NOT repeated).
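The guideline above can be sketched as a small normalization pass. This is only an illustration under stated assumptions: each window carries its own copy of the design information as a plain dict, two copies count as equal when all their values are equal, and the names `normalize_types`, `unique_types` and `assignment` are hypothetical.

```python
import json


def normalize_types(type_copies):
    """Collapse equal design descriptions into one shared entry.

    type_copies maps an occurrence id to its own copy of the design
    information (assumed here to be JSON-serializable plain dicts).
    Returns the list of unique designs and, per occurrence, the index
    of the design it should reference.
    """
    unique_types = []
    index_by_key = {}
    assignment = {}
    for occurrence_id, design in type_copies.items():
        # Canonical text form, so key order does not affect equality.
        key = json.dumps(design, sort_keys=True)
        if key not in index_by_key:
            index_by_key[key] = len(unique_types)
            unique_types.append(design)
        assignment[occurrence_id] = index_by_key[key]
    return unique_types, assignment


# Four windows, each with its own copy, but only two distinct designs.
type_copies = {
    "W1": {"name": "T1", "width": 1.2, "height": 1.5},
    "W2": {"height": 1.5, "width": 1.2, "name": "T1"},
    "W3": {"name": "T2", "width": 0.9, "height": 1.5},
    "W4": {"name": "T1", "width": 1.2, "height": 1.5},
}

unique_types, assignment = normalize_types(type_copies)
```

Here the four copies collapse to two stored designs, and every window keeps a reference to the one it uses.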

klacol commented 4 years ago

Good description. In my words:

Every export of an IFC file shall contain type objects based on the objects in the model, even if the original model does not contain those type objects. The export shall group the objects by all unique property values, create type objects on the fly, and relate the grouped objects to these type objects.

We could think about a threshold of more than one grouped object, but for the sake of clarity I would generate those type objects even if they have just one related object.
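A rough sketch of that export step, assuming occurrences arrive without type objects and carry flat, hashable property values. The function `derive_types`, the generated type names and the `min_group_size` parameter (standing in for the threshold mentioned above) are hypothetical.

```python
from collections import defaultdict


def derive_types(occurrences, min_group_size=1):
    """Group occurrences by all property values and create one type per group.

    occurrences maps an occurrence id to a flat dict of property values.
    With min_group_size=1 a type is generated even for a single occurrence.
    """
    groups = defaultdict(list)
    for occurrence_id, props in occurrences.items():
        # Sorted items give a key that ignores property order.
        groups[tuple(sorted(props.items()))].append(occurrence_id)

    types = []
    for key, members in groups.items():
        if len(members) >= min_group_size:
            types.append({
                "name": f"GeneratedType-{len(types) + 1}",
                "properties": dict(key),
                "occurrences": members,  # the objects related to this type
            })
    return types


occurrences = {
    "W1": {"Width": 1.2, "Height": 1.5},
    "W2": {"Height": 1.5, "Width": 1.2},
    "W3": {"Width": 0.9, "Height": 1.5},
}

types = derive_types(occurrences)  # includes a type with one occurrence
shared_only = derive_types(occurrences, min_group_size=2)
```

Keeping the threshold at one means every occurrence ends up with a type, which keeps the exported structure uniform at the cost of some types that are used only once.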

berlotti commented 4 years ago

There seem to be two discussions here: @klacol has a strong focus on 'there should be a type object, even if the type object has just one related object'. The original topic from @HerbertDobernig was 'when there are multiple typeObjects that are the same, they should be grouped and the multiple objects should refer to the one/single typeObject'

Both are good and valid implementer agreements (the standard probably already mandates the described use), and we should prioritize this a bit more during certification (although difficult to automatically check).

This issue is however an implementation issue. Please take into account what we are trying to do here: create a technology independent IFC, remove inconsistency in the schema, etc.
I'm sure you guys have contributions about linked lists, selects, etc.