buildingSMART / NextGen-IFC


IDs for all #26

Closed pipauwel closed 4 years ago

pipauwel commented 4 years ago

At the moment, only the Entities underneath the IfcRoot Entity class have a GlobalId or Identifier. Not everything is a subclass of IfcRoot, however. As a result, several entities and types do not carry an identifier. It may be a good idea to allow / require IDs for every IFC concept (equal right for identity).

This would be helpful for translations to XML and JSON in particular, because they need such identifiers to be able to make references across the tree. Example:

    {
      "Class": "Door",
      "GlobalId": "f3b96025-a1f3-42a8-b047-b6cc5b1880ff",
      "Name": "'A common door'",
      "Description": "'Description of a standard door'",
      "Representation": {
        "Class": "ProductDefinitionShape",
        "Representations": [
            {
                "Class": "ShapeRepresentation",
                "ref": "dc12a77c-c560-45e3-af0f-e84f5afbe844"
            },
            {
                "Class": "ShapeRepresentation",
                "ref": "29820bc5-4ddb-4ce4-a65a-e9c936997256"
            }
        ]
      }
    }

pipauwel commented 4 years ago

This would also greatly help translation to RDF graphs, in which basically anything except a literal (XSD datatypes mostly) requires a URI. Instead of us having to compute UUIDs on the fly, based on other neighbouring UUIDs, we could use IDs that are available for any IFC concept other than the basic ones (strings, literals, sets, lists, and so on).
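As a rough illustration of what "computing UUIDs on the fly, based on other neighbouring UUIDs" can look like, here is a minimal Python sketch. It assumes the owning entity's GlobalId is available as a standard UUID string (as in the example above, not the 22-character compressed IFC form), and the attribute-path strings and the `derived_id` helper are hypothetical names chosen for this sketch, not part of any IFC tooling.

```python
import uuid


def derived_id(parent_global_id: str, path: str) -> uuid.UUID:
    """Derive a stable UUID for an entity that has no GlobalId of its own.

    parent_global_id: GlobalId of the nearest entity that does carry one
                      (assumed here to be a plain UUID string).
    path:             hypothetical attribute path from that entity to the
                      identifier-less one.
    """
    namespace = uuid.UUID(parent_global_id)
    # uuid5 is deterministic: the same namespace and name always give the same UUID.
    return uuid.uuid5(namespace, path)


door = "f3b96025-a1f3-42a8-b047-b6cc5b1880ff"
shape_id = derived_id(door, "Representation")
rep0_id = derived_id(door, "Representation/Representations[0]")
```

The derived identifiers are repeatable across exports, but they change as soon as the path or the owner changes, which is part of why identifiers carried by the entity itself would be simpler.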

hlg commented 4 years ago

In order to cross-reference entities in a tree-like or flat representation in a defined context (be it document, database, computer memory), there is no need for globally unique identifiers. It is sufficient to have uniqueness local to the context.

In current IFC (and models in other domains as well) I see a clear and reasonable semantic distinction between entity types that can exist on their own without being referenced from anywhere else in the context (hence the name "root") and types that can only exist as resources referenced from within the context (hence everything in the "resource layer"). The former have global identity, the latter only local.

The requirement for GUIDs everywhere seems very specific to the semantic web. For every other implementation method (SPF, XML, JSON ...) it does not seem necessary. Do we really need and want to force a globally unique ID on every single IfcCartesianPoint(0.,0.,0.)?

pipauwel commented 4 years ago

"In order to cross-reference entities in a tree-like or flat representation in a defined context (be it document, database, computer memory), there is no need for globally unique identifiers. It is sufficient to have uniqueness local to the context."

True. I have always argued, as one of the few, for such local identifiers in the semantic web world, simply using inst:Wall_47 and similar. The 'inst:' prefix (namespace) plus the line number ensures local uniqueness. The vast majority on the web prefer a GUID in the URI, however. I don't really mind either way for the semantic web case, which is less relevant to this thread. They will manage.

But... for XML and JSON... IfcProductDefinitionShape does not have a GUID. So how would you suggest to do the above example then? With a cocktail of line number references and GUIDs?

    {
      "Class": "Door",
      "STEPFileLineNumber": "i1873",
      "GlobalId": "f3b96025-a1f3-42a8-b047-b6cc5b1880ff",
      "Name": "'A common door'",
      "Description": "'Description of a standard door'",
      "Representation": {
        "Class": "ProductDefinitionShape",
        "Representations": [
            {
                "Class": "ShapeRepresentation",
                "href": "i2087"
            },
            {
                "Class": "ShapeRepresentation",
                "href": "i2089"
            }
        ]
      }
    }

Or by encapsulating all geometry directly within the tree and prohibiting references to geometry defined elsewhere?

There is also this example in XML that uses line numbers instead of GUIDs everywhere. I think this is current Revit XML. The internal line-number ID referencing mechanism is used throughout; the GUID is an unused attribute.

        <IfcApplication id="i1641">
            <ApplicationDeveloper>
                <IfcOrganization xsi:nil="true" ref="i1637"/>
            </ApplicationDeveloper>
            <Version>2019</Version>
            <ApplicationFullName>Autodesk Revit 2019 (ENU)</ApplicationFullName>
            <ApplicationIdentifier>Revit</ApplicationIdentifier>
        </IfcApplication>
        <IfcCartesianPoint id="i1642">
            <Coordinates exp:cType="list">
                <IfcLengthMeasure exp:pos="0">0.</IfcLengthMeasure>
                <IfcLengthMeasure exp:pos="1">0.</IfcLengthMeasure>
                <IfcLengthMeasure exp:pos="2">0.</IfcLengthMeasure>
            </Coordinates>
        </IfcCartesianPoint>
        <IfcCartesianPoint id="i1645">
            <Coordinates exp:cType="list">
                <IfcLengthMeasure exp:pos="0">0.</IfcLengthMeasure>
                <IfcLengthMeasure exp:pos="1">0.</IfcLengthMeasure>
            </Coordinates>
        </IfcCartesianPoint>

berlotti commented 4 years ago

IfcProductDefinitionShape is a resource. Which of the resources would you say should be in the IfcRoot inheritance tree (because they need a GUID in the use-cases)?

pipauwel commented 4 years ago

A resource, yes, but without a GUID:

[Image: PDS]

I don't have a clear list of additional resources that could really use a UUID, but IfcProductDefinitionShape is one. I am sure that more can be found.

I am quite fine with not adding a UUID everywhere, but the alternative, namely using line numbers as secondary indexes, is not really an option, in my opinion, especially when moving to a transactional world in which 'line numbers' and 'files' don't even exist. So, the following should then not be allowed:

<IfcOrganization xsi:nil="true" ref="i1637"/>
<IfcCartesianPoint id="i1645">
...

hlg commented 4 years ago

The numbers after the leading hashes of entity instances in SPF files are not actually line numbers. They are "entity instance names". If I am not mistaken, whitespace (including line breaks) is not syntactically significant in SPF. You can have no line breaks at all or an entity instance running over several lines. It is just convention to have each instance on one line and to use sequential numbers for the names.
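To make that concrete, here is a small Python sketch that reads instance names from an SPF-like fragment regardless of line layout. The fragment is invented for illustration, and the regular expression is a deliberate simplification: it assumes no ';' occurs inside string values, which real SPF files allow, so it is not a substitute for a proper parser.

```python
import re

# Hypothetical SPF DATA fragment: two instances share one line, one instance
# spans several lines, and the names are neither sequential nor line numbers.
SPF = """
#10=IFCCARTESIANPOINT((0.,0.,0.));#500=IFCDIRECTION((0.,0.,1.));
#7 = IFCAXIS2PLACEMENT3D(
    #10,
    #500,
    $);
"""

# Simplified pattern for "#<name> = <TYPE>(<args>);" (see the caveat above).
INSTANCE = re.compile(r"#(\d+)\s*=\s*([A-Z0-9_]+)\s*\((.*?)\)\s*;", re.DOTALL)


def parse_instances(text):
    """Map each entity instance name to (type name, referenced instance names)."""
    instances = {}
    for match in INSTANCE.finditer(text):
        name, type_name, args = match.groups()
        references = [int(ref) for ref in re.findall(r"#(\d+)", args)]
        instances[int(name)] = (type_name, references)
    return instances


instances = parse_instances(SPF)
```

Here instance #7 sits on physical lines 3 to 6 of the fragment and still resolves its references to #10 and #500, which both sit on line 2.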

In fact, these entity instance names are local identifiers, which any other implementation method can implement differently, for instance as an id attribute in XML. They are just for internal reference, while GlobalIds (GUIDs, UUIDs) are reserved for those entities that need to be referenced externally. Internal and external are relative to the context, which may be a file or something else.
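A minimal Python sketch of such document-local referencing, modelled loosely on the XML excerpt quoted earlier in the thread. The wrapper element, the content of the IfcOrganization element, and the `resolve` helper are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down fragment; "doc" and the Name value are invented.
DOC = """
<doc xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <IfcOrganization id="i1637">
    <Name>Example Org</Name>
  </IfcOrganization>
  <IfcApplication id="i1641">
    <ApplicationDeveloper>
      <IfcOrganization xsi:nil="true" ref="i1637"/>
    </ApplicationDeveloper>
  </IfcApplication>
</doc>
"""

root = ET.fromstring(DOC.strip())

# The ids only need to be unique within this one document (the "context").
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}


def resolve(element):
    """Return the element a 'ref' attribute points to, or the element itself."""
    ref = element.get("ref")
    return by_id[ref] if ref is not None else element


developer = root.find("./IfcApplication/ApplicationDeveloper/IfcOrganization")
target = resolve(developer)
```

Nothing in this lookup depends on the ids being globally unique or on their resembling line numbers; any scheme that is unique per document would work the same way.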

pipauwel commented 4 years ago

Okay. Sounds quite clear.

XML, JSON, RDF: local internal identifiers (i234 and similar) + GUIDs for those entities that need to be referenced from outside.
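For the JSON case, that combination could look roughly like the Python sketch below. The key names "id" and "ref", the "data" wrapper, the i2090 identifier, and the RepresentationType values are assumptions made for this sketch, not an agreed serialization.

```python
import json

# Hypothetical document: "id" is local to this document; only the Door,
# which must be referable from outside, also carries a GlobalId.
DOC = json.loads("""
{
  "data": [
    {"Class": "Door", "id": "i1873",
     "GlobalId": "f3b96025-a1f3-42a8-b047-b6cc5b1880ff",
     "Representation": {
       "Class": "ProductDefinitionShape", "id": "i2090",
       "Representations": [
         {"Class": "ShapeRepresentation", "ref": "i2087"},
         {"Class": "ShapeRepresentation", "ref": "i2089"}
       ]}},
    {"Class": "ShapeRepresentation", "id": "i2087", "RepresentationType": "SweptSolid"},
    {"Class": "ShapeRepresentation", "id": "i2089", "RepresentationType": "Curve2D"}
  ]
}
""")


def index_local_ids(node, index):
    """Walk the tree and record every object that carries a local 'id'."""
    if isinstance(node, dict):
        if "id" in node:
            index[node["id"]] = node
        for value in node.values():
            index_local_ids(value, index)
    elif isinstance(node, list):
        for item in node:
            index_local_ids(item, index)
    return index


def resolve(node, index):
    """Replace a {'ref': ...} stub by the object it points to."""
    if isinstance(node, dict) and "ref" in node:
        return index[node["ref"]]
    return node


local = index_local_ids(DOC, {})
door = DOC["data"][0]
reps = [resolve(r, local) for r in door["Representation"]["Representations"]]
```

The ProductDefinitionShape is addressable inside the document through its local id without ever needing a GlobalId.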