Remove "parent" property?

clausnagel commented 5 years ago

2nd-level CityObjects like "BuildingInstallation" offer a "parent" property of type string which is the ID of the parent CityObject. It is assumed that every 2nd-level CityObject has exactly one parent, and this is fine for the CityJSON core objects.

But what about extensions? The EnergyADE, for instance, defines a feature type called Construction to define the structure, material and energy-related properties of walls or roofs. As many walls in a building are of the same type, their Construction only needs to be defined once in a dataset and then can be reused by the different wall objects. This also helps to keep the file size small.

So, when mapped to a CityJSON extension, "Construction" would be an example of a CityObject having multiple parents. A simple string would not be sufficient anymore. I see three possibilities to solve this:

Keep the restriction that a CityObject can have at most one parent. As a consequence, CityObjects like "Construction" cannot be reused. Instead, every wall would have to reference its own "Construction" copy as child even if all these copies would have identical properties and values. This would also increase file size.
Redefine "parent" to be an array of strings to be able to reference the IDs of multiple parents.
Completely remove the "parent" property from all CityObjects. As a consequence, the bi-directional associations between CityObjects would become one-directional (from the parent to the child via the "children" property).

I propose option 3. When reading a CityJSON dataset into main memory, it is really simple to internally create pointers from a child to its parent(s) based on the "children" property in case an application really needs to navigate associations in both ways. Moreover, having both "children" and "parent" as properties in the dataset increases the possibility of inconsistencies between them.

hugoledoux commented 5 years ago

The issue I see is that we identify a 2nd-level CityObject if it has the property "parent"... it's pretty useful when scanning a file to count number of CO (eg Buildings): you just count those not having a "parent" property (and thus BuildingPart not counted as a CO). Of course one can compute these rather easily, but I still think it's elegant and simple.

What we could do is have "parent" be an array. That would solve the problem with Extensions, and it would also harmonise the property with that of "children". Now one is array the other is string, but both arrays of strings would mean less error-prone maybe.

clausnagel commented 5 years ago

Assume an extension that defines a new CityObject called "Parcel" that has "Building" objects located on the parcel as "children". This extension might also add a "parent" extension property to "Building" so that each "Building" references its parcel. In a corresponding dataset, both "Building" and "BuildingPart" would thus have a "parent" property, and you could not just count those not having a "parent" property.

hugoledoux commented 5 years ago

okay, I had a plan... and you bombed that plan! But you're totally right, I hadn't thought of this...

I still think that parent should be an array (thus it becomes parents) because it's useful for like cjio and in QGIS. Say a user selects with the mouse a BuildingPart, and wants to export that, you would want the whole Building to be exported (we flatten the schema, but the user shouldn't be a victim of this), and not 'split' city objects apart, right? So parents can help us there.

I still think it's useful to know which city objects can "exists-by-themselves", like a Building can, even if it has a Parcel parent, but a BuildingInstallation cannot. So we could just have a property called something intelligent for these (top-object? or something like this). Not for the core city objects, but the ones defined in ADEs.

clausnagel commented 5 years ago

+1 for making parent an array and for renaming it to parents. I think this is the best way to move forward.

In the CityGML 3 development, we also recognize the need to tag top-level feature types. Just as you wrote in your post, one of the reasons is to be able to identify top-level ADE features in software. In a first draft, this was implemented in the conceptual model by introducing another abstract super class, so that a city object could, for instance, be derived from AbstractCityObjectType and AbstractTopLevelFeatureType to denote that it is top-level. However, this led to multiple inheritance. So the current draft uses a UML stereotype instead to tag top-level feature types. This stereotype could be easily mapped to an attribute in an encoding. So your proposal is very inline with this discussion.

hugoledoux commented 5 years ago

@clausnagel I updated the specs, the datasets, and the Extension description. "parents" is now used, and for Extensions it is mandatory to put a property "toplevel": true/false for each new City Object. I didn't enforce this for the objects of the core model, these can be hardcoded (cjio did this for instance).

If you agree then I'll release v0.9, ok?

clausnagel commented 5 years ago

Sounds great. Thanks.

cityjson / specs

Remove "parent" property? #38