citygml4j / citygml-tools

Collection of tools for processing CityGML files
Apache License 2.0
115 stars 18 forks source link

Conversion "BuildingInstallation" from GML to JSON #7

Closed hugoledoux closed 4 years ago

hugoledoux commented 5 years ago

I downloaded a small part of Rotterdam-3D (https://www.dropbox.com/s/mcf5jen1hper0nh/9300_4400.gml?dl=0) and if I convert with citygml-tools then the CityJSON file contains BuildingInstallations that have an empty geometry, while in CityGML there is a geometry.

See eg gml:id=UUID_d3aa9299-874b-405c-98c1-60f852445365

Actually, in CityJSON the geometry of the Installation is in the building itself, and the geometry of the Installation is empty.

A non-documented feature or a bug?

clausnagel commented 5 years ago

Bug :-) Thanks for reporting.

clausnagel commented 5 years ago

Interesting example. What would you expect the CityJSON output to look like?

The surfaces of the BuildingInstallations are referenced from the gml:Solid of the corresponding Building feature in CityGML. Thus, in CityJSON they must also be part of the solid geometry of the building itself, otherwise the solid would have holes and thus would be invalid. And since they are semantic surfaces, they are listed in the "semantics" property of the building geometry.

According to your issue, the surfaces should also appear as separate multisurface geometry of the CityJSON "BuildingInstallation" object. And since they are still semantic surfaces, I assume that this geometry also must have a "semantics" property listing the surfaces.

This means that the surfaces and the "semantics" properties of the building geometry and of the installation geometry are partly redundant. I have created an example of how I think the CityJSON result should look like for one of the CityGML buildings: ID_0599100000223738.zip

Do you agree with this, or would you say that the redundant information must be avoided (if so, how?).

clausnagel commented 5 years ago

I loaded the above CityJSON file into the CityJSON web viewer. It seems that the redundant surfaces in the geometry of the building and of the building installation are rendered twice. Am I right with this? I think this is rather unwanted?

hugoledoux commented 5 years ago

I had never thought about this, interesting example indeed.

One question: if the BI was an antenna (thus not formed of semantic surfaces) then your code would put those geometries in the BI in CityJSON, right?

Just a few thoughts because I don't know what we should do here, just thinking out loud:

Your thoughts?

balazsdukai commented 5 years ago

Very interesting issue. initially I've written a long argument on dormers not being BuildingInstallations and thus creating a new semantic class for them, but the more I think about this, the more I believe that in the current case, where the BuildingInstallation is a dormer, the CityJSON should:

The union of the two yields a valid Solid in the current case. However, in case the BuildingInstallation is not a structural part of the building (removing it doesn't create a hole), then their union won't be a valid Solid (eg. airco units). Thus difficult to validate the citymodel with BuildingInstallations. But I think this is a different discussion.

And to generalize, I would always exclude the geometry of the BuildingInstallation from the geometry of its parent.

clausnagel commented 5 years ago

I had quite similar first thoughts for a general CityJSON representation. I would just be a bit more strict regarding solids:

So, if a dormer should be an identifiable object, it shall be modelled as "BuildingInstallation" and then must follow the above rules. Alternatively, if you just want to represent the dormer geometrically, then it is only modelld as part of the "Building" geometry and there is no additional "BuildingInstallation". I would not introduce an additional semantic "DormerSurface".

The exact same modelling is possible (but not enforced) in CityGML 1.0 / 2.0. I can adapt citygml-tools so that CityGML data structured this way can be losslessly converted into CityJSON and back. This would be my preferred default case.

Now, the above Rotterdam data does not follow this modelling, but yet is a valid CityGML representation with valid solid geometries. I could think of the following mapping:

Well, I do prefer the first option. Yes, the resulting CityJSON would lack the closed volume geometry from the original CityGML file, but it would be valid.

hugoledoux commented 5 years ago

I've been thinking about this, and I like neither of your 2 options, sorry.

I know that these are CityGML-compliant, but both reduce the usability of the data in my opinion. (I agree that having a ClosureSurface would solve it elegantly, but as you point out it's complex to crunch the data and out-of-scope for citygml-tools).

I would argue that having solids (thus watertight) for buildings is far more important in practice than being able to count the number of dormers. Perhaps I am wrong, but who and why does one need to count dormers? While having a watertight solid allows us to (1) calculate volumes, (2) define clearly what the interior/exterior is (and thus convert easier to other formats).

If we labelled the surfaces as DormerSurface then we could, not that simply, still be able to count the number of dormers.

So I would actually favour the status quo: dormers are part of the roof.

Also, notice that identifying where a dormer "starts" and/or taking it "apart" from the main building to create a BuildingInstallation can be tricky in practice, see these 2 cases:

clausnagel commented 5 years ago

Ok, I agree that identifying dormers in practice can be tricky. I would leave it to the user to decide based on his or her requirements and applications whether dormers should be represented as identifiable objects in the model. So I would suggest to skip this more general discussion here.

Regarding the data conversion: Yes, I absolutely agree with the advantage of having a watertight building geometry. My last post was just trying to identify general requirements for modelling parent and child objects, and what that would mean for converting the dataset in question.

So, based on the current result of citygml-tools, the empty BuildingInstallation could be simply deleted from the final output. Et voilá, the result is a wateright building geometry with classified roof and wall surfaces.

The information about the dormer would then be missing. DormerSurface might help, but I have open questions:

I am really open to this. What would be your preferred representation?

And one more thought. In my first post I attached a CityJSON file where the dormer is represented as BuildingInstallation and the building geometry is watertight. This solves all above questions, because the dormer geometry can still be classified as roof and walls, and attributes can be assigned to both the BuildingInstallation and its semantic surfaces. The only drawback of this file is that the surfaces of the dormer are redundantly represented for the BuildingInstallation and the Building.

clausnagel commented 5 years ago

btw, sorry for the lengthy posts :-)

balazsdukai commented 5 years ago

Yes, I think we agree that watertight geometry is a priority. I would also add that having a data model/format that is easy to parse is at least as important. These two properties allow the users to at least have data set that they can read. Then its up to them to develop the code that identify the dormers or do any other analysis in this data. That's why I'm very confident about adding any Dormer* semantic. Because if we go down that road, then why not include all of the other architectural elements? So I prefer the more generic BuildingInstallation, in which a the data creator can define in an attribute the object that the BuildingInstallation represents.

Following, in the case of the example dataset I wouldn't specify any DormerSurface, because it's not in the original data set either. Although we identified that the installation is a dormer in fact by looking at model, but this is not possible to automate.

Then in a "minimal" alternative, I would keep the Bulding object as it is in the example that you sent @clausnagel, where the BuildingInstallation is part of the Building geometry with all its semantics, and remove the BuildingInstallation object completely. Then there is no redundancy, but we would loose information. However it leads to data set that is "simpler" to process, and delegates the responsibility of identifying the dormer to the end user/developer who wants to find the dormers on the roofs.

hugoledoux commented 5 years ago

Yeah, okay we/I are/am convoluting what should be the ideal case (should there be dormers as separate objects?) and the particular case of citygml-tools here. Let's focus solely on the latter.

I don't have strong feelings because nothing is perfect.

I agree that we don't know it's a dormer so impossible in this case to label surfaces as a dormer. My point was that in an ideal case, we should add semantics surfaces to the Building, and indeed DormerRoofSurface and DormerWallSurface would be smarter than just a DormerSurface. But here it's not relevant.

I would actually drop the empty BI and just keep the Building, and not have any redundancy in this case. But the small dataset you gave above is also perfectly fine to distribute, but I think it would create a bit more confusion in practice. That's all.

And we discussed having a list of surfaces for a CityJSON file. Smart in theory, but in practice I reckon it's a mess when one updates the file. I wrote the cjio merge() and subset() operators, and if I had had to select and update and delete vertices and surfaces IDs I'd have suffered a bit more. So at this moment I favour simplicity over "richness of representation", but who knows, maybe I could change my mind 😆

clausnagel commented 4 years ago

Sorry for being unresponsive. I have worked on this issue in the meantime though, and decided to fix it in the following way:

  1. The default behaviour of the to-cityjson command is to map all CityGML top-level and child objects to corresponding CityJSON objects. Geometries and semantic surfaces will also be created for child objects even if this results in duplicate geometries.
  2. The user can choose to avoid duplicate geometries for child objects with the new --remove-duplicate-child-geometries option. In case a child object has an empty geometry after removing duplicates, it will be removed from the CityJSON output.

Thus, the small dataset I provided as example above is now the default result. It is as close as possible to the original CityGML dataset as it still contains both the building and the building installation as well as the watertight volume geometry of the building.

If a user favours non-redundant geometry over mapping all features, then the --remove-duplicate-child-geometries option is the solution.

The bugfix is available in the latest version 1.3.2.