Only allow specific well defined relation types as geometries in the Feature Classes on import

mboeringa commented 7 years ago

Hi @ThomasEmge ,

I have noticed with the latest build, that certain ill-defined relation types like:

type=cluster
type=multiobject
type=site
etc. (this list is very much longer, see http://wiki.openstreetmap.org/wiki/Types_of_relation)

actually can make their way into shapes / geometries in the Polygon, Polyline and Point feature classes.

I don't know if this is part of the "super-relation" processing, but allowing these to result in geometries, causes a number of issues:

Most of these relation types allow multiple "geometry types" to be part of the same relation (Polygon, Polyline and Point in GIS, but of course nodes, ways and multipolygons in OSM). This means a serious issue in interpretation, especially for OSM ways: where do the (nodes/)ways need to end up: in the Polygon / Polyline or Point table? The (cartographic) interpretation and representation of these relations is really problematic, although one solution with probably ambiguous results might be to direct the different members to the respective geometry table based on the interpretation of the feature as being a line or polygon.
Since relations kind of go "on-top" of the original nodes (closed) ways or multipolygons, they may duplicate tags that are already present on the relation members, causing multiple features / geometries with the same or similar tag sets to appear in the rendered map on top of each other.

My suggestion would be to allow only a small subset of well defined and established relations to make their way into the final output Feature Classes of the import tools - and especially the OSM File Loader (Load Only) tool - and to ignore / drop all other types. As far as I have understood it, osm2pgsql also only allows a very limited set of relation types to result in geometries in the database.

E.g. relations of the following types seem most appropriate for rendering, and usually have a rather clear interpretation and single geometry type to represent them with:

type=multipolygon should only result in features in the Polygon feature class. This seems to be OK now
type=route should end up in the Polyline (line) feature class. I think this is also OK now.
type=boundary should end up in the Polygon feature class, unless the relation members (that should always be ways), can not be assembled into a valid Polygon geometry, in which case they could probably be added to the Polyline feature class. I think the current behaviour is already close to this?? (to be honest, I never really had a look at the code of the Editor)
type=building relations should only have the relation member with role=outline - which can be a closed way or multipolygon - added to the Polygon feature class. We already worked on this issue and I think the current behaviour is OK.
type=waterway should end up in the Polyline (line) feature class. This seems OK.

@ThomasEmge , what are your ideas about this, and do you have any insights to share about how relations in general are now being interpreted, also related to the relation and super-relation processing I see happening in the output of the OSM File Loader (Load Only) tool?

ThomasEmge commented 7 years ago

I haven't looked to deeply into this issue. The current approach is that everything is translated into a geometry if possible. Some do get special treatment such as multipolygon or route or coastline, but that's about it. The fundamental challenge is the polymorphic behavior of relations. My initial suggestion would probably be to be inclusive for the types rather than exclusive. For the load only tool the 'type' tag is a default attribute, so you can filter out the problematic features at the time of rendering.

mboeringa commented 7 years ago

The current approach is that everything is translated into a geometry if possible. Some do get special treatment such as multipolygon or route or coastline, but that's about it.

Does that mean that if a relation consists of members consisting of nodes, some unclosed ways, and closed ways with area=yes tag or multipolygons, that in that case:

The member nodes end up in the Point feature class
The member unclosed ways in the Polyline feature class
The member closed ways with area=yes tag or multipolygons in the Polygon feature class?

And that all have the original relation ID set in the records of the respective feature class? I haven't actually properly checked up to now what happens with these kind of "polymorphic" relations...

The fundamental challenge is the polymorphic behavior of relations.

I agree, this is a major challenge

My initial suggestion would probably be to be inclusive for the types rather than exclusive. For the load only tool the 'type' tag is a default attribute, so you can filter out the problematic features at the time of rendering.

As another option: how difficult would it be for you to implement a simple checkbox option, something like

"Only convert well established relation types"

and that defaulting to the list I specified?

Or maybe even having an editable string parameter in the tool with this default shown, that users could edit to add their desired relation types, and then dropping any other relation types not specified?

This way, you would get the best of both worlds, while still allowing to weed out unwanted stuff at conversion time. The problem with having to do this in SQL afterwards is that the potential list of relation types is endless, and would make the already complex SQL even more unwieldy... and would require to add this statement to every single render rule in a style.

mboeringa commented 7 years ago

@ThomasEmge ,

Hmmm... my latest thoughts on this are that I possibly need to include the suggested functionality in my Renderer as well, as part of the pre-processing stage. I am thinking of making the list of allowed relation types a style property, so configurable using the Modify Style tool.

I will then use this information to exclude any non-conforming relation types in pre-processing using the type key, so the SQL of the style's render rules remains clean of these statements.

However, I am still a bit in doubt. It will mean another export / copy step of a feature class, as deleting the selected records is probably to expensive and / or slow on mega-large datasets. This will be yet another feature adding complexity to the render models.

In that respect, not creating them in the first place during conversion, is the most efficient solution.

Esri / arcgis-osm-editor

Only allow specific well defined relation types as geometries in the Feature Classes on import #164