opennewzealand / linz2osm

Some tools for helping move LINZ data into OpenStreetMap
http://wiki.openstreetmap.org/wiki/LINZ
GNU General Public License v3.0

Polygon/Line Splitting behaviour #80

Open rcoup opened 12 years ago

rcoup commented 12 years ago

Large Polygons and Multipolygons are currently implemented as:

MultiLineStrings/GeometryCollections are yielded as separate features by the looks of it (but I doubt there are any)

Changes:

Polygons > 990 nodes, Polygons with more than one ring, and MultiPolygons should be implemented as per: http://wiki.openstreetmap.org/wiki/Relation:multipolygon

Lines with >990 nodes should be implemented as multiple distinct ways, sharing start/end nodes for each segment.

MultiLineStrings should be implemented as per: http://wiki.openstreetmap.org/wiki/Relation:multilinestring
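The line-splitting rule above could be sketched roughly as follows. This is only an illustration of the proposal, not the code in linz2osm; the function name `split_way` and its `max_nodes` parameter are hypothetical, and nodes are represented as a plain list of IDs.

```python
def split_way(node_ids, max_nodes=990):
    """Split one long line into several ways of at most max_nodes nodes.

    Consecutive segments share a node: the last node of one segment is
    repeated as the first node of the next, so the ways stay connected.
    (Hypothetical helper for illustration; not taken from linz2osm.)
    """
    if max_nodes < 2:
        raise ValueError("a way needs at least two nodes")
    node_ids = list(node_ids)
    if len(node_ids) <= max_nodes:
        return [node_ids]

    segments = []
    start = 0
    # Advance by max_nodes - 1 so each segment begins on the previous end node.
    while start < len(node_ids) - 1:
        segments.append(node_ids[start:start + max_nodes])
        start += max_nodes - 1
    return segments


if __name__ == "__main__":
    parts = split_way(range(2500))
    print([len(p) for p in parts])
```

A line of exactly 990 nodes is left as a single way, which matches the ">990" wording of the proposal.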

rcoup commented 12 years ago

Previous changes were under https://github.com/opennewzealand/linz2osm/issues/10

There was no relation:multilinestring then, and the renderers didn't deal with relation:multipolygon properly (ie. they only looked at the ways, so they were untagged and didn't get rendered)

rcoup commented 12 years ago

Existing tests in https://github.com/opennewzealand/linz2osm/blob/master/linz2osm/convert/tests/osm.py need to be updated to the new behaviour as well.

HamishB commented 12 years ago

The smaller limit was added to allow a bit of room for "helpful" users to join adjoining ways together, as they are prone to do, without exceeding the 2000 nodes-per-way limit. When that limit is exceeded it is rather ugly to clean up (been there, done that during the coastline updates; it was an ugly job I don't ever want to have to do again). Many of the nodes (up to the 2k limit) remain but lose their attachment to any way or relation, and so you end up with a broken area or way and many orphaned nodes.

The news of multilinestring support in OSM is new to me; it seems a bit odd that such things would be added without a formal change. Then again, I'm not aware of any formal process by which these things get added or formally exist other than someone making a wiki page for them in an authoritative voice, some number of people deciding to use it, and the renderers actually respecting it (aka good luck if you're not using their copy of mapnik to work with the data). Is relation:multilinestring actually formally "approved" by a technical committee in any way, and in fact non-experimental? The only other reference to it I could find just mentioned it as "semi-proposed". I would use it in place of other relation tagging only with extreme caution; it may just be someone's random idea.

put the tags on the relation, not the ways

as noted in email, that's quite popular in osm, but a mistreatment of the data model. The relation has to do with relating the topology of objects and feature IDs; it exists at a different level of abstraction than things on the ground, and its domain is relating the database features. The ways themselves exist on the ground in lat/lon space, and attribute tags exist on that same level of abstraction. The previous compromise around this issue was to tag both the relation and the way, which leads to a bit of double handling for edits (not so good) but doesn't leave orphaned & untagged ways if the relation is damaged by a clumsy or unskilled 3rd party user (thus is more robust).

The bytes saved by creating fewer ways and only tagging the relation are on the whole inconsequential; since this is not part of the API, there is no right answer to any of these questions, just last-person-to-edit-wins on the wiki pages for advice. I would argue to support the lowest common denominator & thus the widest group of rendering and processing tools, and to do whatever we can to keep the tagging <-> feature connections as robust as possible.

thanks, Hamish
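To make the headroom argument concrete, here is a minimal sketch of the arithmetic. The 2000-node limit and the 990-node split size come from this thread; the helper `join_ways` is hypothetical and simply models a user merging two ways that share an end node.

```python
MAX_WAY_NODES = 2000  # nodes-per-way limit mentioned in this thread
SPLIT_LIMIT = 990     # split size used by the importer


def join_ways(first, second):
    """Merge two ways that share an end node into one node list.

    Hypothetical helper: models an editor joining adjoining ways.
    """
    if first[-1] != second[0]:
        raise ValueError("ways do not share an end node")
    # The shared node appears only once in the merged way.
    return first + second[1:]


if __name__ == "__main__":
    a = list(range(0, SPLIT_LIMIT))                    # 990 nodes
    b = list(range(SPLIT_LIMIT - 1, 2 * SPLIT_LIMIT - 1))  # 990 nodes, shares one
    merged = join_ways(a, b)
    print(len(merged), len(merged) <= MAX_WAY_NODES)
```

Two full-size 990-node ways merge into 1979 nodes, which still fits under the limit; with a 1990-node split size the same merge would need 3979 nodes and could not be stored as one way.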

rcoup commented 12 years ago

The smaller limit was added to allow a bit of room for "helpful" users to join adjoining ways together

Yup, hence 990 rather than 1990 :) Also avoids creating relations...

There's no multilinestrings in the current data, so that part is a bit academic.

put the tags on the relation, not the ways

as noted in email, that's quite popular in osm, but a mistreatment of the data model.... The previous compromise around this issue was to tag both the relation and the way,

The docs pretty clearly state to tag the relation and not the ways for:

Things seem to have improved quite significantly since we reviewed this in 2010. It's been flagged on the imports list as a problem. And the renderers all seem to support it correctly (which they didn't in 2010).

I don't see how it's a mistreatment of the data model. Ways are limited to 2000 nodes, and a series of ways need to be related as a single "feature on the ground" -- the feature in this case is the relation, not a part of a ring.
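As a rough sketch of what "the feature is the relation" looks like in OSM XML, the snippet below builds a multipolygon relation whose members are the component ways and whose descriptive tags sit on the relation itself. The function name, the negative placeholder IDs, and the `natural=wood` tag are assumptions chosen for illustration; this is not the linz2osm implementation.

```python
import xml.etree.ElementTree as ET


def multipolygon_relation(rel_id, outer_way_ids, inner_way_ids, tags):
    """Build a <relation> element for a multipolygon.

    Hypothetical helper: member ways get an outer/inner role, and all
    descriptive tags are placed on the relation rather than on the ways.
    """
    rel = ET.Element("relation", id=str(rel_id))
    for role, way_ids in (("outer", outer_way_ids), ("inner", inner_way_ids)):
        for way_id in way_ids:
            ET.SubElement(rel, "member", type="way", ref=str(way_id), role=role)
    ET.SubElement(rel, "tag", k="type", v="multipolygon")
    for key, value in sorted(tags.items()):
        ET.SubElement(rel, "tag", k=key, v=value)
    return rel


if __name__ == "__main__":
    # Two ways forming the outer ring, one inner ring (a hole).
    rel = multipolygon_relation(-1, [-10, -11], [-12], {"natural": "wood"})
    print(ET.tostring(rel, encoding="unicode"))
```

Under the "tag both" compromise discussed earlier in the thread, the same tags would additionally be written onto each outer way.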

HamishB commented 12 years ago

The docs pretty clearly state to tag the relation and not the ways

"suggest". My point is that this is actually a poor suggestion.

for reference: http://wiki.openstreetmap.org/wiki/Relation:multipolygon#Tagging http://wiki.openstreetmap.org/wiki/Relation:multipolygon/Algorithm

There's no multilinestrings in the current data, so that part is a bit academic.

all implementation variants we've been discussing (splits below 2k, multipolys, tagging..) are legal and understood by all the known editors and renderers, so all of it's completely academic... heck, I never let that stop me though. :-)

It's been flagged on the imports list as a problem.

From my reading of Paul's email he hadn't noticed that the relation was also tagged, and thought it was an unintentional oversight on our part.

shrug, I'm not losing sleep over this, just want to make sure we don't paint ourselves into a corner wrt later extracts and orphaned data by following a bad (albeit popular) suggestion. I spent way too much time manually fixing the Abel Tasman coastline and park boundary the first time, and I never want to have to do that sort of thing again if it can be avoided (for example by splitting ways more and over-tagging smaller objects). Intense focus on creating the strongest foundations at the outset, that's all. Since we can't practicably fix it later, better to discuss such concerns now.

best, Hamish

ps- a couple of months back I was reading through the old type-written spec for MOSS GIS from the late 70s and the Fort Hood GIS spec (later to become GRASS) from the early 80s. It struck me that one of their big goals for version "1.1" back then was to get rid of max polyline node counts very similar in magnitude to those OSM has today. Here's to hoping that version 0.7+ of the OSM API gets with the times and makes all of these decades-long-solved issues moot.

barnaclebarnes commented 12 years ago

all implementation variants we've been discussing (splits below 2k, multipolys, tagging..) are legal and understood by all the known editors and renderers, so all of it's completely academic... heck, I never let that stop me though. :-)

Not true.

If people get these validation warnings in JOSM they won't know what to do.

rcoup commented 12 years ago

From my reading of Paul's email he hadn't noticed that the relation was also tagged, and thought it was an unintentional oversight on our part.

Currently we're not tagging the multipolygon relation, only the component ways.

@barnaclebarnes: if we tag both the relation and the ways, does JOSM still complain about it?

rcoup commented 12 years ago

ugh. Nah. Inner rings are deliberately not tagged... so only outer/partial rings would be tagged. Seems like a bad idea.

HamishB commented 12 years ago

Glen:

Then JOSM Validation gives a warning about a non-closed way

that's a known (to JOSM devs) false-positive in the validator. see my email from last week sometime when I saw that and investigated it.

Rob:

Currently we're not tagging the multipolygon relation, only the component ways.

are you sure? see http://www.openstreetmap.org/browse/relation/903456 for example

ugh. Nah. Inner rings are deliberately not tagged... so only outer/partial rings would be tagged.

inner holes and "islands" (or lakes if you prefer) within polygons are/were not tagged (since they are filled with nothing), while outer relations are/were tagged (since filled with whatever the tag says).

See Glen's George Sound native bush example from email in JOSM for example.

Hamish

rcoup commented 12 years ago

are you sure? see http://www.openstreetmap.org/browse/relation/903456 for example

hmm. So reading https://github.com/opennewzealand/linz2osm/blob/master/linz2osm/convert/osm.py#L275 properly disagrees with me.

IgnoreRob

@HamishB can we file a bug with JOSM? Reference it here? And get it fixed?

In the meantime I guess we need to document this so people expect it.

So the question then becomes:

The previous compromise around this issue was to tag both the relation and the way, which leads to a bit of double handling for edits (not so good) but doesn't leave orphaned & untagged ways if the relation is damaged by a clumsy or unskilled 3rd party user (thus is more robust).

That vs the docs/wiki, which fairly clearly have said (across multiple edits over the last year, from contributors whose names I know) that "apply all tags which describe the area to the relation, and not to the ways"

Is orphaned & untagged ways that big a problem in 2012? Unconnected ways that aren't closed won't be drawn either...

HamishB commented 12 years ago

perhaps the thing to do is whatever works for the current generation (to be honest I'm not too fussed, as long as it works everywhere without being lossy), but with an eye towards what might be coming in the future?

see the random-collection-of-ideas at http://wiki.openstreetmap.org/wiki/API_v0.7#Areas

Hamish

HamishB commented 12 years ago

rcoup:

Is orphaned & untagged ways that big a problem in 2012?

They are as vulnerable to clumsy edits as they ever were, but the question is whether we should try to do anything about it, and if so, what...