JOSM / geojson

Allows reading GeoJSON using different projections – NOW PART OF JOSM CORE
Apache License 2.0

superposed nodes when importing Polygons/MultiPolygons and closed LineStrings, and area=yes/no OSM tags #12

Open verdy-p opened 7 years ago

verdy-p commented 7 years ago

GeoJSON does not detail nodes in the "coordinates" array embedded in geometries for Polygons (single area possibly with holes)/MultiPolygons(simple array of Polygons), or LineStrings (possibly closed)/MultiLineStrings(simple array of LineStrings).

The import tool currently creates a new OSM node for each coordinate pair, including both the first and the last [x,y] pair of every ring of a Polygon (or of any polygon member of a MultiPolygon) when they are the same. Repeating the first pair is not a requirement for rings of Polygons/MultiPolygons: they are necessarily closed because they represent surfaces, so if the last coordinate pair in a ring is not the same as the first one, the ring MUST still be closed as if the missing last pair were present.

Because coordinate pairs are completely untagged in GeoJSON geometries (except when they are in a feature whose geometry is only a single node), it makes no sense not to merge the duplicate nodes coming from the same geometry object (or the same "coordinates" property of any imported "Feature").

Please merge these nodes, and make sure that the rings of Polygons (first ring for the outer border, second and following rings for the optional inner borders) and MultiPolygons are effectively closed.

This closure is implicit in GeoJSON, where a triangular polygon like this one is still valid and closed:

{"type":"Polygon","coordinates":[
  [[x0,y0], [x1,y1], [x2,y2]]
]}

and equivalent to:

{"type":"Polygon","coordinates":[
  [[x0,y0], [x1,y1], [x2,y2], [x0,y0]]
]}

which should still be imported as 3 OSM nodes (not 4) and an OSM closed way connecting the 3 nodes.

Likewise, this tetragonal area with an inner triangular hole (sharing a common node between outer and inner rings):
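The requested behaviour for a single ring can be sketched as follows. This is only an illustration in Python, not the plugin's Java code; the function name `import_ring` and the integer node ids are hypothetical. It treats a ring with or without the repeated closing pair the same way and reuses one node per distinct coordinate pair.

```python
def import_ring(coords, node_ids):
    """Return the node references of a closed way built from one ring.

    coords   -- list of (x, y) pairs, with or without the repeated closing pair
    node_ids -- dict shared by the whole geometry: (x, y) -> node id (hypothetical ids)
    """
    ring = [tuple(p) for p in coords]
    # Drop an explicit closing pair so both spellings are handled alike.
    if len(ring) > 1 and ring[0] == ring[-1]:
        ring = ring[:-1]
    refs = []
    for pt in ring:
        # Reuse the node if these coordinates were already seen in this geometry.
        if pt not in node_ids:
            node_ids[pt] = len(node_ids) + 1
        refs.append(node_ids[pt])
    # Close the way by referencing the first node again (no new node is created).
    if refs:
        refs.append(refs[0])
    return refs


open_ring = [(0, 0), (4, 0), (2, 3)]
closed_ring = open_ring + [(0, 0)]
nodes_a, nodes_b = {}, {}
way_a = import_ring(open_ring, nodes_a)
way_b = import_ring(closed_ring, nodes_b)
print(len(nodes_a), way_a)  # 3 [1, 2, 3, 1]
print(len(nodes_b), way_b)  # 3 [1, 2, 3, 1]
```

Both spellings of the triangle give 3 nodes and one closed way whose last reference is the first node.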

[x3,y3]                       [x2,y2]
       +---------------------+
        \                   /
         \[x4,y4]  [x5,y5] /
          \   +---+       /
           \  |  /       /
            \ | /       /
             \|/       /
              +-------+
       [x0,y0]         [x1,y1]

represented in GeoJSON with this geometry:

{"type":"Polygon","coordinates":[
  [[x0,y0], [x1,y1], [x2,y2], [x3,y3]],
  [[x4,y4], [x5,y5], [x0,y0]]
]}

or equivalently as:

{"type":"Polygon","coordinates":[
  [[x0,y0], [x1,y1], [x2,y2], [x3,y3], [x0,y0]],
  [[x4,y4], [x5,y5], [x0,y0], [x4,y4]]
]}

which should be imported as 6 OSM nodes (not 9, because [x0,y0] occurs 3 times and [x4,y4] occurs 2 times), two OSM ways (one for each ring), and an OSM relation with "type=multipolygon" whose members will be the two rings, the first one with "outer" role, the second with "inner" role.
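A minimal sketch of this Polygon case, again in Python and with hypothetical names (`import_polygon`, plain dicts standing in for OSM objects). It assumes the rings arrive in GeoJSON order (first ring outer, the rest inner) and picks arbitrary numeric coordinates for x0..x5 matching the drawing.

```python
def import_polygon(rings):
    """Build nodes, closed ways and an optional multipolygon relation.

    rings -- GeoJSON Polygon "coordinates": first ring outer, others inner.
    Returns (node_ids, ways, relation); relation is None for a single ring.
    """
    node_ids = {}  # (x, y) -> node id, shared by all rings of the geometry
    ways = []
    for ring in rings:
        pts = [tuple(p) for p in ring]
        if len(pts) > 1 and pts[0] == pts[-1]:
            pts = pts[:-1]  # ignore the explicit closing pair
        refs = [node_ids.setdefault(p, len(node_ids) + 1) for p in pts]
        ways.append(refs + refs[:1])  # closed way: first node repeated
    relation = None
    if len(ways) > 1:
        members = [("outer", 0)] + [("inner", i) for i in range(1, len(ways))]
        relation = {"tags": {"type": "multipolygon"}, "members": members}
    return node_ids, ways, relation


# Assumed sample coordinates for [x0,y0] .. [x5,y5] of the drawing.
p0, p1, p2, p3, p4, p5 = (2, 0), (6, 0), (8, 6), (0, 6), (2, 4), (4, 4)
nodes, ways, relation = import_polygon([
    [p0, p1, p2, p3, p0],
    [p4, p5, p0, p4],
])
print(len(nodes))            # 6
print(ways)                  # [[1, 2, 3, 4, 1], [5, 6, 1, 5]]
print(relation["members"])   # [('outer', 0), ('inner', 1)]
```

The shared node [x0,y0] gets a single id (1) that both ways reference.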

In contrast, the following "similar" tetragonal and triangular areas:

[x3,y3]                       [x2,y2]
       +---------------------+
        \                   /
         \                 /
          \               /
           \             /
            \           /
             \         /
              +-------+
       [x0,y0]|\       [x1,y1]
              | \
              |  \
              +---+
          [x4,y4]  [x5,y5]

represented in GeoJSON as:

{"type":"MultiPolygon","coordinates":[
  [
    [[x0,y0], [x1,y1], [x2,y2], [x3,y3]]
  ],[
    [[x5,y5], [x0,y0], [x4,y4]]
  ]
]}

or equivalently as:

{"type":"MultiPolygon","coordinates":[
  [
    [[x0,y0], [x1,y1], [x2,y2], [x3,y3], [x0,y0]]
  ],[
    [[x5,y5], [x0,y0], [x4,y4], [x5,y5]]
  ]
]}

will be imported as two closed ways, but no OSM "multipolygon" relation, unless they are part of a "Feature" with properties (in which case both closed ways will be members with "outer" role), and there will also be 6 OSM nodes only (not 9): the generated nodes should also be merged as they are part of the same geometry object (and necessarily have the same tags in the generated OSM feature, but OSM nodes themselves for this geometry will have no tags at all).

Notes: in strict GeoJSON, the outer rings should be traced anticlockwise and the inner rings clockwise. Rings should not intersect except at isolated nodes with supplied coordinate pairs; outer rings should not overlap on any non-zero surface; inner rings should be fully contained in the outer ring (the first member of the polygon), and inner rings should not overlap each other on any non-zero surface. It is still valid (but not recommended in GeoJSON) for a MultiPolygon to have shared borders (shared borders are permitted if they are part of separate geometries in distinct Features with possibly separate GeoJSON properties, i.e. separate OSM tags). Such strict validation constraints do not need to be enforced when importing GeoJSON, but the JOSM validator may detect violations. These constraints should be checked when exporting/saving from JOSM to a new GeoJSON, because GeoJSON applications may depend on them for correct handling or rendering.
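The winding direction mentioned in these notes can be checked with the shoelace formula. The sketch below is illustrative Python with hypothetical helper names; it assumes planar [x,y] pairs with y pointing up, so a positive signed area means anticlockwise.

```python
def signed_area(ring):
    """Shoelace formula: positive for anticlockwise rings, negative for clockwise."""
    pts = list(ring)
    if len(pts) > 1 and pts[0] == pts[-1]:
        pts = pts[:-1]  # the closing pair adds nothing to the sum
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def orient(ring, anticlockwise=True):
    """Return the ring in the requested winding direction."""
    if (signed_area(ring) > 0) != anticlockwise:
        return list(ring)[::-1]  # reversing a closed ring keeps it closed
    return list(ring)


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(signed_area(square))                                # 1.0
print(signed_area(orient(square, anticlockwise=False)))   # -1.0
```

On export, outer rings would be passed through `orient(ring, True)` and inner rings through `orient(ring, False)`.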


Also, for Polygon/MultiPolygon, the imported OSM feature should have the tag "area=yes", as polygons in GeoJSON necessarily represent surfaces, not just their border. This is different from GeoJSON "LineString/MultiLineString", where the implicit OSM tag should be "area=no".

This would allow saving/exporting from JOSM to GeoJSON with the correct GeoJSON type: currently the imported Polygons/MultiPolygons will be saved incorrectly as "LineString" or "MultiLineString", because the export does not correctly figure out whether OSM closed ways or relations represent only borders or the surface. The surface interpretation should be assumed for OSM relations with "type=multipolygon" or "type=boundary", for OSM closed ways with "landuse=" or "natural=" or "building=" or "water=" (unless they have "area=no"), and for any closed way with "area=yes".

When exporting to GeoJSON, the OSM tag "area=yes/no" should be discarded from the properties of the exported feature; it will just be used to select whether a [Multi]LineString or a [Multi]Polygon is generated.
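The export rule proposed above can be written down as a small decision function. This is a hedged Python sketch, not JOSM code: `is_area`, `export_properties` and the tag dicts are hypothetical, and it reads "unless they have area=no" as applying to closed ways only.

```python
AREA_KEYS = ("landuse", "natural", "building", "water")


def is_area(tags, closed=False, is_relation=False):
    """Decide whether an OSM object should be exported as a surface."""
    if is_relation:
        return tags.get("type") in ("multipolygon", "boundary")
    if not closed:
        return False  # an unclosed way can only be a line
    if tags.get("area") == "yes":
        return True
    if tags.get("area") == "no":
        return False
    return any(key in tags for key in AREA_KEYS)


def export_properties(tags):
    """GeoJSON properties: the OSM tags without the area=yes/no selector."""
    return {k: v for k, v in tags.items() if k != "area"}


lake = {"natural": "water", "area": "yes"}
fence = {"barrier": "fence"}
print("Polygon" if is_area(lake, closed=True) else "LineString")   # Polygon
print("Polygon" if is_area(fence, closed=True) else "LineString")  # LineString
print(export_properties(lake))                                     # {'natural': 'water'}
```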

floscher commented 5 years ago

@verdy-p Thank you for raising this issue. It has been a while, but starting with https://github.com/JOSM/geojson/commit/16038bd599e2046d7e8cbc74450adb4a74751a3d , part of your points should be solved.

Polygons and Multipolygons will now always be imported as closed ways. LineStrings will only be imported as closed ways if first and last node have the same coordinate.

verdy-p commented 5 years ago

Ok, what now remains to be solved is the correct interpretation of imported rings: determining whether the import generates simple OSM closed ways (rings) or creates them as members of a single OSM multipolygon relation. This is not so simple, because you also need to determine their OSM role ("inner" or "outer") by checking which rings enclose others.

The import could fail if they intersect and the common surface is not 0% or 100% (such geometry is normally invalid also in GeoJSON).

But it should NEVER fail if the intersection between pairs of rings represents a surface of 0% or 100% of any ring, even if there are common/superposed nodes, or common/superposed line segments. If there are common segments, the import in OSM should try splitting them on identical nodes, and then remove duplicate segments, before trying to determine the new rings, and then their "inner" or "outer" roles.

The import would be safer if the GeoJSON were simply parsed as a collection of independent segments. A first loop would then detect whether they intersect anywhere other than at their two endpoints: if there is such an intersection, the segments would be split at the computed intersection point (non-collinear case), or at the existing endpoints of the other segment (collinear case).

Then a second pass will eliminate all pairs of duplicate segments (a segment is defined as an unordered set of two points, which can be normalized into an ordered pair, where the ordering is the point index in a separate map of points).
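The normalization described here (segments as ordered pairs of point indices) could look like the Python sketch below. The names `normalize_segments`, `point_index` and `counts` are hypothetical. The sketch only counts how often each normalized segment occurs; whether a duplicated segment is then kept once or dropped entirely is left to the caller, since the comment does not settle that.

```python
from collections import Counter


def normalize_segments(segments):
    """Map points to indices and count each undirected segment.

    segments -- iterable of ((x, y), (x, y)) pairs in any direction
    Returns (point_index, counts), where counts maps (i, j) with i < j
    to the number of times that segment was supplied.
    """
    point_index = {}  # (x, y) -> index in the point array
    counts = Counter()
    for a, b in segments:
        ia = point_index.setdefault(tuple(a), len(point_index))
        ib = point_index.setdefault(tuple(b), len(point_index))
        if ia == ib:
            continue  # zero-length segment, nothing to draw
        counts[(min(ia, ib), max(ia, ib))] += 1
    return point_index, counts


raw = [((0, 0), (1, 0)), ((1, 0), (0, 0)), ((1, 0), (1, 1))]
points, counts = normalize_segments(raw)
duplicated = sorted(seg for seg, n in counts.items() if n > 1)
print(duplicated)       # [(0, 1)]
print(sorted(counts))   # [(0, 1), (1, 2)]
```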

Basically:

To do that it will be useful to first build a new PointSegmentsMap from a point number (in PointArray) to a vector of segment numbers (in SegmentArray); this makes it possible to detect points that have "multiplicities" (the size of their vector) higher than 2, which are the only ones that require special checks.

Then create a new empty LinesArray (you don't know which polylines will be closed rings before they are completed) and a new RingsArray. Then, by just scanning points in PointArray from top to bottom, take and remove only one segment (e.g. the last one) from the associated PointSegmentsMap[n] vector and try to connect it to the first or last node of an existing line in LinesArray.

When you detect that a polyline is closed by the additional segment, remove the polyline from LinesArray and add it to the separate RingsArray; otherwise add a point to the existing polyline.

At this step you may want to simplify the geometry of polylines and rings by merging segments that are collinear and in the same direction. Finally, in the simplified array of rings, if this caused any ring to become completely "flat" (i.e. only two segments remaining in it), this ring can be eliminated from the set of rings.
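One way to picture the PointSegmentsMap and the split into lines and rings is the Python sketch below. It is a simplified variant, not the exact scan described above: it walks from each point along unused segments instead of attaching segments to the ends of existing lines, and it starts at points of odd multiplicity so that open polylines are traced end to end. The function name `assemble` is hypothetical, and points with multiplicity above 2 are only handled greedily, without the special checks the text asks for.

```python
from collections import defaultdict


def assemble(segments):
    """segments: list of (i, j) point-index pairs, each segment listed once.

    Returns (lines, rings) as lists of point-index paths; a ring repeats its
    first index at the end.
    """
    point_segments = defaultdict(list)  # point number -> segment numbers
    for n, (i, j) in enumerate(segments):
        point_segments[i].append(n)
        point_segments[j].append(n)
    used = [False] * len(segments)

    def trace(start):
        path = [start]
        while True:
            here = path[-1]
            n = next((s for s in point_segments[here] if not used[s]), None)
            if n is None:
                return path, False  # dead end: an open polyline
            used[n] = True
            i, j = segments[n]
            path.append(j if i == here else i)
            if path[-1] == start:
                return path, True   # back at the start: a ring

    lines, rings = [], []
    all_points = sorted(point_segments)
    odd = [p for p in all_points if len(point_segments[p]) % 2]
    for start in odd + all_points:
        while any(not used[s] for s in point_segments[start]):
            path, closed = trace(start)
            (rings if closed else lines).append(path)
    return lines, rings


# A triangle (points 0, 1, 2) plus an open two-segment line (points 3, 4, 5).
lines, rings = assemble([(0, 1), (1, 2), (2, 0), (3, 4), (4, 5)])
print(lines)  # [[3, 4, 5]]
print(rings)  # [[0, 1, 2, 0]]
```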
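The simplification step can be sketched with a cross product (collinearity) and a dot product (same direction). This is illustrative Python with a hypothetical function name; it assumes exact (e.g. integer) coordinates, so floating-point input would need a tolerance instead of `== 0`.

```python
def simplify_ring(ring):
    """Remove points joining two collinear segments that run in the same direction.

    ring -- list of (x, y) pairs without the repeated closing pair
    Returns the simplified ring, or None if it became flat.
    """
    pts = list(ring)
    changed = True
    while changed and len(pts) > 2:
        changed = False
        for k in range(len(pts)):
            a, b, c = pts[k - 1], pts[k], pts[(k + 1) % len(pts)]
            abx, aby = b[0] - a[0], b[1] - a[1]
            bcx, bcy = c[0] - b[0], c[1] - b[1]
            cross = abx * bcy - aby * bcx   # 0 when a, b, c are collinear
            dot = abx * bcx + aby * bcy     # > 0 when both segments point the same way
            if cross == 0 and dot > 0:
                del pts[k]                  # b is redundant: merge a-b and b-c
                changed = True
                break
    # A ring reduced to two segments encloses no surface.
    return pts if len(pts) > 2 else None


print(simplify_ring([(0, 0), (1, 0), (2, 0), (2, 2), (0, 2)]))
# [(0, 0), (2, 0), (2, 2), (0, 2)]
print(simplify_ring([(0, 0), (1, 0), (2, 0)]))  # None
```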

At this step you have two sets: LinesArray and RingsArray. This is almost finished: one will become a set of OSM polylines (which can be put in a separate OSM Multipolygon relation with area=no, or just a single unclosed way if there's only one), the other will become a single OSM Multipolygon relation with area=yes (or a single closed way, if there's only one).

But you need to determine the roles of each ring in the second OSM relation (area=yes). Rings first have to be oriented (e.g. anticlockwise) so they become comparable (note: all these rings are such that each pair now has a 0% or 100% surface intersection). You need to sort the rings so that if ring A is covered at 100% by ring B, then A is sorted after B; pairs of rings that have 0% intersection can be sorted in arbitrary order. For this you just need to create a map from ring numbers to the other rings which cover 100% of their surface (note: coverage does not take into account nodes or segments, only what is inside). You may just keep a counter of how many other rings cover the ring at 100%, because ONLY the even/odd parity of this number determines whether the role is "outer" (even counter) or "inner" (odd counter).

But you may also want to order these rings so that the outermost ones (with "inner" or "outer" role in OSM) always come before the others contained in them (with "inner" or "outer" roles): this ordering is equivalent to sorting the set of rings by increasing counter values (instead of just keeping the even/odd info). The counter values are not unique (so this is a partial sort), but the relative order of rings with identical counters does not matter, as one cannot contain 100% of the other.
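The parity rule for roles can be sketched in Python as follows. The helper names are hypothetical, and the sketch makes assumptions the text does not: it decides whether ring B covers ring A by testing a single vertex of A that is not also a vertex of B (standard ray casting), which is only safe when every pair of rings overlaps at 0% or 100% and no tested vertex lies on an edge of the other ring. Rings are sorted so that those with the fewest covering rings (the outermost ones) come first.

```python
def contains(ring, pt):
    """Ray casting: True if pt lies strictly inside ring (list of (x, y))."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


def assign_roles(rings):
    """Return [(ring number, role)] with the outermost rings first."""
    depth = []
    for a in rings:
        count = 0
        for b in rings:
            if a is b:
                continue
            shared = set(b)
            # Prefer a vertex of a that is not a node shared with b.
            sample = next((p for p in a if p not in shared), a[0])
            if contains(b, sample):
                count += 1  # b covers a
        depth.append(count)
    order = sorted(range(len(rings)), key=lambda k: depth[k])
    return [(k, "outer" if depth[k] % 2 == 0 else "inner") for k in order]


hole = [(2, 4), (4, 4), (2, 0)]            # shares node (2, 0) with the outer ring
outer = [(2, 0), (6, 0), (8, 6), (0, 6)]
print(assign_roles([hole, outer]))         # [(1, 'outer'), (0, 'inner')]
```

An island inside a hole gets a counter of 2 and is therefore "outer" again, matching the even/odd rule.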

This algorithm can then be used to import arbitrary GeoJSON geometry (polylines, multipolylines, polygons, multipolygons), including invalid ones; it will generate at most:

You may choose to merge the two multipolygons into a single one without specifying the area="yes/no" attribute, given that ways are distinguished (by their closed or unclosed geometry, or by the role "inner" or "outer" assigned to those that are closed). In that case there will remain only 1 OSM parent object containing everything, always with a valid geometry, and then you can map it as a single feature in OSM when it was a single Feature in the GeoJSON!

The same algorithm can be used to perform the reverse transform (from OSM to GeoJSON), but you need an additional step of ordering segments in rings in the "anticlockwise" or "clockwise" direction, depending on whether they are determined to be "outer" or "inner" respectively.

The same algorithm can also be used to transform an invalid GeoJSON geometry into a valid one, or an invalid OSM geometry into a valid one: the algorithm performs all the necessary corrections (using rules similar to those used in the SVG property "fill-rule: evenodd" for filling SVG paths)

The same algorithm can also be used in the OSM editor itself to simplify the geometry of any OSM relation containing ways with roles "outer" or "inner", or tagged as "area=yes", and reordering their members, then merging successive ways part of the reordered rings that have the same attributes.