mvexel / overpass-api-python-wrapper

Python bindings for the OpenStreetMap Overpass API
Apache License 2.0
359 stars 89 forks source link

Multipolygons are breaking geojson convertion #135

Open Monstrofil opened 3 years ago

Monstrofil commented 3 years ago

When trying to fetch multipolygons (e.g. 11038555), I get error:

Received corrupt data from Overpass (incomplete polygon).

It seems that happens because outer and inner members are not always closed: they may be just ways.

Are you planning to improve this?

Monstrofil commented 3 years ago

UPD: I did this patch for me, it works, but it definitely need to be slightly improved before merge:

Index: overpass/api.py
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- overpass/api.py (revision cd544c59c3546d442df0b5b16853ec555dd34def)
+++ overpass/api.py (date 1612186223887)
@@ -9,7 +9,10 @@
 import geojson
 import logging
 from datetime import datetime
-from shapely.geometry import Polygon, Point
+
+from geojson import MultiLineString
+from shapely.geometry import Polygon, Point, mapping, LineString
+from shapely import geometry as geom, ops
 from io import StringIO
 from .errors import (
     OverpassSyntaxError,
@@ -222,40 +225,43 @@
             elif elem_type == "relation":
                 # Initialize polygon list
                 polygons = []
+                outer_ways = []
+                inner_ways = []
                 # First obtain the outer polygons
                 for member in elem.get("members", []):
-                    if member["role"] == "outer":
-                        points = [(coords["lon"], coords["lat"]) for coords in member.get("geometry", [])]
-                        # Check that the outer polygon is complete
-                        if points and points[-1] == points[0]:
-                            polygons.append([points])
-                        else:
-                            raise UnknownOverpassError("Received corrupt data from Overpass (incomplete polygon).")
-                # Then get the inner polygons
-                for member in elem.get("members", []):
-                    if member["role"] == "inner":
-                        points = [(coords["lon"], coords["lat"]) for coords in member.get("geometry", [])]
-                        # Check that the inner polygon is complete
-                        if points and points[-1] == points[0]:
-                            # We need to check to which outer polygon the inner polygon belongs
-                            point = Point(points[0])
-                            check = False
-                            for poly in polygons:
-                                polygon = Polygon(poly[0])
-                                if polygon.contains(point):
-                                    poly.append(points)
-                                    check = True
-                                    break
-                            if not check:
-                                raise UnknownOverpassError("Received corrupt data from Overpass (inner polygon cannot "
-                                                           "be matched to outer polygon).")
-                        else:
-                            raise UnknownOverpassError("Received corrupt data from Overpass (incomplete polygon).")
-                # Finally create MultiPolygon geometry
-                if polygons:
-                    geometry = geojson.MultiPolygon(polygons)
-            else:
-                raise UnknownOverpassError("Received corrupt data from Overpass (invalid element).")
+                    if member["role"] not in ["outer", "inner"]:
+                        continue
+                    points = [(coords["lon"], coords["lat"]) for coords in member.get("geometry", [])]
+                    if member["role"] == "outer":
+                        outer_ways.append(geom.LineString(points))
+                    elif member["role"] == "inner":
+                        inner_ways.append(geom.LineString(points))
+                outer_merged_line = ops.linemerge(outer_ways)
+                if isinstance(outer_merged_line, LineString):
+                    iterator = [outer_merged_line]
+                else:
+                    iterator = outer_merged_line.geoms
+                for line in iterator:
+                    polygons.append(Polygon(line.coords))
+
+                inner_merged_line = ops.linemerge(inner_ways)
+                if isinstance(inner_merged_line, LineString):
+                    iterator = [inner_merged_line]
+                else:
+                    iterator = inner_merged_line.geoms
+                for line in iterator:
+                    inner_poly = Polygon(line.coords)
+
+                    tmp_polygons = []
+                    for poly in polygons[:]:
+                        if poly.contains(inner_poly):
+                            poly -= inner_poly
+                        tmp_polygons.append(poly)
+                    polygons = tmp_polygons
+
+                all_polygons = [mapping(poly)['coordinates'] for poly in polygons]
+                if all_polygons:
+                    geometry = geojson.MultiPolygon(all_polygons)

             if geometry:
                 feature = geojson.Feature(
mvexel commented 3 years ago

Thanks for spotting and fixing!

dericke commented 3 years ago

I'm wondering if it might be best to import another lib for converting OSM format to geojson and let this lib focus on the API stuff, rather than maintain a homegrown converter. Something like osm2geojson, perhaps?

dericke commented 3 years ago

I had a quick look at replacing the built-in geojson method with osm2geojson, found that osm2geojson uses 7 decimal points of precision for coordinates, while this library has been using 6 (default from the geojson library) and osm2geojson nests the id property of a feature under properties, whereas the built-in method puts it in the top level of the feature.

mvexel commented 2 years ago

I think the best way to go about supporting GeoJSON properly is not to roll our own but instead support __geo_interface__. Since geojson implements this interface too, there's no need to introduce any other dependencies.

mvexel commented 4 months ago

This is fixed on the main branch now with #140 merged in. Looking to get 0.7.1 out soon with this fix.