geometalab / osmaxx

OpenStreetMap Arbitrary Excerpt Export - Get the OpenStreetMap data you want in the file format you need
http://osmaxx.hsr.ch/
MIT License

More data included in excerpt than selected #904

Closed rwijtvliet closed 4 years ago

rwijtvliet commented 4 years ago

When ordering an excerpt and downloading it, I noticed there are elements included that are outside of the selected boundaries.

I've tested with the 'esri shapefile' export and both detail settings; the issue is present in all layers (pow, road, railways, water, poi) that I tested.

As a quick demo, I plotted the area around Hamburg, from which I had downloaded the excerpt:

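import geopandas as gpd
import matplotlib.pyplot as plt
# Load the road layer from the downloaded excerpt.
gdf = gpd.read_file('hh_30kmaround_wgs-84_2020-03-31_shapefile_simplified.shp/road_l.shp')
fig, ax = plt.subplots()
# Zoom far out (longitude 0 to 20, latitude 45 to 60) so features outside the selected area become visible.
ax.set_xlim(0, 20)
ax.set_ylim(45, 60)
gdf.plot(ax=ax, linewidth=1, color='b')
plt.show()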

If you squint you may see the outline of the North Sea coast in the top-left quadrant.

image

There are definitely many more shapes in the selected geographic area, but I had expected none outside this area. Is this a bug?

das-g commented 4 years ago

I believe you might be seeing the effect of something OSMaxx does on purpose: Even if a feature only just intersects the chosen area, OSMaxx includes it completely.

This happens here for normal OSM data (see osmconvert documentation) and here for administrative boundaries.

OSMaxx does this so that geodata analysis can be done on the exported data without having to account for areas or lines that represent only part of the original feature. If you need data that is strictly limited to the chosen area, post-process it accordingly after downloading.
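One way to do that post-processing is to clip each exported layer against the outline of the excerpt. The following is a minimal sketch, assuming geopandas and shapely are installed. The bounding box and the small synthetic layer are made up for illustration, and `clip_to_area` is a hypothetical helper name, not something OSMaxx provides:

```python
import geopandas as gpd
from shapely.geometry import LineString, box


def clip_to_area(gdf, area):
    """Return only the parts of each geometry that lie inside `area`."""
    return gpd.clip(gdf, area)


# Hypothetical bounding box (min lon, min lat, max lon, max lat);
# substitute the outline of the excerpt that was actually ordered.
area = box(9.5, 53.3, 10.5, 53.8)

# Tiny synthetic layer standing in for an exported shapefile layer.
demo = gpd.GeoDataFrame(
    {"name": ["inside", "crossing", "outside"]},
    geometry=[
        LineString([(9.9, 53.5), (10.0, 53.6)]),   # fully inside the area
        LineString([(10.0, 53.5), (12.0, 53.5)]),  # starts inside, ends outside
        LineString([(3.0, 55.0), (4.0, 55.5)]),    # fully outside the area
    ],
    crs="EPSG:4326",
)

clipped = clip_to_area(demo, area)
print(clipped[["name", "geometry"]])
```

For a real export, the layer would be loaded with `gpd.read_file(...)` as in the plotting snippet above and passed to `clip_to_area` in place of `demo`. Features that cross the outline are cut at the boundary, so they no longer represent the complete original feature.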

rwijtvliet commented 4 years ago

To be sure: it's not only lines and polygons (which might happen to cross through the geographic area) but also points that are included. (I'm unfamiliar with the commands you link to, but a quick glance at the documentation seems to hint only at lines and polygons.)

If the 'features' you mention can mean something like "all the churches belonging to a certain regional organisation", then I can understand how points outside the boundaries still get included.

Let me know if you want me to prepare/upload some sample data

das-g commented 4 years ago

To be sure: it's not only lines and polygons (which might happen to cross through the geographic area) but also points that are included.

That is to be expected, as in the OpenStreetMap data model, lines (represented as "ways" or "relations" consisting of ways) and areas (represented as closed ways with specific tags or as specific relations) consist of "nodes". So to completely represent them, all their nodes have to be included. If some of these nodes happen to have OSM tags of their own, they will be considered point-typed features of their own by OSMaxx, even if those happen to be outside the chosen area.
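The effect described above can be reproduced with a toy model. This is only a sketch of the idea, not OSMaxx's or osmconvert's actual implementation: the data is invented, all function names are hypothetical, and "a way intersects the area" is approximated as "at least one of its nodes lies inside the bounding box".

```python
def in_bbox(node, bbox):
    """True if the node lies inside (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= node["lon"] <= max_lon and min_lat <= node["lat"] <= max_lat


def extract_with_complete_ways(nodes, ways, bbox):
    """Keep nodes inside the bbox plus ALL nodes of any way that touches it."""
    inside = {nid for nid, node in nodes.items() if in_bbox(node, bbox)}
    kept_nodes = set(inside)
    kept_ways = set()
    for way_id, refs in ways.items():
        if any(ref in inside for ref in refs):
            kept_ways.add(way_id)
            kept_nodes.update(refs)  # includes the way's nodes outside the bbox
    return kept_nodes, kept_ways


def tagged_points_outside(nodes, kept_nodes, bbox):
    """Kept nodes that carry tags of their own but lie outside the bbox."""
    return sorted(
        nid for nid in kept_nodes
        if nodes[nid]["tags"] and not in_bbox(nodes[nid], bbox)
    )


# Invented data: way 100 starts inside the bbox and ends at a tagged node outside it.
nodes = {
    1: {"lon": 10.0, "lat": 53.5, "tags": {}},
    2: {"lon": 10.4, "lat": 53.6, "tags": {}},
    3: {"lon": 12.0, "lat": 54.0, "tags": {"barrier": "gate"}},
    4: {"lon": 3.0, "lat": 55.0, "tags": {"amenity": "cafe"}},
    5: {"lon": 3.1, "lat": 55.1, "tags": {}},
}
ways = {100: [1, 2, 3], 200: [4, 5]}
bbox = (9.5, 53.3, 10.5, 53.8)

kept_nodes, kept_ways = extract_with_complete_ways(nodes, ways, bbox)
outside_points = tagged_points_outside(nodes, kept_nodes, bbox)
print(kept_ways, outside_points)
```

In this model, node 3 ends up as a point feature outside the chosen area because way 100 needs it, while node 4 is dropped because nothing that touches the area refers to it.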

If the 'features' you mention can mean something like "all the churches belonging to a certain regional organisation", then I can understand how points outside the boundaries still get included.

That would be strange, unless all these churches are in some huge multipolygon in OpenStreetMap or maybe in some other OSM relation. (I'm not quite sure how we handle relations in general.) What you can do is examine one of these points to get its OSM ID, look up the corresponding node in OpenStreetMap, and check on the OpenStreetMap website whether it's part of anything that overlaps with your chosen area. E.g., if the node's OSM ID was 1, the URL would be https://osm.org/node/1.
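For checking more than a handful of points, the lookup URLs can be generated from the OSM IDs. The sketch below only builds the URLs and makes no network requests. `osm_lookup_urls` is a hypothetical helper name; the `ways` and `relations` addresses follow the OpenStreetMap API v0.6 pattern for listing the ways and relations an element belongs to.

```python
def osm_lookup_urls(osm_id, osm_type="node"):
    """Build URLs for inspecting an OSM element and what it is part of."""
    if osm_type not in ("node", "way", "relation"):
        raise ValueError(f"unknown OSM element type: {osm_type!r}")
    api = "https://api.openstreetmap.org/api/0.6"
    urls = {
        # Human-readable page, as in the example URL above.
        "web": f"https://osm.org/{osm_type}/{osm_id}",
        # Relations that have this element as a member.
        "relations": f"{api}/{osm_type}/{osm_id}/relations",
    }
    if osm_type == "node":
        # Ways that use this node (only defined for nodes).
        urls["ways"] = f"{api}/node/{osm_id}/ways"
    return urls


for kind, url in osm_lookup_urls(1).items():
    print(kind, url)
```

If a point's ways or relations include something that reaches into the chosen area, that would explain why the point was exported.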

Let me know if you want me to prepare/upload some sample data

If that is data downloaded from OSMaxx, uploading it isn't needed: You can simply share the download URL that OSMaxx sent you.

I'll leave this issue closed for now. If there actually is a problem, we can re-open it.