osmcode / osmium-tool

Command line tool for working with OpenStreetMap data based on the Osmium library.
https://osmcode.org/osmium-tool/
GNU General Public License v3.0

Merged .poly files with overlapping regions leave empty island #266

Closed tordans closed 1 year ago

tordans commented 1 year ago

We have a folder of .poly files that we merge with cat and then pass to osmium: osmium extract --overwrite --polygon=${MERGED_POLY_FILE} --output=${OSM_REGIONS} ${OSM_GERMANY}

Some of those shapes overlap.

We expected those overlaps to be part of our processed data. However, it looks like the overlap leaves empty islands of data.

[Screenshots: Shape1, Shape2, Result]

I am wondering…

Otherwise, it is no problem to change our pre-processing to first merge the areas and pass a merged file to osmium extract.

joto commented 1 year ago

The docs explicitly mention this case: "If there are several (multi)polygons in a poly file or OSM file, they will be merged. The (multi)polygons must not overlap, otherwise the result is undefined." So you have to merge those polygons some way first.
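A pre-flight check along these lines could flag overlapping inputs before the files are concatenated. This is only a minimal sketch in plain Python and not part of osmium-tool: the function names (parse_poly, rings_overlap, find_overlaps) and the sample squares are made up for illustration, hole rings are ignored, and polygons that merely touch or coincide exactly are not handled reliably.

```python
from itertools import combinations


def parse_poly(text):
    """Return the outer rings of a .poly file as lists of (lon, lat) tuples.

    Hole rings (section names starting with '!') are skipped in this sketch.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    rings, current, is_hole = [], None, False
    for ln in lines[1:]:              # lines[0] is the file's name line
        if current is None:
            if ln == "END":           # the final END closes the file
                break
            current, is_hole = [], ln.startswith("!")
        elif ln == "END":             # END closes the current ring
            if not is_hole:
                rings.append(current)
            current = None
        else:
            lon, lat = map(float, ln.split()[:2])
            current.append((lon, lat))
    return rings


def _orient(a, b, c):
    """Signed area test: > 0 if c is left of a->b, < 0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p1, p2, q1, q2):
    """True if the two segments properly cross (touching does not count)."""
    d1, d2 = _orient(q1, q2, p1), _orient(q1, q2, p2)
    d3, d4 = _orient(p1, p2, q1), _orient(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def point_in_ring(pt, ring):
    """Ray-casting point-in-polygon test."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside


def rings_overlap(a, b):
    """True if any edges cross or one ring contains the other's first vertex."""
    edges_a = list(zip(a, a[1:] + a[:1]))
    edges_b = list(zip(b, b[1:] + b[:1]))
    if any(segments_cross(p1, p2, q1, q2)
           for p1, p2 in edges_a for q1, q2 in edges_b):
        return True
    return point_in_ring(a[0], b) or point_in_ring(b[0], a)


def find_overlaps(named_polys):
    """named_polys maps a label to .poly text; returns overlapping label pairs."""
    rings = {name: parse_poly(text) for name, text in named_polys.items()}
    return [
        (n1, n2)
        for n1, n2 in combinations(rings, 2)
        if any(rings_overlap(r1, r2) for r1 in rings[n1] for r2 in rings[n2])
    ]


def square_poly(name, lo, hi):
    """Build .poly text for an axis-aligned square (illustrative test data)."""
    pts = [(lo, lo), (hi, lo), (hi, hi), (lo, hi), (lo, lo)]
    body = "\n".join(f"   {x:.1f}   {y:.1f}" for x, y in pts)
    return f"{name}\n1\n{body}\nEND\nEND\n"


polys = {
    "a": square_poly("a", 0, 2),
    "b": square_poly("b", 1, 3),   # overlaps "a"
    "c": square_poly("c", 5, 6),   # disjoint from both
}
print(find_overlaps(polys))  # [('a', 'b')]
```

If the check reports any pair, the polygons would need to be merged into a single non-overlapping shape with a geometry tool before being passed to osmium extract.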

tordans commented 1 year ago

Thanks @joto

So you have to merge those polygons some way first.

I worked around this for now with a custom Node script that merges our GeoJSONs first (https://github.com/FixMyBerlin/atlas-geo/commit/c8851beb4de1f8f125125f0023cd2471d2b426e6#diff-315f020a80b5e8cab732efede00d88a605d192eda912886c4ac4c9700f22f471). We will refactor this along the way, but maybe it's useful for someone reading this…

The docs explicitly mention this case: "If there are several (multi)polygons in a poly file or OSM file, they will be merged. The (multi)polygons must not overlap, otherwise the result is undefined."

I wanted to create a PR to change the wording to state more explicitly what "undefined" means in this case. When I was first debugging this, "undefined" did not signal "will not be processed" or "will leave this space empty". Can you point me to where those docs are hosted? I did not find them.

joto commented 1 year ago

The docs are in the repo: https://github.com/osmcode/osmium-tool/blob/master/man/osmium-extract.md

"Undefined" is maybe not the best word here. For a C++ developer, "undefined" has a very well-defined meaning: it basically means "we are not promising anything if you do this". But not everybody is a C++ developer. :-)
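To make "not promising anything" concrete, here is one way an empty island in an overlap can arise. The sketch assumes a point-in-polygon test that counts boundary crossings over all rings together (the even-odd rule). That assumption is purely illustrative: the thread does not say how osmium implements the test, and the function names and the two squares below are invented for the example.

```python
def crossings(pt, ring):
    """Count the edges of `ring` crossed by a ray going right from `pt`."""
    x, y = pt
    count = 0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            count += 1
    return count


def inside_even_odd(pt, rings):
    """Even-odd rule over all rings at once, as if they formed one shape."""
    return sum(crossings(pt, ring) for ring in rings) % 2 == 1


def inside_union(pt, rings):
    """What the reporter expected: inside if any single ring contains pt."""
    return any(crossings(pt, ring) % 2 == 1 for ring in rings)


# Two overlapping squares (assumed data); the overlap is the square from 1 to 2.
shape1 = [(0, 0), (2, 0), (2, 2), (0, 2)]
shape2 = [(1, 1), (3, 1), (3, 3), (1, 3)]
rings = [shape1, shape2]

for pt in [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5)]:
    print(pt, inside_even_odd(pt, rings), inside_union(pt, rings))
# (0.5, 0.5) True True
# (1.5, 1.5) False True
# (2.5, 2.5) True True
```

Under this assumption a point in the overlap crosses two boundaries, so the combined count is even and the point is treated as outside, even though each shape on its own contains it. Merging the shapes into one polygon beforehand removes the inner boundaries and with them the ambiguity.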