osmcode / osmium-tool

Command line tool for working with OpenStreetMap data based on the Osmium library.
https://osmcode.org/osmium-tool/
GNU General Public License v3.0

Missing relations when filtering all administrative boundaries #248

Closed arredond closed 2 years ago

arredond commented 2 years ago

What version of osmium-tool are you using?

osmium version 1.14.0
libosmium version 2.18.0
Supported PBF compression types: none zlib lz4

What operating system version are you using?

macOS Monterey version 12.3 MacBook Air M1 2020

Tell us something about your system

16 GB RAM, 8 CPUs, bare metal

What did you do exactly?

I'm trying to get all administrative boundaries of all countries in the world while avoiding having to set up and load into a PG database. To do so, I've downloaded the latest Daylight Planet OSM release (v1.14) and have tried to filter:

osmium tags-filter planet_daylight_v114.osm.pbf wra/boundary=administrative -o planet_boundary_administrative_v114.osm.pbf

When checking the actual data, I see that some relations like Ceuta (1154756) are missing:

osmium getid planet_boundary_administrative_v114.osm.pbf r1154756 -o ceuta.osm.pbf

The resulting PBF is empty (73B) whereas, when extracting the same relation directly from the full Planet.osm release, it's there (908B).

What did you expect to happen?

I expected the relation to be in the filtered PBF file since it has the boundary=administrative tag and is present in the original PBF.

What did happen instead?

Relation r1154756 is not found.

What did you do to try analyzing the problem?

Described above

joto commented 2 years ago

The command line you show is missing a -o before the output file. The command line you are showing should not even generate the planet_boundary_administrative_v114.osm.pbf file, because it would interpret that as a tag pattern.

Also: Please try with a smaller file (for instance the ceuta.osm.pbf you generated) and see whether it still doesn't work. That would give us a much easier case to reproduce.

arredond commented 2 years ago

Sorry, that was just a typo when copying. I think I can reproduce with a Geofabrik export of Spain, which should be more manageable.

arredond commented 2 years ago

This is interesting. I've tested with both the full Spain Geofabrik PBF and just the Ceuta region and I can't see the issue in either of them.

wget https://download.geofabrik.de/europe/spain-latest.osm.pbf
osmium getid spain-latest.osm.pbf r1154756 -o spain_ceuta_no_filter.osm.pbf
osmium tags-filter spain-latest.osm.pbf wra/boundary=administrative -o spain_boundaries.osm.pbf
osmium getid spain_boundaries.osm.pbf r1154756 -o spain_ceuta_filtered.osm.pbf
diff spain_ceuta_no_filter.osm.pbf spain_ceuta_filtered.osm.pbf # No difference
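
A plain diff compares the encoded bytes of the two PBF files rather than the OSM objects inside them. As a hedged alternative, the sketch below wraps osmium diff, which compares the objects in two OSM files. It assumes the documented exit codes (0 when the files are the same, 1 when they differ, anything else an error); the function name compare_osm is made up for this illustration.

```shell
# compare_osm FILE1 FILE2
# Prints "same", "different", or "error" depending on the exit code of
# `osmium diff`. compare_osm is a hypothetical helper, not part of osmium.
compare_osm() {
    osmium diff --quiet "$1" "$2" >/dev/null 2>&1
    case $? in
        0) echo "same" ;;
        1) echo "different" ;;
        *) echo "error" ;;
    esac
}

# Usage with the files from the commands above:
#   compare_osm spain_ceuta_no_filter.osm.pbf spain_ceuta_filtered.osm.pbf
```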

It would seem that there's some reference problem in the Daylight Planet release, but if I check the references, I can't see any issue:

osmium check-refs planet_boundary_administrative_v114.osm.pbf

There are 10420517 nodes, 200578 ways, and 7042 relations in this file.
Nodes in ways missing: 0

ImreSamu commented 2 years ago

@arredond:

osmium check-refs planet_boundary_administrative_v114.osm.pbf

Add the extra option --check-relations:

-r, --check-relations
     Also check referential integrity of relations. Without this option, only nodes in ways are checked.

https://docs.osmcode.org/osmium/latest/osmium-check-refs.html

arredond commented 2 years ago

@ImreSamu thanks, I wasn't aware of that extra option. That did the trick:

osmium check-refs -r planet_boundary_administrative_v114.osm.pbf

There are 10420517 nodes, 200578 ways, and 7042 relations in this file.
Nodes     in ways      missing: 0
Nodes     in relations missing: 0
Ways      in relations missing: 0
Relations in relations missing: 288

I suppose the question is: why am I able to extract the relation from the original Daylight PBF?

joto commented 2 years ago

The relations check shouldn't have anything to do with your original problem. For osmium tags-filter it shouldn't matter if the relation is complete or not, only if it is there at all. Are you sure the Ceuta relation is actually in the original file?
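
One way to answer that question directly is to ask osmium getid for the object as OPL text on stdout and look for a line starting with its ID. This is a minimal sketch, assuming that OPL output lines begin with the type letter and ID (for example r1154756) followed by a space; has_object is a hypothetical helper name, and the filename in the usage comment is the one from this thread.

```shell
# has_object FILE ID
# Succeeds (exit status 0) if the object ID (e.g. r1154756) is present in FILE.
# Assumption: each OPL line starts with "<type><id> ", e.g. "r1154756 v1 ...".
# has_object is a hypothetical helper, not part of osmium.
has_object() {
    osmium getid "$1" "$2" -f opl -o - 2>/dev/null | grep -q "^$2 "
}

# Usage:
#   if has_object planet_daylight_v114.osm.pbf r1154756; then
#       echo "relation present in input"
#   else
#       echo "relation missing from input"
#   fi
```

Running this against the unfiltered input first separates "the object was never in the file" from "the filter dropped it".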

arredond commented 2 years ago

So I double-checked just in case and... you're absolutely right, the Ceuta relation isn't in the original file. It turns out Daylight focuses on street-level data but includes patches to add admin boundaries, and I first used a patched version (unknowingly) but was then trying on a "pure" version.

Sorry for the noise, closing this issue.