Project-OSRM / osrm-backend

Open Source Routing Machine - C++ backend
http://map.project-osrm.org
BSD 2-Clause "Simplified" License

osrm-extract stalls with .pbf but works with .osm files #5277

Open glacierians opened 5 years ago

glacierians commented 5 years ago

Our Current Setup:

We are seeing strange behavior with .pbf files. osrm-extract appears to stall at the last step shown here:

[info] Parsed 0 location-dependent features with 0 GeoJSON polygons
[info] Using script ../osrm-backend/profiles/car.lua
[info] Input file: hawaii-latest.osm.pbf
[info] Profile: car.lua
[info] Threads: 8
[info] Parsing in progress..
[info] input file generated by osmium/1.8.0
[info] timestamp: 2018-11-19T21:14:02Z
[info] Using profile api version 4
[info] Found 3 turn restriction tags:
[info]   motorcar
[info]   motor_vehicle
[info]   vehicle
[info] Parse relations ... 

Unfortunately, our customer constrains us to RHEL 6.6, which is a bit outdated. We followed the RHEL instructions on the Building-OSRM wiki page, but osrm-extract still stalls at that same step. We also tried larger instance sizes and letting it run overnight, and it never finished.

Our .stxxl configuration: disk=/tmp/stxxl,250G,syscall
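For readers unfamiliar with the .stxxl line above, here is a minimal sketch that splits it into its fields. It assumes the usual STXXL layout of `disk=<path>,<capacity>,<access method>`; the `StxxlDisk` type and `parse_stxxl_disk` helper are hypothetical names used only for illustration, not part of OSRM or STXXL.

```python
from typing import NamedTuple


class StxxlDisk(NamedTuple):
    """One 'disk=' entry of a .stxxl file (hypothetical helper type)."""
    path: str           # where STXXL places its scratch file
    capacity: str       # size of the scratch file, e.g. "250G"
    access_method: str  # I/O backend, e.g. "syscall"


def parse_stxxl_disk(line: str) -> StxxlDisk:
    # Assumed layout: disk=<path>,<capacity>,<access method>
    key, _, value = line.strip().partition("=")
    if key != "disk":
        raise ValueError(f"not a disk entry: {line!r}")
    parts = value.split(",")
    if len(parts) < 3:
        raise ValueError(f"expected path,capacity,access method: {line!r}")
    return StxxlDisk(parts[0], parts[1], parts[2])


disk = parse_stxxl_disk("disk=/tmp/stxxl,250G,syscall")
print(disk)
```

Read this way, our configuration asks for a 250G scratch file at /tmp/stxxl accessed through plain system calls.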

We found a workaround by converting the .pbf into an .osm file through osmconvert, but we would like to understand why osrm-extract is not completing with .pbf files in case it is indicative of a larger issue with our setup.
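The workaround can be sketched as two commands, built here in Python as a dry run. This assumes osmconvert's `-o=<file>` output option and osrm-extract's `-p` (profile) and `--threads` options; the helper names are hypothetical, and the output file name is derived by dropping the `.pbf` suffix, which differs from the `hawaii.osm` name we actually used.

```python
from pathlib import Path
from typing import List, Optional


def build_convert_cmd(pbf_path: str, osm_path: str) -> List[str]:
    # osmconvert writes to the file given with -o=; the output format
    # follows the file extension (.osm here).
    return ["osmconvert", pbf_path, f"-o={osm_path}"]


def build_extract_cmd(osm_path: str, profile: str,
                      threads: Optional[int] = None) -> List[str]:
    cmd = ["osrm-extract", "-p", profile]
    if threads is not None:
        cmd += ["--threads", str(threads)]
    cmd.append(osm_path)
    return cmd


def plan_workaround(pbf_path: str, profile: str,
                    threads: Optional[int] = None) -> List[List[str]]:
    # "hawaii-latest.osm.pbf" -> "hawaii-latest.osm"
    osm_path = str(Path(pbf_path).with_suffix(""))
    return [build_convert_cmd(pbf_path, osm_path),
            build_extract_cmd(osm_path, profile, threads)]


# Dry run: print the commands instead of executing them.
# To execute, pass each list to subprocess.run(cmd, check=True).
for cmd in plan_workaround("hawaii-latest.osm.pbf",
                           "../osrm-backend/profiles/car.lua", threads=8):
    print(" ".join(cmd))
```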

Working Output:

[info] Parsed 0 location-dependent features with 0 GeoJSON polygons
[info] Using script ../osrm-backend/profiles/car.lua
[info] Input file: hawaii.osm
[info] Profile: car.lua
[info] Threads: 8
[info] Parsing in progress..
[info] input file generated by osmconvert 0.8.10
[info] timestamp: n/a
[info] Using profile api version 4
[info] Found 3 turn restriction tags:
[info]   motorcar
[info]   motor_vehicle
[info]   vehicle
[info] Parse relations ...
[info] Parse ways and nodes ...
[info] Using profile api version 4
[info] Using profile api version 4
[info] Using profile api version 4
[info] Using profile api version 4
[info] Using profile api version 4
[info] Using profile api version 4
[info] Using profile api version 4
[info] Parsing finished after 7.29016 seconds
[info] Raw input contains 1468874 nodes, 129149 ways, and 52 relations, 484 restrictions
[info] Sorting used nodes        ... ok, after 0.006128s
[info] Erasing duplicate nodes   ... ok, after 0.000931s
[info] Sorting all nodes         ... ok, after 0.000805s
[info] Building node id map      ... ok, after 0.00279s
[info] Confirming/Writing used nodes     ... ok, after 0.025773s
[info] Writing barrier nodes     ... ok, after 0s
[info] Writing traffic light nodes     ... ok, after 0s
[info] Processed 362011 nodes
[info] Sorting edges by start    ... ok, after 0.011632s
[info] Setting start coords      ... ok, after 0.024442s
[info] Sorting edges by target   ... ok, after 0.009378s
[info] Computing edge weights    ... ok, after 0.068843s
[info] Sorting edges by renumbered start ... ok, after 0.010865s
[info] Writing used edges       ... ok, after 0.014048s -- Processed 378674 edges
[info] Writing way meta-data     ... ok, after 0.000246s -- Metadata contains << 43459 entries.
[info] Sorting used ways         ... ok, after 0.000135s
[info] Collecting start/end information on 0 maneuver overrides...ok, after 0.000599s
[info] Collecting start/end information on 0 maneuver overrides...ok, after 0s
[info] Collecting start/end information on 484 restrictions...ok, after 0.001583s
[info] Collecting start/end information on 484 restrictions...ok, after 0.000351s
[info] writing street name index ... ok, after 0.000844s
[info] extraction finished after 7.47829s
[info] Generating edge-expanded graph representation

<truncated>

We might have missed a step, so any help would be appreciated.

Thank you
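To pin down where the two runs diverge, the logs above can be compared mechanically. This is a minimal sketch that assumes every log line of interest starts with `[info]`; the function names are hypothetical. It takes the last message of the stalled log and reports the message that follows it in the working log.

```python
from typing import List, Optional


def info_messages(log: str) -> List[str]:
    """Return the text of every '[info]' line, whitespace-trimmed."""
    messages = []
    for line in log.splitlines():
        line = line.strip()
        if line.startswith("[info]"):
            messages.append(line[len("[info]"):].strip())
    return messages


def next_stage_after_stall(stalled_log: str, working_log: str) -> Optional[str]:
    """Find the stage the stalled run never reached.

    Returns None if the stalled log is empty, if its last message does
    not appear in the working log, or if nothing follows it there.
    """
    stalled = info_messages(stalled_log)
    if not stalled:
        return None
    working = info_messages(working_log)
    if stalled[-1] not in working:
        return None
    idx = working.index(stalled[-1])
    return working[idx + 1] if idx + 1 < len(working) else None


stalled = """[info] Found 3 turn restriction tags:
[info]   vehicle
[info] Parse relations ... """
working = """[info] Found 3 turn restriction tags:
[info]   vehicle
[info] Parse relations ...
[info] Parse ways and nodes ...
[info] Parsing finished after 7.29016 seconds"""

print(next_stage_after_stall(stalled, working))  # Parse ways and nodes ...
```

On the excerpts above, the stalled run never logs "Parse ways and nodes ...", so the hang falls between the start of relation parsing and the start of way and node parsing.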

daniel-j-h commented 5 years ago

It looks like it's getting stuck at parsing relations. What you can do is

There may be an invalid relation in the .pbf file, and osmconvert may be discarding it.

danpat commented 5 years ago

@glacierians I can't reproduce the problem locally. Can you share the exact file that you used? Geofabrik updates their data regularly, so if it's a glitch in the actual .pbf file as @daniel-j-h suggests, I might not be using the right data to trigger the problem.

glacierians commented 5 years ago

Thanks for your responses. Since our initial workaround worked, we didn't dig much further into the issue with the .pbf files. However, we recently discovered that if we use .osm.bz2 files, we can skip the conversion from .pbf to .osm through osmconvert. When we ran osrm-extract directly on the .pbf file, we observed abnormally low CPU (< 100%) and memory (< 1%) usage, which suggests it was not taking advantage of the multiple cores/threads on the instance. We also looked into the osrm-extract -t [ --threads ] flag.
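Rather than waiting overnight, a stall like this can be detected automatically by watching for silence on the process output. The following is a minimal sketch, not something we ran as part of this report: `run_with_stall_timeout` is a hypothetical helper, and it assumes the tool writes progress lines promptly (a long but quiet step, or buffered output, would be misreported as a stall).

```python
import queue
import subprocess
import threading
from typing import List, Optional, Tuple


def run_with_stall_timeout(cmd: List[str],
                           stall_seconds: float) -> Tuple[str, Optional[str]]:
    """Run cmd; kill it if it prints nothing for stall_seconds.

    Returns ("finished" | "stalled", last line seen or None).
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    lines: "queue.Queue[Optional[str]]" = queue.Queue()

    def pump() -> None:
        # Forward each output line; None marks end of output.
        for line in proc.stdout:
            lines.put(line.rstrip("\n"))
        lines.put(None)

    threading.Thread(target=pump, daemon=True).start()

    last_line: Optional[str] = None
    while True:
        try:
            item = lines.get(timeout=stall_seconds)
        except queue.Empty:
            # No output for stall_seconds: treat as a stall.
            proc.kill()
            proc.wait()
            return ("stalled", last_line)
        if item is None:
            proc.wait()
            return ("finished", last_line)
        last_line = item
```

For example, `run_with_stall_timeout(["osrm-extract", "-p", "car.lua", "hawaii-latest.osm.pbf"], 600)` would give up after ten minutes of silence and return the last log line it saw, which in our case would be the "Parse relations ..." line.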

We just started working through some initial steps to process the planet file using the .osm.bz2 file. We are not exactly sure what is causing our issues with the .pbf files, but if the .osm.bz2 planet file works, then we have a good alternative approach.

I tried this Hawaii .pbf from Geofabrik earlier today with our setup: https://download.geofabrik.de/north-america/us/hawaii-181216.osm.pbf. It still produced the same "stalled" results.

glacierians commented 5 years ago

We also tried .pbf files of different areas within the US and even North America. Those .pbf files did not work until we converted them to .osm.

jasonwiener commented 4 years ago

@glacierians this helped us a ton, thanks so much for posting!

github-actions[bot] commented 1 week ago

This issue seems to be stale. It will be closed in 30 days if no further activity occurs.