Project-OSRM / osrm-backend

Open Source Routing Machine - C++ backend
http://map.project-osrm.org
BSD 2-Clause "Simplified" License

osrm-contract - planet - foot profile #5231

Closed: neilRGS closed this issue 4 months ago

neilRGS commented 6 years ago

Hi. I am getting a persistent failure in the contract process, with no errors being logged.

The last few lines of the output are:

[info] Loading edge-expanded graph representation
[info] merged 2531742 edges out of 2043085298
[info] initializing node priorities... ok.
[info] preprocessing 498332441 (100%) nodes...
[info] . 10% . 20% . 30% . 40% . 50% . 60% .[renumbered] 70% . 80% . 90% .

and that's where it ends.

This is the fourth time I have attempted this - doing osrm-extract, then osrm-contract each time. Each time has failed at what looks to be the same point.

The machine I am using (Ubuntu 16.04) has 512 GB of RAM and a 16-core Xeon processor, with 3 TB of disk space, so I am not short on resources for this.

I have completed this successfully for the bicycle profile and am about to kick off the car (driving) profile.

However, foot is the most important for our client.

So, as far as I can see, the .hsgr and .timestamp files are the ones not being created.

Does anyone have any ideas why this might be happening?

For a little bit of extra info: The extract takes less than four hours and the contract fails after approximately 25 hours of running.

Many thanks,

Neil.

danpat commented 6 years ago

@neilRGS Does the process terminate there, or does it just appear to hang?

The last few % of osrm-contract always take the longest, and for foot, the last few nodes are the most difficult to calculate (I've been trying to think up a way to make the % linear with actual progress, but haven't come up with any decent ideas).

Other than that, can you check your system's /var/log/messages? If, for some reason, you're running out of memory, there might be something logged by the kernel describing why it killed the process.
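
If it helps, something like the following should surface an OOM kill (a sketch only; exact log locations vary by distro):

    # look for OOM-killer messages in the kernel ring buffer
    dmesg -T | grep -i -E "out of memory|killed process"
    # on Ubuntu the kernel log also lands in syslog
    grep -i oom /var/log/syslog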

neilRGS commented 6 years ago

Hi @danpat. /var/log/messages was inactive, so I have enabled it and will try another contract run.

I looked in syslog and have seen this line: osrm-contract[40416]: segfault at 7f75fe286008 ip 00000000004a948d sp 00007f7f8dc5bb00 error 4 in osrm-contract[400000+128000]

I have noticed that the timestamp of the fault is right at the start of when I set the process running. I don't know if that gives any clues at all?

(Re percentage progress, I wonder: if the number of items to go is known, perhaps a countdown of items remaining might be useful. The user would still not know the exact time to completion, but would have a rough idea, as they could see roughly how long each thousand items was taking. Just a thought, anyway :) )

danpat commented 6 years ago

@neilRGS hmm, a segfault is a bummer - unfortunately, without a core dump, I can't do much with the info in that log line 😞

What version of OSRM are you using?

neilRGS commented 6 years ago

@danpat Hi. I am using 5.18.0

I have made sure only necessary processes are running on the server, and am currently running another attempt.

I have checked that the process can complete by running the foot profile against the latest Ireland and Northern Ireland extracts; those complete OK, so I can rule out the lua file as the issue.

It is quite a head-scratcher.

Many thanks,

Neil.

neilRGS commented 6 years ago

@danpat

Would there be any mileage in compiling a debug version of the backend?

Would running ulimit -c unlimited prior to osrm-contract give the core dump without having to recompile the backend for debug? I'm assuming the core dump would be rather large due to the amount of RAM in use, but I do have plenty of disk space.

N.

neilRGS commented 5 years ago

Ok. ulimit didn't seem to make any difference.

I'm noting that the CPU usage seems inordinately high - some might say impossibly so: 1598%

Might that be a clue as to a problem?

Output from top:

top - 16:09:07 up  1:20,  1 user,  load average: 15.81, 15.31, 13.86
Tasks: 249 total,   2 running, 247 sleeping,   0 stopped,   0 zombie
%Cpu(s):100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 52834684+total, 38507052+free, 98598456 used, 44677840 buff/cache
KiB Swap:  2097148 total,  2097148 free,        0 used. 42728832+avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 7022 neilmcl+  20   0 94.379g 0.090t   5660 R  1598 18.3 559:57.59 osrm-contract

Best regards,

Neil.

danpat commented 5 years ago

@neilRGS >100% isn't unusual: 100% represents one fully used CPU core, and a multi-threaded process can use multiple cores - osrm-contract is one such process, so 1598% is consistent with all 16 of your cores being busy.

The work it does is very CPU intensive, so it tries to use all the CPUs that it can to speed up the task.

I'm not in a position at the moment to run a full planet process myself, unfortunately. Ideally, you would:

  1. Use a debug build of OSRM to run osrm-contract - this will make it a lot slower (10x or more)
  2. Enable core dumps with:
    • ulimit -c unlimited
    • sudo sysctl -w kernel.core_pattern=/tmp/core-%e.%p.%h.%t
  3. Run osrm-contract
  4. When it crashes, save the /tmp/core-XXX, the original PBF file you used and the binaries you used.

With these things, it's possible to poke around inside the state of osrm-contract when it crashed using GDB or LLDB, and maybe that'll give us hints about the segfault.
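
As a rough end-to-end sketch of those steps (the build directory and file paths are placeholders):

    # 1. debug build - much slower than a Release build
    cd osrm-backend && mkdir -p build && cd build
    cmake .. -DCMAKE_BUILD_TYPE=Debug
    cmake --build .

    # 2. enable core dumps for this shell and set a dump location
    ulimit -c unlimited
    sudo sysctl -w kernel.core_pattern=/tmp/core-%e.%p.%h.%t

    # 3. run the contraction step
    ./osrm-contract /data/planet-latest.osrm

    # 4. after a crash, keep /tmp/core-*, the input PBF and the binaries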

neilRGS commented 5 years ago

Hi @danpat. Typically, I am not yet able to do a full debug run, as the machine needs to be used for serving the bicycle and car profiles, so instead I have successfully compiled europe-latest for the foot profile. That will keep our customer happy for now, especially as he can now 'bike' the planet.

For planet compilation (foot), would I be right in assuming that I can no longer use a large swap plus a .stxxl file? I was wondering whether offloading the memory requirements to the SSD would help in any way (a reminder that the OSRM version is 5.18.0).

Thanks Neil.

neilRGS commented 5 years ago

I have just had a thought. During a previous version compilation, earlier this year, I noticed that bridleways were not included in the foot profile lua file. This was brought to my attention when attempting to route a path through Snowdon, over a known bridleway, also known to be suitable for foot traffic. I altered the lua file to allow bridleways and exclude motorways, and also changed the weight_name.

Here are the lines I changed:

Line 14: weight_name = 'distance' (changed from 'duration')
Line 70: inserted exclusion:

--Exclude motorways
    excludable = Sequence{
        Set{"motorway"}
    },

Line 90: Inserted speeds sequence: bridleway = walking_speed,

I have noticed a couple of additional differences between the older lua file and the current one (not added by me):

Line 36: in barrier_whitelist, 'liftgate' has been added

Line 71: this has been added:

    -- tags disallow access to in combination with highway=service
    service_access_tag_blacklist = Set { },

Line 252: This has been added:


    -- set weight properties of the way
    WayHandlers.weights
  }

  WayHandlers.run(profile, way, result, data, handlers)

I have attached the modified, new version foot.lua, together with the older version from June 2018

foot-June-2018.lua.txt foot-modified-v5.18.0.lua.txt

Might the modifications be causing so much extra processing that they could be the reason the compilation fails? As mentioned previously, the machine has 512 GB of RAM and several TB of SSD storage. The OS is Ubuntu 16.04.5 LTS.

Thanks Neil.

neilRGS commented 5 years ago

@danpat. Good morning. I have compiled osrm in debug mode and have run the extractor. I ran it with nohup as I was starting the process from an SSH session and didn't want it to terminate when the session closed.

In the output block below, it looks like it has failed. What are your thoughts on this?

{{1268 lines precede this block}}
...
[info] Sorting and writing 5 maneuver overrides...
[info] done.
[info] Renumbering turns
[info] Writing 0 conditional turn penalties...
[info] Generated 1212904812 edge based node segments
[info] Node-based graph contains 495035055 edges
[info] Edge-expanded graph ...
[info]   contains 1021542335 edges
[info] Timing statistics for edge-expanded graph:
[info] Renumbering edges: 83.427s
[info] Generating nodes: 977.719s
[info] Generating edges: 2169.69s
[info] Generating guidance turns 
[info] . 10% . 20% . 30% . 40% . 50% . 60% .
[assert][140444906276608] /home/neilmcleish/install/osrm/osrm-backend-5.18.0/src/guidance/turn_handler.cpp:243
in: osrm::guidance::Intersection osrm::guidance::TurnHandler::handleThreeWayTurn(EdgeID, osrm::guidance::Intersection) const: intersection[0].angle < 0.001
terminate called without an active exception

I'll look forward to hearing back from you. Many thanks,

Neil.

neilRGS commented 5 years ago

Hi. After a second attempt to extract, this time setting ulimit and a core dump file location, as below, I am still getting that error, but unfortunately, no dump file is being created.

Command: sudo sysctl -w kernel.core_pattern=/data/core-%e.%p.%h.%t && ulimit -c unlimited && nohup /usr/local/bin/osrm-extract planet-latest.osm.pbf -p ./profiles/foot.lua &
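
For what it's worth, one way to confirm the settings actually reached the running process (the pgrep pattern is just a guess at how the process shows up) is:

    # core-size limit as seen by the running extractor
    cat /proc/$(pgrep -f osrm-extract)/limits | grep -i core
    # confirm the core pattern took effect
    sysctl kernel.core_pattern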

Last lines of output:

 60% .
[assert][139748853126912] /home/{{USER}}/install/osrm/osrm-backend-5.18.0/src/guidance/turn_handler.cpp:243
in: osrm::guidance::Intersection osrm::guidance::TurnHandler::handleThreeWayTurn(EdgeID, osrm::guidance::Intersection) const: intersection[0].angle < 0.001
terminate called without an active exception

I can post the entire log, if it might be useful?

Any ideas will be appreciated, as this is confusing me no end. We have a machine with 512GB RAM and 3TB storage.

Extraction and contraction worked for both bicycle and car, so I just cannot see why it refuses to work for foot.

Would there be any mileage in my compiling an older version of OSRM for this one profile, in case there is some form of storage type incompatibility?

Many thanks,

Neil.

ethanpooley commented 5 years ago

I'm getting the same initial symptom reported by neilRGS:

[info] Input file: ./planet-latest.osrm
[info] Threads: 64
[info] Reading node weights.
[info] Done reading node weights.
[info] Loading edge-expanded graph representation
[info] merged 2555192 edges out of 2120529422
[info] initializing node priorities... ok.
[info] preprocessing 517343754 (100%) nodes...
[info] . 10% . 20% . 30% . 40% . 50% . 60% .[renumbered] 70% . 80% . 90% .

At the time of failure (not process start, as neilRGS reported) I get these messages from the system log:

[18757.797259] show_signal_msg: 3 callbacks suppressed
[18757.797263] osrm-contract[3814]: segfault at 7f7b014fa008 ip 00000000004acdbd sp 00007f88d0f0eb00 error 4 in osrm-contract[400000+12e000]

This is on an E64s_v3 at Azure (432GB RAM, 64 cores). The osrm-contract version is 5.22.0. Max memory usage is only about 1/3 of available, climbing steadily but slowly at the time of failure.

I'm going to try it with -t to leave a few threads free for system processes, plus verbosity level DEBUG.

ethanpooley commented 5 years ago

Same machine, osrm-backend v5.18.0 this time, with no backgrounding:

$ osrm-contract --threads=62 --verbosity=DEBUG ./planet-latest.osrm
[warn] The recommended number of threads is 64! This setting may have performance side-effects.
[info] Input file: ./planet-latest.osrm
[info] Threads: 62
[info] Reading node weights.
[info] Done reading node weights.
[info] Loading edge-expanded graph representation
[info] merged 2625316 edges out of 2120830740
[info] initializing node priorities... ok.
[info] preprocessing 517509436 (100%) nodes...
[info] . 10% . 20% . 30% . 40% . 50% . 60% .[renumbered] 70% . 80% . 90% .Segmentation fault (core dumped)

And from dmesg:

osrm-contract[7964]: segfault at 7f56d13b7008 ip 00000000004acbcd sp 00007f61ab4ecb00 error 4 in osrm-contract[400000+12d000]

By the calendar, my last successful build must have happened on v5.18.0 or later, so it feels like it's a change to the planet file, rather than the application, that is causing this.

ethanpooley commented 5 years ago

@danpat If you have the time and inclination this week or next to prosecute this issue via a core dump, I will try to produce that via your instructions to neilRGS above. And once I've done that I can just give you access to the whole server at Azure, rather than trying to copy the (likely massive) results around. It's a temporary box that exists for no other reason than to build this file for us. Just say the word.

Alternatively... if you have any stabs in the dark you want me to try, I'm game to make a couple more attempts. Changes to the foot profile? Another version of osrm-backend? Another verbosity setting (mine seems to have accomplished nothing)?

danpat commented 5 years ago

@ethanpooley Neil found one distinct problem:

[assert][139748853126912] /home/{{USER}}/install/osrm/osrm-backend-5.18.0/src/guidance/turn_handler.cpp:243
in: osrm::guidance::Intersection osrm::guidance::TurnHandler::handleThreeWayTurn(EdgeID, osrm::guidance::Intersection) const: intersection[0].angle < 0.001
terminate called without an active exception

which is likely a precursor to the segfault.

What would be useful is to combine this with better information on where the segfault occurs.

To get that, build OSRM without assertions enabled, but with debug info:

cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo

then run osrm-contract via gdb or lldb, and capture the location of the segfault:

gdb --args ./osrm-contract yourfile.osrm

then when gdb catches the segfault, do bt full and put the output in the ticket here.

You should be able to re-use the datafiles you already created with osrm-extract; there's no need to re-run that step.
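
Roughly, the whole sequence would look something like this (the build directory and the .osrm path are placeholders):

    # optimised build with debug symbols; assertions are off in this build type
    cmake .. -DCMAKE_BUILD_TYPE=RelWithDebInfo
    cmake --build .

    # run under gdb and grab a full backtrace when it faults
    gdb --args ./osrm-contract /data/planet-latest.osrm
    (gdb) run
    ...
    (gdb) bt full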

This will help understand if the assertion that @neilRGS hit is related to the segfault that happens later. Hopefully that is so and there aren't two problems :-)

ethanpooley commented 5 years ago

@danpat Here you go: planet-build-debug.txt

Since it mentions [assert] I'm not sure if I succeeded at building without assertions enabled. Would that have been an argument that you didn't explicitly list? My work is 100% interpreted languages, so my ignorance of compiling is near-total.

I can do it again with a different build if necessary. I also still have the debug session up, in case there's anything else you want me to ask of gdb.

ethanpooley commented 5 years ago

@danpat Can you let me know if that debugging output will be adequate, or if I'll need to recompile with an additional flag to get assertions turned off properly? This process just takes so much machine time, I'd like to get it started if that's necessary.

ethanpooley commented 5 years ago

@danpat Sorry to keep bumping this if you just don't have the time to respond, but I'm soon going to be pulled off this approach to updating our walk-times router and onto figuring out what to do given that we can't. So if there is anything more I can do on this, please let me know in the next 3-4 days if possible.

danpat commented 5 years ago

@ethanpooley I'm not going to have a tonne of time to look at this (we're not processing the whole planet this way so this hasn't hit us), but here's what I think's going on.

Digging through the debug log you posted, my hunch is that the foot profile with the whole planet is creating a graph with more edges than OSRM has space for - I think some element (either edge or node count) is exceeding 2^31 and overflowing a counter/storage field somewhere.

In the logs I can see:

[info] merged 2625316 edges out of 2120830740

where 2^31 - 2120830740 = 26,652,908. That's within about 1.2% of 2^31, which is close enough to make my spidey sense tingle.

There are a few spots in the code where we either steal 1 bit from an unsigned 32 bit integer for a boolean flag (reducing the number space from 2^32 down to 2^31), or use signed int types to keep track of things, which are +/- 2^31 on most platforms (and in C++, what happens if you overflow signed integers is undefined).

So while I have no conclusive evidence that this is the problem, it's my primary hunch based on the error, what you're doing (processing the whole planet with foot), and when it happens in the processing pipeline (fairly late into the CH graph construction, after many edges have been created). The foot/bike profiles create many more edges/nodes than the car profile, and I haven't created a whole-planet dataset like you're doing with those profiles for quite some time.

Tracking down where we're overflowing will be a bit tedious, and the fix will likely cause a sizable bump in memory consumption during preprocessing and possibly routing (would probably need to go from 4-byte numbers to 8-byte numbers in a few places).

If you need a path forward that doesn't involve tedious C++ integer overflow debugging, my suggestion would be to split the planet into pieces and serve them separately from different osrm-routed instances. This will involve putting a proxy of some kind in front of your API that knows how to redirect queries to the correct dataset based on the coordinates in the URL.

ethanpooley commented 5 years ago

Thank you @danpat, that motivates us to move forward with a multiple-region build without worrying that we're missing some simple fix.

bjtaylor1 commented 4 years ago

hi @ethanpooley , did you manage to get such a multi-region build implemented?

How do you call it? I'm curious as to whether you'd decide which region to call based on coordinates being within a polygon, or simply try them all? If the former, how do you generate the polygon that bounds the region?

ethanpooley commented 4 years ago

@bjtaylor1 We did.

We currently split it into two regions, roughly the eastern and western hemispheres. Routing across the boundary lines is supported by including an extra degree of longitude in each region. To cut OSM in half (with respect to its data footprint) with one cut at +/-180 degrees longitude, the other cut should be made at +8 degrees longitude. This may change over time.

We create east.poly and west.poly files like this:

    east
    eastern_hemisphere_with_buffer
         7   90
         7  -90
       180  -90
       180   90
         7   90
    END
    antimeridian_buffer
      -180   90
      -180  -90
      -179  -90
      -179   90
      -180   90
    END
    END
    west
    western_hemisphere_with_buffer
      -180   90
      -180  -90
         9  -90
         9   90
      -180   90
    END
    antimeridian_buffer
       179   90
       179  -90
       180  -90
       180   90
       179   90
    END
    END

We cut the planet file with something like this:

    nohup osmconvert -v --out-pbf /somepath/planet-latest.osm.pbf -B=east.poly -o=/somepath/east.osm.pbf >> east.log 2>&1 &
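The west shard is cut the same way, and each shard then goes through the usual extract/contract pipeline. A rough sketch (the paths and the foot profile location are placeholders, mirroring the command above):

    nohup osmconvert -v --out-pbf /somepath/planet-latest.osm.pbf -B=west.poly -o=/somepath/west.osm.pbf >> west.log 2>&1 &
    # then, for each shard:
    osrm-extract /somepath/east.osm.pbf -p profiles/foot.lua
    osrm-contract /somepath/east.osrm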

As for routing, I chose to do it in the Apache proxy that I already had set up in front of the machines running osrm-backend. In the proxy's hosts file I make entries for the machines osrm-east and osrm-west. In the Apache site configs I do roughly this:

    # Choose which planet shard should receive this request.
    # Extract longitude from the URL
    SetEnvIf Request_URI "^.*/(-?[0-9]*)[.,][^/]*$" LON=$1
    # Greater than or equal to 8 degrees East longitude routes to osrm-east.
    SetEnvIfExpr "%{ENV:LON} -ge '8'" SHARD=east
    # Less than 8 degrees East longitude routes to osrm-west.
    SetEnvIfExpr "%{ENV:LON} -lt '8'" SHARD=west
    RewriteRule /(.*) http://osrm-%{ENV:SHARD}:5000/$1 [P]

ethanpooley commented 4 years ago

@bjtaylor1 Hope that helps. Oh, and if you have any tips on serving these hemispheres up via osrm-backend on less than about 200GB of RAM (and therefore hundreds of dollars per month, in the cloud), then please share!

danpat commented 4 years ago

Depending on your performance requirements, you could look at using the --mmap option to osrm-routed. Data will then be read from disk on-demand. If your request throughput is low, your routing requests are generally fairly local, and you have a reasonable amount of RAM to build up a helpful disk-read cache, you can probably save a bit of money.

If you need very fast routing and high throughput across the globe, then you're stuck I'm afraid.

Do make sure to pay attention to the actual memory footprint of osrm-routed after it boots - not all files generated during processing are used during routing, so it might be that 200GB is overkill (but again, I haven't generated planet-sized foot data in a while, so maybe it really is that big).
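
For example (a sketch only; the dataset path is a placeholder, and --mmap needs a recent enough OSRM release):

    # serve with on-demand reads from disk instead of loading everything up front
    osrm-routed --mmap /data/west.osrm &
    # once it has finished booting, check the actual resident memory footprint
    ps -o pid,rss,comm -C osrm-routed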

bjtaylor1 commented 4 years ago

Hi @ethanpooley, yes, that's very helpful, thanks. I didn't think of splitting it simply into hemispheres; I was thinking of going Europe, Asia, America, etc., but your way has the obvious advantage that the proxy can decide which region to route to.

Which web server is that proxy-splitting configuration for? The only one I currently know how to use is nginx - would it work there?

In my current implementation, which I've had running for years (4 UK profiles on one t2.micro, and 1 Europe profile on another t2.micro, using a modified version of the cycling profile), I probably have far less actual RAM than is recommended, but it works fine, albeit a bit slowly, just with a large swap file. And these instances are on EBS, not local SSDs.

What I (probably) don't need, and probably still won't need in the worldwide implementation I'm aiming to build, is massive throughput - I probably won't have more than 10 requests per second coming in; I probably won't ever even have 1 request per second. What savings could I make given that? For instance, could I get the same osrm-routed response time per unit of resource by splitting the planet into more than two regions?

In terms of the extract, I thought from this page: https://github.com/Project-OSRM/osrm-backend/wiki/Disk-and-Memory-Requirements that a server with that amount of memory would manage it, but it still seems to need a large swap file as well.

ethanpooley commented 4 years ago

@bjtaylor1

ethanpooley commented 2 years ago

Just an update: we now find that OSM splits more evenly at +11 longitude. So the above files now look like this:

    east
    eastern_hemisphere_with_buffer
        10   90
        10  -90
       180  -90
       180   90
        10   90
    END
    antimeridian_buffer
      -180   90
      -180  -90
      -179  -90
      -179   90
      -180   90
    END
    END
    west
    western_hemisphere_with_buffer
      -180   90
      -180  -90
        12  -90
        12   90
      -180   90
    END
    antimeridian_buffer
       179   90
       179  -90
       180  -90
       180   90
       179   90
    END
    END
    # Choose which planet shard should receive this request.
    # Extract longitude from the URL
    SetEnvIf Request_URI "^.*/(-?[0-9]*)[.,][^/]*$" LON=$1
    # Greater than or equal to 11 degrees East longitude routes to osrm-east.
    SetEnvIfExpr "%{ENV:LON} -ge '11'" SHARD=east
    # Less than 11 degrees East longitude routes to osrm-west.
    SetEnvIfExpr "%{ENV:LON} -lt '11'" SHARD=west
    # Use to view shard choice during testing.
    #RewriteRule .* %{ENV:SHARD} [R=301,L]

It's getting harder to fit the build process into an easy-to-get cloud VM. We'll probably move to a scripted build process and k8s hosting; by removing the manual build/hosting steps we can split it into smaller chunks without increasing the human interaction time required. I imagine we'll go to 4 shards just to be sure.

SiarheiFedartsou commented 4 months ago

Stale.