Project-OSRM / osrm-backend

Open Source Routing Machine - C++ backend
http://map.project-osrm.org
BSD 2-Clause "Simplified" License

Out of Memory : osrm-contract : OSRM v5.16 #5305

Closed: neilRGS closed this issue 3 months ago

neilRGS commented 5 years ago

Hi. I am attempting to contract planet-latest from an extraction with the foot profile. OSRM v5.18 kept giving segmentation faults, so I have switched to OSRM v5.16.

This time, the following errors are showing up in dmesg:

Attempt 1 (with swap of 750GB):

[ 9832.162117] Out of memory: Kill process 11095 (osrm-contract) score 992 or sacrifice child
[ 9832.162435] Killed process 11095 (osrm-contract) total-vm:525726512kB, anon-rss:523435252kB, file-rss:1768kB

Note in the attached image (With-swap-post-fail.png) that the swap file was of adequate size and did not fill up. The machine had approx. 500GB of RAM available as well. There should be no reason at all for an OOM message, yet there was.

Attempt 2 (removing swap altogether, as it really shouldn't be needed):

[ 9832.163417] Out of memory: Kill process 11855 (osrm-contract) score 992 or sacrifice child

The attached htop screenshot (OSRM-contract-no-swap.png) shows no swap and that RAM is not being filled. However, at some point in the process something ran away with itself, the machine became unresponsive, and the OS decided to sacrifice the osrm-contract process, no doubt to prevent a full lockout.
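
For reference, entries like those above can be pulled out of the kernel log with something along these lines (the exact grep is just illustrative):

# list recent OOM-killer activity with human-readable timestamps
dmesg -T | grep -iE 'out of memory|oom-killer|killed process'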

For additional info, this is the filesystem layout (df -h):

udev                          252G     0  252G   0% /dev
tmpfs                          51G  5.9M   51G   1% /run
/dev/mapper/vglocal00-root00  268G  222G   46G  84% /
tmpfs                         252G   12K  252G   1% /dev/shm
tmpfs                         5.0M     0  5.0M   0% /run/lock
tmpfs                         252G     0  252G   0% /sys/fs/cgroup
/dev/sda1                     976M  110M  851M  12% /boot
/dev/mapper/vglocal00-tmp00   2.0G  6.1M  1.9G   1% /tmp
/dev/mapper/vglocal01-data00  3.6T  1.6T  2.0T  45% /data
tmpfs                          51G     0   51G   0% /run/user/1002
tmpfs                          51G     0   51G   0% /run/user/1000

I am also sending a copy of this ticket to the hosting company (Rackspace) to get their take on it and to see whether the hardware configuration might be causing issues, as I find it hard to believe that there is a software fault across two separate versions of OSRM.

If anyone has any ideas as to what might be wrong, please do throw them at me.

Best regards,

Neil.

danpat commented 5 years ago

@neilRGS You always find the tricky ones, don't you...

Many of the changes we've made recently don't affect osrm-contract, so it's quite possible you're hitting the same bug in both versions.

One suggestion - try running osrm-contract --threads 4. By default, osrm-contract launches a worker thread for each CPU core it detects. That makes processing faster, but each worker also needs a sizeable chunk of RAM to work in. Reducing the number of worker threads will slow down processing, but it also reduces the total memory footprint.
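
For example, with the planet file from this thread (a sketch - adjust the filename to match your extract):

osrm-contract --threads 4 planet-latest.osrm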

If you still get a crash with 500GB of RAM and only 4 workers, then we've likely got a bug somewhere. If it works, then it's possible that the foot.lua profile simply needs a crapload of RAM when run with 16 CPU cores.

Another option might be to re-build OSRM and re-enable STXXL - do cmake -DENABLE_STXXL=ON when you build. STXXL uses an efficient disk-paging approach and requires less RAM overall. It won't be as fast as RAM-only, but it might get you over the hurdle.
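
A minimal sketch of that rebuild, assuming a standard out-of-tree CMake build directory and the usual install location (paths and job count are illustrative):

# rebuild OSRM with STXXL support enabled
cd osrm-backend
mkdir -p build && cd build
cmake .. -DENABLE_STXXL=ON -DCMAKE_BUILD_TYPE=Release
make -j4
sudo make install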

neilRGS commented 5 years ago

@danpat I like to keep people busy!! ;)

Thanks for the info and your time. I will try your suggestions and will post back the results.

All the best,

Neil.

neilRGS commented 5 years ago

@danpat : Happy New Year!

Unfortunately, the run with only 4 threads crashed out as well. I was using OSRM 5.16.

[info] Input file: planet-latest.osrm
[info] Threads: 4
[info] Reading node weights.
[info] Done reading node weights.
[info] Loading edge-expanded graph representation
[info] merged 2590844 edges out of 2084538704
[info] initializing node priorities... ok.
[info] preprocessing 508285483 (100%) nodes...
[info] . 10% . 20% . 30% . 40% . 50% . 60% .

Tomorrow I will rebuild OSRM with STXXL and have another go.

neilRGS commented 5 years ago

Good morning @danpat .

osrm-extract for the planet (using STXXL) succeeded.

Some interesting info regarding RAM and STXXL use:

RAM: peak bytes used: 312160919552
STXXL: total disk allocated: 750000000000
STXXL: peak bytes used: 7295991808
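
For anyone reproducing this: STXXL takes its scratch-disk allocation from a .stxxl config file in the working directory. A minimal sketch, with an illustrative path and the 750GB size logged above, assuming STXXL's disk=path,capacity,access syntax:

# contents of .stxxl in the directory osrm-extract / osrm-contract runs from
disk=/data/stxxl,750GiB,syscall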

I am about to run contract now, so we'll see how that goes.

All the best,

Neil.

neilRGS commented 5 years ago

Hi @danpat

Unfortunately, that failed again with the OOM rearing its head, as before. Grepping dmesg for osrm shows this:

[23121.366131] osrm-contract invoked oom-killer: gfp_mask=0x24280ca, order=0, oom_score_adj=0
[23121.366135] osrm-contract cpuset=/ mems_allowed=0
[23121.366140] CPU: 7 PID: 32074 Comm: osrm-contract Not tainted 4.4.0-141-generic #167-Ubuntu
[23121.366428] [32074]  1002 32074 131023681 130854546  255611     505        0             0 osrm-contract
[23121.366431] Out of memory: Kill process 32074 (osrm-contract) score 992 or sacrifice child
[23121.366704] Killed process 32074 (osrm-contract) total-vm:524094724kB, anon-rss:523416164kB, file-rss:2020kB
[208811.641253] osrm-contract[8906]: segfault at 7f05ddeef008 ip 00000000004aa86d sp 00007ffda4797860 error 4 in osrm-contract[400000+129000]

Rackspace came back to me regarding the OOM issue on the previous occasion, when I asked them to check, and the engineer there had this to say (it was actually good of them to take so much of an interest):

He checked the syslog as root and found this (the very last line is interesting):

---- extract ----
root@{anonymised}:/usr/local/bin# grep -E -C4 'oom[_-]*kill|Killed|hung|panic|readonly|orphan|Trace|memory' /var/log/* | grep osrm

/var/log/kern.log:Dec 10 12:47:37 970891-OSRM kernel: [1027006.807047] Out of memory: Kill process 40348 (osrm-contract) score 819 or sacrifice child
/var/log/kern.log:Dec 10 12:47:37 970891-OSRM kernel: [1027006.807391] Killed process 40348 (osrm-contract) total-vm:1077360364kB, anon-rss:522092668kB, file-rss:1900kB
/var/log/kern.log:Dec 10 16:45:53 970891-OSRM kernel: [ 9832.161569] osrm-contract invoked oom-killer: gfp_mask=0x24280ca, order=0, oom_score_adj=0
/var/log/kern.log-Dec 10 16:45:53 970891-OSRM kernel: [ 9832.161581] osrm-contract cpuset=/ mems_allowed=0
/var/log/kern.log-Dec 10 16:45:53 970891-OSRM kernel: [ 9832.161587] CPU: 1 PID: 11095 Comm: osrm-contract Not tainted 4.4.0-140-generic #166-Ubuntu
/var/log/kern.log-Dec 10 16:45:53 970891-OSRM kernel: [ 9832.162104] [11095]  1002 11095 131431628 130859255  255652     506        0             0 osrm-contract
/var/log/kern.log:Dec 10 16:45:53 970891-OSRM kernel: [ 9832.162117] Out of memory: Kill process 11095 (osrm-contract) score 992 or sacrifice child
/var/log/kern.log:Dec 10 16:45:53 970891-OSRM kernel: [ 9832.162435] Killed process 11095 (osrm-contract) total-vm:525726512kB, anon-rss:523435252kB, file-rss:1768kB
grep: /var/log/lxd: Is a directory
/var/log/kern.log:Dec 10 16:45:53 970891-OSRM kernel: [ 9832.163417] Out of memory: Kill process 11855 (osrm-contract) score 992 or sacrifice child
grep: /var/log/mysql: Is a directory
grep: /var/log/nginx: Is a directory
grep: /var/log/recap: Is a directory
grep: /var/log/rs-mbu: Is a directory
/var/log/syslog.1:Dec 10 12:47:37 970891-OSRM kernel: [1027006.807047] Out of memory: Kill process 40348 (osrm-contract) score 819 or sacrifice child
/var/log/syslog.1:Dec 10 12:47:37 970891-OSRM kernel: [1027006.807391] Killed process 40348 (osrm-contract) total-vm:1077360364kB, anon-rss:522092668kB, file-rss:1900kB
/var/log/syslog.1:Dec 10 16:45:53 970891-OSRM kernel: [ 9832.161569] osrm-contract invoked oom-killer: gfp_mask=0x24280ca, order=0, oom_score_adj=0
/var/log/syslog.1-Dec 10 16:45:53 970891-OSRM kernel: [ 9832.161581] osrm-contract cpuset=/ mems_allowed=0
/var/log/syslog.1-Dec 10 16:45:53 970891-OSRM kernel: [ 9832.161587] CPU: 1 PID: 11095 Comm: osrm-contract Not tainted 4.4.0-140-generic #166-Ubuntu
/var/log/syslog.1-Dec 10 16:45:53 970891-OSRM kernel: [ 9832.162104] [11095]  1002 11095 131431628 130859255  255652     506        0             0 osrm-contract
grep: /var/log/sysstat: Is a directory
grep: /var/log/unattended-upgrades: Is a directory
/var/log/syslog.1:Dec 10 16:45:53 970891-OSRM kernel: [ 9832.162117] Out of memory: Kill process 11095 (osrm-contract) score 992 or sacrifice child
/var/log/syslog.1:Dec 10 16:45:53 970891-OSRM kernel: [ 9832.162435] Killed process 11095 (osrm-contract) total-vm:525726512kB, anon-rss:523435252kB, file-rss:1768kB
/var/log/syslog.1:Dec 10 16:45:53 970891-OSRM kernel: [ 9832.163417] Out of memory: Kill process 11855 (osrm-contract) score 992 or sacrifice child
root@{anonymised}:/usr/local/bin# 

# which osrm-contract
/usr/local/bin/osrm-contract

root@{anonymised}:/usr/local/bin# file osrm-contract
osrm-contract: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=73a9626d05cf3225f5f3f655b711c0740ea4d759, not stripped

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

I found similar issues in the following forums:

https://github.com/Project-OSRM/osrm-backend/issues/4580
https://github.com/Project-OSRM/osrm-backend/issues/4584

This one looks especially promising:
https://github.com/Project-OSRM/osrm-backend/issues/3550

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

It is obvious that, in the physical world, you are not running out of RAM.  Logs show the device using a max of 18%.

---- end extract ----

So, with only 18% of RAM reportedly in use when the OOM was triggered, perhaps there is a bug somewhere.

I'm not sure what else I can try.

Many thanks,

Neil.

danpat commented 5 years ago

@neilRGS Hmm, I'm not sure I agree with the Rackspace engineer's assessment here:

you are not running out of RAM. Logs show the device using a max of 18%.

However, the Linux kernel is convinced you're out of RAM - that's why the OOM killer is being invoked - there's no RAM left, so it kills non-critical processes to free up space.

That said, it could be that their logging is not sampling often enough and is missing the large, short-lived memory spike that's triggering the OOM killer.
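
If you want to rule that out, one option is to capture a finer-grained memory timeline alongside the run; a sketch (the 5-second interval and filenames are arbitrary):

# sample system-wide memory use every 5 seconds, with timestamps, while contracting
vmstat -t 5 > contract-memory.log &
VMSTAT_PID=$!
osrm-contract --threads 4 planet-latest.osrm
kill $VMSTAT_PID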

It's going to take me a while to get something set up to try to reproduce and debug this. A couple of things I can think of to try in the meantime:

  1. See if it's being caused by some specific part of the map - slice the planet into smaller parts and try contracting them - maybe the same problem will occur in a specific geography.
  2. Use the osrm-extract/osrm-partition/osrm-customize pipeline to generate MLD routing data instead of CH (osrm-contract); a sketch of the commands is below. Route queries will be about 10x slower, but still pretty fast.
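
A sketch of that MLD pipeline, using the filenames from this thread (the extract step can be skipped since planet-latest.osrm already exists; the .osm.pbf name is assumed):

osrm-extract -p profiles/foot.lua planet-latest.osm.pbf
osrm-partition planet-latest.osrm
osrm-customize planet-latest.osrm
osrm-routed --algorithm mld planet-latest.osrm
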
neilRGS commented 5 years ago

Hi @danpat. Here is the result of running the contract with 8 threads as opposed to 16 - I thought I'd give that a go to see what came out of it.

Rather than an OOM, this time there was a segfault.

Here's Rackspace's message to me; if there are any clues in there for you, then all the better:

It appears the application failed this time due to a segfault, not an OOM issue as before.
While memory and CPU usage were high, I don't see anything that would be maxing out the server.
I have posted the error message and the resource logs from before and after the application crash below; please let us know if you have any other questions.

[Tue Jan  8 14:18:11 2019] osrm-contract[6249]: segfault at 7fe1b9729008 ip 00000000004aa86d sp 00007fff6df06d40 error 4 in osrm-contract[400000+129000]

Before application crash:
Top 10 cpu using processes
02:10:07 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
Average:     1002      6249  783.79    1.50    0.00  785.29     -  osrm-contract --threads 8 planet-latest.osrm 
Average:        0     21594    0.00    0.25    0.00    0.25     -  pidstat -l 2 2 

Top 10 memory using processes
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
myuser  6249  776 23.7 127162364 125438920 ? R<l  Jan07 13032:35 osrm-contract --threads 8 planet-latest.osrm
root     21137  1.0  0.0 1781492 285148 ?      Sl   14:00   0:06 savscand --incident=unix://tmp/incident --namedscan=unix://root@tmp/namedscansprocessor.18 --ondemandcontrol=socketpair://58/59 socketpair://56/57 --threads=5
root      4803  0.1  0.0 4842404 194200 ?      Sl   Jan02  15:02 /opt/dell/srvadmin/sbin/dsm_om_connsvcd -run
mysql     3340  0.0  0.0 1173208 153248 ?      Ssl  Jan02   5:13 /usr/sbin/mysqld
root      3885  0.0  0.0 3245336 118244 ?      Ssl  Jan02   5:23 /opt/commvault/Base/cvd
root      4177  0.0  0.0 107144 27732 ?        Ss   Jan02   0:16 /usr/bin/perl /usr/share/webmin/miniserv.pl /etc/webmin/miniserv.conf
root      3974  0.0  0.0 1085900 21160 ?       Ssl  Jan02   4:40 /opt/dell/srvadmin/sbin/dsm_sa_datamgrd
root      2793  0.0  0.0 805996 16372 ?        Ssl  Jan02   0:35 /usr/lib/snapd/snapd
root      4713  0.0  0.0 684092 14432 ?        Ss   Jan02   0:00 /opt/dell/srvadmin/sbin/dsm_sa_datamgrd
root      3660  0.0  0.0 512400 14176 ?        Sl   Jan02   4:49 /usr/bin/python3 /usr/bin/fail2ban-server -s /var/run/fail2ban/fail2ban.sock -p /var/run/fail2ban/fail2ban.pid -x -b

After application crash:
Top 10 cpu using processes
02:20:08 PM   UID       PID    %usr %system  %guest    %CPU   CPU  Command
Average:        0     22128    0.25    0.50    0.00    0.75     -  pidstat -l 2 2 
Average:      116      3632    0.00    0.25    0.00    0.25     -  /usr/sbin/snmpd -LS0-3d -Lf /dev/null -u snmp -g snmp -I -mteTrigger mteTriggerConf -p /var/run/snmpd.pid -a 
Average:        0      3606    0.00    0.25    0.00    0.25     -  savd etc/savd.cfg 
Average:        0      3565    0.25    0.00    0.00    0.25     -  /usr/sbin/cpufreqd -f /etc/cpufreqd.conf 

Top 10 memory using processes
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     21137  0.5  0.0 1855224 285148 ?      Sl   14:00   0:06 savscand --incident=unix://tmp/incident --namedscan=unix://root@tmp/namedscansprocessor.18 --ondemandcontrol=socketpair://58/59 socketpair://56/57 --threads=5
root      4803  0.1  0.0 4842404 194200 ?      Sl   Jan02  15:03 /opt/dell/srvadmin/sbin/dsm_om_connsvcd -run
mysql     3340  0.0  0.0 1173208 153248 ?      Ssl  Jan02   5:14 /usr/sbin/mysqld
root      3885  0.0  0.0 3245336 118244 ?      Ssl  Jan02   5:24 /opt/commvault/Base/cvd
root      4177  0.0  0.0 107144 27732 ?        Ss   Jan02   0:16 /usr/bin/perl /usr/share/webmin/miniserv.pl /etc/webmin/miniserv.conf
root      3974  0.0  0.0 1085900 21160 ?       Ssl  Jan02   4:40 /opt/dell/srvadmin/sbin/dsm_sa_datamgrd
root      2793  0.0  0.0 805996 19524 ?        Ssl  Jan02   0:35 /usr/lib/snapd/snapd
root      4713  0.0  0.0 684092 14432 ?        Ss   Jan02   0:00 /opt/dell/srvadmin/sbin/dsm_sa_datamgrd
root      3660  0.0  0.0 512400 14176 ?        Sl   Jan02   4:50 /usr/bin/python3 /usr/bin/fail2ban-server -s /var/run/fail2ban/fail2ban.sock -p /var/run/fail2ban/fail2ban.pid -x -b
root      3886  0.0  0.0 855228 12912 ?        Ssl  Jan02   1:12 /opt/commvault/Base/ClMgrS

All the best,

Neil.

neilRGS commented 5 years ago

Hi @danpat

The extract/partition/customize pipeline has worked.

I used the .osrm file I had already extracted and the partition and customize steps were completed in around 5 hours.

Routing using the foot profile from Cape Town to Alexandria (so all the way up Africa) rendered to screen in around 20 seconds. This is likely to be adequate for our client's users, as they mainly cycle or walk in chunks of 100-200km or so.
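
A request of roughly that shape against the default local HTTP API looks like this (coordinates are approximate and in longitude,latitude order; host and port assume an osrm-routed instance started with defaults):

curl 'http://localhost:5000/route/v1/foot/18.4241,-33.9249;29.9187,31.2001?overview=false'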

I noted on the Running OSRM wiki page ( https://github.com/Project-OSRM/osrm-backend/wiki/Running-OSRM ) that you are shifting towards MLD as opposed to CH anyway, so I guess I have moved in the right direction, albeit as a result of the CH contraction not working for me.

I'd like to thank you for your help and advice with this. It is a great relief to be able to let our client loose on it now.

If you're interested, PM me and I will give you some details so you can see what we have done.

All the best,

Neil.