Project-OSRM / osrm-backend

Open Source Routing Machine - C++ backend
http://map.project-osrm.org
BSD 2-Clause "Simplified" License

Update traffic data without relaunching the web server "osrm-routed" #5703

Open abdelhakimbendjabeur opened 4 years ago

abdelhakimbendjabeur commented 4 years ago

Hello,

I am experimenting with OSRM locally using Docker, and I have been trying to customize the graph using custom segment speed files. I only see the effect of my speed updates after relaunching osrm-routed.
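For reference, the segment speed file is a plain CSV in the standard OSRM format, where each row is from_osm_node_id,to_osm_node_id,speed_in_kmh (the node IDs and speeds below are made-up examples):

257306366,257306367,35
257306367,257306366,35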

Here are the commands I used:

wget http://download.geofabrik.de/europe/france/ile-de-france-latest.osm.pbf

docker run -t -v "${PWD}:/data" osrm/osrm-backend osrm-extract -p /opt/car.lua /data/ile-de-france-latest.osm.pbf

docker run -t -v "${PWD}:/data" osrm/osrm-backend osrm-partition /data/ile-de-france-latest.osrm

docker run -t -v "${PWD}:/data" osrm/osrm-backend osrm-customize /data/ile-de-france-latest.osrm

docker run -d -t -i -p 1000:5000 -v "${PWD}:/data" osrm/osrm-backend osrm-routed --algorithm mld /data/ile-de-france-latest.osrm
>>> <container-id>

Now I have a working web server on localhost:1000.
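For example, I can request a route like this (the coordinates are just an arbitrary lon,lat pair in Paris):

curl 'http://localhost:1000/route/v1/driving/2.3488,48.8534;2.2945,48.8584'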

Next, I copy the speed file into the container <container-id> and run the following:

# My terminal
docker cp ~/Desktop/speed_file.csv <container-id>:/tmp

# Inside the container
root@79fc04d4079f:/opt# osrm-customize /data/ile-de-france-latest.osrm --segment-speed-file /tmp/speed_file.csv

Log output:

[info] Loaded /tmp/speed_file.csv with 16 values
[info] In total loaded 1 file(s) with a total of 16 unique values
[info] Used 3504488 speeds from LUA profile or input map
[info] Used 16 speeds from /tmp/speed_file.csv
[warn] Speed values were used to update 16 segments for 'routability' profile
[info] Updating segment data took 412.522ms.
[info] In total loaded 0 file(s) with a total of 0 unique values
[info] Done reading edges in 2638.76ms.
[info] Loaded edge based graph: 3342718 edges, 936893 nodes
[info] Loading partition data took 3.77922 seconds
[info] Cells customization took 6.5012 seconds
[info] Cells statistics per level
[info] Level 1 #cells 5050 #boundary nodes 93417, sources: avg. 11, destinations: avg. 16, entries: 1171113 (9368904 bytes)
[info] Level 2 #cells 346 #boundary nodes 14907, sources: avg. 27, destinations: avg. 36, entries: 417244 (3337952 bytes)
[info] Level 3 #cells 21 #boundary nodes 2264, sources: avg. 68, destinations: avg. 86, entries: 154492 (1235936 bytes)
[info] Level 4 #cells 1 #boundary nodes 0, sources: avg. 0, destinations: avg. 0, entries: 0 (0 bytes)
[info] Unreachable nodes statistics per level
[warn] Level 1 unreachable boundary nodes per cell: 0.00475248 sources, 0.00336634 destinations
[warn] Level 2 unreachable boundary nodes per cell: 0.017341 sources, 0.00578035 destinations
[warn] Level 3 unreachable boundary nodes per cell: 0.0952381 sources, 0 destinations
[info] Unreachable nodes statistics per level
[warn] Level 1 unreachable boundary nodes per cell: 0.0255446 sources, 0.020396 destinations
[warn] Level 2 unreachable boundary nodes per cell: 0.147399 sources, 0.118497 destinations
[warn] Level 3 unreachable boundary nodes per cell: 0.571429 sources, 0.47619 destinations
[info] Unreachable nodes statistics per level
[warn] Level 1 unreachable boundary nodes per cell: 0.145347 sources, 0.0972277 destinations
[warn] Level 2 unreachable boundary nodes per cell: 1.10983 sources, 0.713873 destinations
[warn] Level 3 unreachable boundary nodes per cell: 5.28571 sources, 3.14286 destinations
[info] Unreachable nodes statistics per level
[warn] Level 1 unreachable boundary nodes per cell: 0.00475248 sources, 0.00336634 destinations
[warn] Level 2 unreachable boundary nodes per cell: 0.017341 sources, 0.00578035 destinations
[warn] Level 3 unreachable boundary nodes per cell: 0.0952381 sources, 0 destinations
[info] MLD customization writing took 0.97205 seconds
[info] Graph writing took 0.920148 seconds
[info] RAM: peak bytes used: 283684864

Now when I make requests to the /route/v1/driving/ endpoint, the speed file does not seem to be taken into account. But when I kill the container and recreate it, the new speeds are taken into account and I get the ETAs I expect.

Thank you

danpat commented 4 years ago

The hot-swap mechanism built into OSRM is to use osrm-datastore to load data into shared memory, then run osrm-routed -s to read from shared memory instead of files on disk.

This is difficult to do using Docker, although it is possible. See this StackOverflow summary of the IPC flags to docker run: https://stackoverflow.com/a/37716609

The recipe would be something like this:


wget http://download.geofabrik.de/europe/france/ile-de-france-latest.osm.pbf

docker run -t -v "${PWD}:/data" osrm/osrm-backend osrm-extract -p /opt/car.lua /data/ile-de-france-latest.osm.pbf

docker run -t -v "${PWD}:/data" osrm/osrm-backend osrm-partition /data/ile-de-france-latest.osrm

docker run -t -v "${PWD}:/data" osrm/osrm-backend osrm-customize /data/ile-de-france-latest.osrm

# Load the initial data into a shared memory segment on the host (outside the docker container)
docker run --ipc=host -v "${PWD}:/data" osrm/osrm-backend osrm-datastore /data/ile-de-france-latest.osrm

# Start up osrm-routed and attach it to shared memory rather than the files
docker run -d -t -i -p 1000:5000 --ipc=host osrm/osrm-backend osrm-routed --algorithm mld -s

# Now start your traffic loop
while /bin/true ; do
  sleep 300 # Sleep for 5 minutes
  wget http://somewhere/traffic.csv # EXAMPLE for fetching latest traffic data
  # Update the base files with new traffic data
  docker run -t -v "${PWD}:/data" osrm/osrm-backend osrm-customize /data/ile-de-france-latest.osrm --segment-speed-file /data/traffic.csv
  # Load new data into the shared memory region
  docker run --ipc=host -v "${PWD}:/data" osrm/osrm-backend osrm-datastore /data/ile-de-france-latest.osrm
done

The trick is the use of osrm-datastore, osrm-routed -s, and the --ipc=host flag to docker run.

A completely different approach would be to run a reverse proxy, say nginx, in front of osrm-routed, and use its "load balancer" feature to mark backends as healthy/unhealthy and seamlessly swap between them: https://docs.nginx.com/nginx/admin-guide/load-balancer/http-load-balancer/

This would still require some starting/stopping of osrm-routed, but if you have nginx in front, you can do it without losing any requests.
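A minimal sketch of that nginx setup, assuming two osrm-routed instances on ports 5001 and 5002 (the ports and failover settings are illustrative, not something OSRM prescribes):

upstream osrm_backends {
    server 127.0.0.1:5001 max_fails=1 fail_timeout=10s;
    server 127.0.0.1:5002 max_fails=1 fail_timeout=10s backup;
}

server {
    listen 1000;
    location / {
        proxy_pass http://osrm_backends;
        proxy_next_upstream error timeout http_502;
    }
}

You would restart the idle instance with fresh data, reload nginx to flip which server is marked backup, and in-flight requests get retried against the healthy backend.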

abdelhakimbendjabeur commented 4 years ago

Thank you for the answer @danpat

wget http://download.geofabrik.de/europe/france/ile-de-france-latest.osm.pbf
docker run --ipc host -v "${PWD}:/data" osrm/osrm-backend osrm-extract -p /opt/car.lua /data/ile-de-france-latest.osm.pbf
docker run --ipc host -v "${PWD}:/data" osrm/osrm-backend osrm-partition /data/ile-de-france-latest.osrm
docker run --ipc host -v "${PWD}:/data" osrm/osrm-backend osrm-customize /data/ile-de-france-latest.osrm

The last command fails with:

[warn] could not lock shared memory to RAM
terminate called after throwing an instance of 'osrm::util::exception'
  what():  lock file does not exist, exitinginclude/storage/shared_memory.hpp:298

Am I missing something?

And yes, I think we can manage to do this using nginx.

Thanks again

danpat commented 4 years ago

@abdelhakimbendjabeur Hmm, lock file does not exist - from memory, osrm-datastore and osrm-routed expect to find a file in /var/lock or /var/run - you might need to mount that volume from the host filesystem as well.

I've never actually used osrm-datastore and Docker like this. It should work in theory, but I guess this is one of the gotchas.

The traffic-reloading system was designed and built before Docker existed; it's designed to run on a non-containerized host operating system.

nhaskins01 commented 4 years ago

I am experiencing this issue as well. However, by the time it hits, my service has been up and running for a couple of weeks, swapping out traffic without any issues - pretty much the above script with a little more customization around meshing together speed files.

It uses AWS ECS with an EC2 instance. The instance is a c5n.4xlarge.

I am using 2 tasks:

osrm-service-production (1 CPU, 3 GB, only mounting the /tmp directory) - this task runs osrm-routed -s --algorithm mld, which connects to the worker task after it's started.

osrm-worker-production (8 CPU, 15 GB, only mounting the /tmp directory) - this task downloads the new data every 5 minutes into one of two directories, depending on which one is actively being used in the datastore, then runs:

(set -x; osrm-customize $currentDir/data/texas-latest.osrm $filenames)
osrm-datastore $currentDir/data/texas-latest.osrm
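A hypothetical sketch of that alternating-directory loop (the directory layout matches the log below, but the traffic URL and variable names are assumptions, not the actual production script):

ACTIVE=primary
while true; do
  if [ "$ACTIVE" = primary ]; then STANDBY=secondary; else STANDBY=primary; fi
  # Fetch fresh traffic data into the standby directory
  wget -O /data/worker/$STANDBY/traffic.csv http://somewhere/traffic.csv
  # Rebuild the standby dataset with the new speeds, then publish it to shared memory
  osrm-customize /data/worker/$STANDBY/data/texas-latest.osrm --segment-speed-file /data/worker/$STANDBY/traffic.csv
  osrm-datastore /data/worker/$STANDBY/data/texas-latest.osrm
  ACTIVE=$STANDBY
  sleep 300 # Sleep for 5 minutes
done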

About once a week, it throws the error below, which crashes the container. A new task comes online, it starts up the datastore, then I just restart the service tasks and everything is back to normal.
02:31:30 + osrm-datastore /data/worker/primary/data/texas-latest.osrm
02:31:30 [info] Data layout has a size of 2207 bytes
02:31:30 [info] Allocating shared memory of 556915882 bytes
02:31:30 [info] Data layout has a size of 2248 bytes
02:31:30 [info] Allocating shared memory of 2089789517 bytes
02:31:32 terminate called after throwing an instance of 'osrm::util::exception'
02:31:32 what(): lock file does not exist, exitinginclude/storage/shared_memory.hpp:298

Maybe the issue is that I'm not mounting /var/lock or /var/run at all. It would be odd for it to run without any issues for about a week, though. I have verified that the traffic updates are taking place and are reflected in the osrm-routed service.

Some additional settings on the ec2 instance.

[ec2-user]$ ipcs -l --human

------ Messages Limits --------
max queues system wide = 32000
max size of message = 8K
default max size of queue = 16K

------ Shared Memory Limits --------
max number of segments = 4096
max seg size = 9.3G
max total shared memory = 38.2G
min seg size = 1B

------ Semaphore Limits --------
max number of arrays = 32000
max semaphores per array = 32000
max semaphores system wide = 1024000000
max ops per semop call = 500
semaphore max value = 32767

[ec2-user]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs         21G     0   21G   0% /dev
tmpfs            21G  652K   21G   1% /dev/shm
tmpfs            21G  1.9M   21G   1% /run
tmpfs            21G     0   21G   0% /sys/fs/cgroup
/dev/nvme0n1p1  2.9T   14G  2.9T   1% /
tmpfs           4.1G     0  4.1G   0% /run/user/1000

[ec2-user]$ free -m
              total        used        free      shared  buff/cache   available
Mem:          41238        1405       23812        5050       16020       34261
Swap:             0           0           0

[ec2-user]$ ulimit -a|grep max
max locked memory       (kbytes, -l) 10000000
max memory size         (kbytes, -m) unlimited
max user processes              (-u) 4096

[ec2-user]$ cat /etc/security/limits.conf | grep memlock
#        - memlock - max locked-in-memory address space (KB)
*           hard    memlock         unlimited
*           soft    memlock         10000000

I am going to continue tinkering and try to get to the bottom of this, as I almost have it stable in a containerized implementation. I have gone over almost all the documentation, but there isn't a lot of information on how the datastore works under the hood. However, I noticed in the code that it appears to use the temp directory, which is why I originally mounted it.

johnjinwoua commented 4 years ago

@abdelhakimbendjabeur have you been able to solve your problem? I am having some issues with the datastore. When I launch osrm-routed -s I get this error message:

terminate called after throwing an instance of 'osrm::util::exception' what(): No shared memory block 'osrm-region' found, have you forgotten to run osrm-datastore?include/storage/shared_monitor.hpp:83

Could someone help? @danpat

Thanks

nhaskins01 commented 4 years ago

@johnjinwoua you need to start osrm-datastore before you start osrm-routed. If you start them in the reverse order, your osrm-routed instance will have nothing to connect to.

Are you running on an instance or is this a containerized application?

If using Docker, you will also need to use the --ipc=host setting and mount /tmp from the host for both the osrm-datastore and osrm-routed containers, as sketched below.
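A sketch of those two docker run invocations (assuming the dataset is in the current directory and the lock files live under the host's /tmp):

# Publish the dataset into host shared memory first
docker run --ipc=host -v "${PWD}:/data" -v /tmp:/tmp osrm/osrm-backend osrm-datastore /data/ile-de-france-latest.osrm

# Then attach osrm-routed to the shared memory region
docker run -d --ipc=host -v /tmp:/tmp -p 1000:5000 osrm/osrm-backend osrm-routed --algorithm mld -s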

IqbalLx commented 2 years ago

Hi all, maybe it's not directly related to the main topic, but I am wondering where I can get the latest traffic data. Do we provide it ourselves, or are there services out there that provide it? Thanks

github-actions[bot] commented 3 weeks ago

This issue seems to be stale. It will be closed in 30 days if no further activity occurs.