Project-OSRM / osrm-backend-docker

DEPRECATED Part of osrm-backend since 5.7. Docker build files for OSRM
BSD 2-Clause "Simplified" License

osrm-routed takes a long time to start accepting requests in this condition. #8

Open sreeramvuppala opened 7 years ago

sreeramvuppala commented 7 years ago

After running these 3 steps on north-america-latest.pbf:

  1. osrm-extract
  2. osrm-contract
  3. osrm-routed - it takes only a few seconds before it begins accepting requests.

I created an AMI with the extracted data. The problem appears when I create a new instance from the AMI. On that instance I only run osrm-routed, pointing it at the data that already exists in the AMI. It then takes more than two hours between these two steps in the logs:

  a. load names ....
  b. set checksum .... (after 2 hours)

This does not happen if the same instance is rebooted and the osrm-routed docker command is rerun; in that case there is no delay.

The goal is to bring up an instance automatically, which would help with autoscaling.

jessetarot commented 4 years ago

Duplicate of this issue: https://github.com/Project-OSRM/osrm-backend/issues/3407

EBS volumes which are restored from snapshots (including the volumes of an instance launched from an AMI) are, behind the scenes, "lazy-loaded" from S3. Therefore, the first read of each block is very slow, because the data comes over the network.

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-initialize.html

I'm also looking for an "acceptable" solution....

danpat commented 4 years ago

There aren't too many magic options here - OSRM needs lots of data in order to perform route calculation. The options are:

  1. Do a full read of the EBS volume with dd or some similar tool to fully "hydrate" it before use - this will ensure maximum performance, but startup is still delayed. It's also possible to pre-hydrate EBS volumes separately ahead of time if you know you'll need them for scaling purposes.
  2. If startup time is critical, but query performance is not so much, you can use the --mmap on option for osrm-routed - this will cause OSRM itself to "lazy load" data when queries arrive. Queries will be slower than normal until the filesystem cache has been warmed sufficiently for your use-case. A middle ground here is to start up with --mmap on, then perform a short "warmup routine" - startup is a little bit delayed, but not as much as waiting for full EBS volume hydration.
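
As a rough illustration of option 1, here is a minimal Python sketch that reads files from start to finish and throws the data away, so each block behind them gets fetched once. Note this is a variation on the dd approach, not the same thing: dd on the block device reads the whole volume, while this only reads the files whose names start with a given prefix, which should be enough if those are the only files osrm-routed opens. The function names, the chunk size, and the example path are all made up for illustration.

```python
from pathlib import Path

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice


def prefetch_file(path, chunk_size=CHUNK_SIZE):
    """Read one file sequentially, discard the data, return bytes read."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    return total


def prefetch_dataset(directory, prefix):
    """Read every regular file in `directory` whose name starts with `prefix`.

    Returns the total number of bytes read across all matching files.
    """
    total = 0
    for entry in sorted(Path(directory).iterdir()):
        if entry.is_file() and entry.name.startswith(prefix):
            total += prefetch_file(entry)
    return total
```

Something like prefetch_dataset("/data", "north-america-latest.osrm") would then run before osrm-routed is started (the directory and prefix here are hypothetical). It still has to pull the same bytes over the network, so it moves the delay rather than removing it.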

There's no free lunch here - OSRM needs the routing graph to be readable in order for it to calculate routes, and if you're routing on the planet, well, that's a lot of data you need to move around the network.

You can also do things like break up the map into geographically disconnected regions, and boot them as separate servers with smaller data and faster boot up times, but this requires writing a reverse proxy layer that knows how to direct queries appropriately to the right server.
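
The core decision such a proxy layer has to make can be sketched in a few lines of Python. This is only a sketch under assumptions: the region names, backend URLs, and bounding boxes below are invented placeholders, and a real deployment would need boxes that match how the extracts were actually cut. Coordinates are (longitude, latitude) pairs, the same order OSRM uses in request URLs.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """One regional OSRM server and the bounding box it covers."""
    name: str
    backend_url: str
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lon, lat):
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)


# Hypothetical regions; names, URLs and boxes are placeholders.
REGIONS = [
    Region("north-america", "http://osrm-na:5000", -170.0, 5.0, -50.0, 85.0),
    Region("australia", "http://osrm-au:5000", 110.0, -45.0, 155.0, -10.0),
]


def pick_backend(coords, regions=REGIONS):
    """Return the backend URL of the first region containing every
    (lon, lat) pair in `coords`, or None if no single region does."""
    if not coords:
        return None
    for region in regions:
        if all(region.contains(lon, lat) for lon, lat in coords):
            return region.backend_url
    return None
```

A request whose coordinates span two regions gets None here, which is the case the proxy would have to reject, since the regions are disconnected and no single server holds a graph that can answer it.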