Project-OSRM / osrm-backend

Open Source Routing Machine - C++ backend
http://map.project-osrm.org
BSD 2-Clause "Simplified" License

Question re osrm-contract processing time (planet) #4389

Closed neilRGS closed 7 years ago

neilRGS commented 7 years ago

Hi, I have an EC2 instance spun up, of type i3.16xlarge (64 vCPUs and 488GB RAM). I have a 1TB stxxl file, just to be on the safe side, and a 1TB HDD.

I started osrm-extract yesterday (Tuesday) at 10am. The extract took approx 12 hrs, which I believe to be about right.

I started the osrm-contract process about 10 mins after the extract had completed. It has been running for approx 15.5 hours now, and has reached the stage saying it is "preprocessing 428071537 nodes" ...

cmdline output attached: image

Does this look normal? If so, bearing in mind the 15+ hrs to get to this stage, what might the remaining time be?

Many thanks in advance,

Neil.

daniel-j-h commented 7 years ago

With 488 GB RAM you don't need stxxl at all. If you're using 5.9, 5.10 or master you should get a non-stxxl build by default.

neilRGS commented 7 years ago

Thanks Daniel. I think I'm using 5.8

Were I to upgrade it to the latest version, would I have to do the extract again? As you can imagine, that EC2 costs about $11/hr, so I am wary about stacking up my client's bill.

All the best,

Neil.

daniel-j-h commented 7 years ago

Yep you need to extract again. We only provide dataset compatibility on a patch release basis (think: 5.8.0 vs 5.8.1). What you should probably do is store the generated osrm files on s3.

Your contraction seems to take quite a while. Check CPU (htop) and I/O (iotop); something looks off.

neilRGS commented 7 years ago

It's running on Windows and there's little to no disk activity. osrm-contract is using 58MB, and it actually looks like it may have halted.

N.

neilRGS commented 7 years ago

Here's the resource monitor window: image

N.

daniel-j-h commented 7 years ago

Hm, that does not look good. I don't have much experience with Windows, but can you attach a debugger and check where it is hanging? I recommend spinning up a Linux box, running on master, and storing the osrm files on s3.

neilRGS commented 7 years ago

I think I will have to do that. The i3.16xlarge comes with a ton of free, temporary storage, which I will use for the files - I will be moving them to a dedi box we have, when done, and then stopping the EC2.

What's the normal sort of time to complete, using a box with enough RAM and storage space? I know it will depend largely on the machine, but a general idea will do, as I'll need to tell the boss something. :)

daniel-j-h commented 7 years ago

The demo server does daily processing - if you're on master and have enough RAM you should be able to speed the processing up a bit by not using stxxl. Max. 24 hours I would say as an upper bound.

neilRGS commented 7 years ago

Thanks Daniel. That's helpful info.

I'll get cracking.

I haven't looked at the way OSRM will compile on Linux yet, but would a Docker image be a better option? Is there a Docker image yet for v5.10? If not, I may compile OSRM locally, then uplift it to the Linux EC2 I'll need to create.

daniel-j-h commented 7 years ago

Yep, we automatically build Docker images for each release and for latest (master). There might be some overhead going through Docker, so I would either use the pre-built binaries we ship with the Node.js bindings:

npm install osrm
ls node_modules/osrm/lib/binding/osrm-*

or compile master on your own

https://github.com/Project-OSRM/osrm-backend/wiki/Building-on-Ubuntu

or use mason for dependency management:

cmake .. -DCMAKE_BUILD_TYPE=Release -DENABLE_MASON=On

neilRGS commented 7 years ago

Thanks Daniel.

I'll try compiling master first.

Many thanks,

Neil.

neilRGS commented 7 years ago

Hi Daniel. Just a quick check, as there isn't anything which tells me what to expect to see from this latest version of osrm-extract etc. I have successfully built osrm and have run the osrm-extract command. I see the following output in my console:

[info] Using script profiles/foot.lua
[info] Input file: planet-latest.osm.pbf
[info] Profile: foot.lua
[info] Threads: 64
[info] Parsing in progress..
[info] Using profile api version 2
[info] input file generated by planet-dump-ng 1.1.4
[info] timestamp: 2017-08-07T01:59:59Z
[info] Using profile api version 2

The last line is repeated a good number of times and it seems to have settled. I'll open another shell session to monitor CPU and IO, but initially, does what I am seeing look like what it should be doing?

Many thanks,

Neil.

daniel-j-h commented 7 years ago

Here's the memory profile for extract https://github.com/Project-OSRM/osrm-backend/issues/4288

The Using profile.. message is repeated numberOfThreads times because we're using multiple Lua VMs (one per thread) to pass OSM objects from the pbf file through the profiles (e.g. car.lua).

neilRGS commented 7 years ago

Great, thanks. That makes sense as I have 64 repetitions. :) iotop is showing numbers, so all looking good so far :)

neilRGS commented 7 years ago

Hi Daniel. I hope you are well.

I have extracted and contracted and am now left with the set of files listed below. Which of them are actually required for use with osrm-routed please?


42G Aug 10 13:29 planet-latest.osrm
11G Aug 10 13:29 planet-latest.osrm.cnbg
3.2G Aug 10 12:52 planet-latest.osrm.cnbg_to_ebg
65K Aug 10 13:46 planet-latest.osrm.datasource_names
17G Aug 10 13:42 planet-latest.osrm.ebg
5.6G Aug 10 13:42 planet-latest.osrm.ebg_nodes
5.8G Aug 10 13:27 planet-latest.osrm.edges
1.6G Aug 10 13:37 planet-latest.osrm.enw
20G Aug 10 13:42 planet-latest.osrm.fileIndex
21G Aug 10 13:28 planet-latest.osrm.geometry
3.7G Aug 10 13:28 planet-latest.osrm.icd
155M Aug 10 12:07 planet-latest.osrm.names
12G Aug 10 13:42 planet-latest.osrm.nbg_nodes
2.3K Aug 10 12:07 planet-latest.osrm.properties
79M Aug 10 13:42 planet-latest.osrm.ramIndex
16 Aug 10 12:07 planet-latest.osrm.restrictions
28 Aug 10 12:07 planet-latest.osrm.timestamp
16 Aug 10 13:27 planet-latest.osrm.tld
36 Aug 10 13:27 planet-latest.osrm.tls
1.7G Aug 10 13:27 planet-latest.osrm.turn_duration_penalties
9.9G Aug 10 13:27 planet-latest.osrm.turn_penalties_index
1.7G Aug 10 13:27 planet-latest.osrm.turn_weight_penalties

Many thanks,

Neil.

neilRGS commented 7 years ago

To clarify my question a little more, I have built for the foot profile, so are the turn_duration, turn_penalties and turn_weight_penalties files, totalling 13.3GB actually required?

Many thanks,

Neil.

neilRGS commented 7 years ago

...And one final question, which I haven't been able to find an answer to, yet:

If I run osrm-routed on different ports (for different profiles), how should my endpoints look?

For example:

foot: osrm-routed -p 5000 E:\OSM\Planet\foot\planet.osrm
bicycle: osrm-routed -p 5001 E:\OSM\Planet\bicycle\planet.osrm

Do the HTTP endpoints follow the ports? E.g.:

foot: http://routing.rgsit.com:5000/route/v1/foot/13.388860,52.517037;13.385983,52.496891?steps=true
bicycle: http://routing.rgsit.com:5001/route/v1/bicycle/13.388860,52.517037;13.385983,52.496891?steps=true

Many thanks,

Neil.

danpat commented 7 years ago

The [profile] name in the URL is actually ignored - it's only included to make it easier to stick osrm-routed behind a reverse proxy. One osrm-routed instance will serve routes from whatever datafile you give it, regardless of the [profile] name in the URL.
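To make the point above concrete, here is a minimal sketch of a client-side helper that picks the port per dataset and still fills in the profile segment for readability. The `route_url` function, the `PORTS` mapping, and the `localhost` host are illustrative assumptions, not part of OSRM; only the `/route/v1/{profile}/{coordinates}` URL shape and the two ports are taken from the question above.

```python
from urllib.parse import urlencode


def route_url(host, port, profile, coords, **options):
    """Build a route request URL for an osrm-routed instance.

    coords is an iterable of (lon, lat) pairs. The profile segment is
    kept for readability (and for reverse proxies); which dataset is
    served depends on which osrm-routed instance, i.e. which port, the
    request reaches.
    """
    path = ";".join(f"{lon:.6f},{lat:.6f}" for lon, lat in coords)
    query = f"?{urlencode(options)}" if options else ""
    return f"http://{host}:{port}/route/v1/{profile}/{path}{query}"


# Hypothetical port assignment mirroring the question: one instance per dataset.
PORTS = {"foot": 5000, "bicycle": 5001}

coords = [(13.388860, 52.517037), (13.385983, 52.496891)]
for profile, port in PORTS.items():
    print(route_url("localhost", port, profile, coords, steps="true"))
```

With this layout, sending a `bicycle` URL to port 5000 would still return foot routes, because that instance was started with the foot dataset.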

The .turn_weight_penalties and .turn_duration_penalties files are currently required, the .turn_penalties_index is not.

daniel-j-h commented 7 years ago

@neilRGS here's the file list, hope that helps:

https://github.com/Project-OSRM/osrm-backend/wiki/Toolchain-file-overview

daniel-j-h commented 7 years ago

Closing here - feel free to re-open or comment if there's anything left we can help you with.

sivatharan commented 6 years ago

How do I find OSM nodes? https://github.com/Project-OSRM/osrm-backend/wiki/Traffic

daniel-j-h commented 6 years ago

Use osmium / pyosmium to walk over OpenStreetMap. Then you have access to all node and way attributes, such as ids, locations and tags. Good luck.
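As a rough illustration of the pyosmium suggestion, the sketch below walks a file and collects the node ID pairs along each way tagged `highway`. The helper names (`consecutive_pairs`, `collect_segments`) and the choice to filter on the `highway` tag are assumptions made for this example; pyosmium is a third-party package (`pip install osmium`) and is imported lazily so the pure-Python helper can be used without it.

```python
def consecutive_pairs(node_refs):
    """Return (from_node, to_node) pairs for neighbouring nodes along a way."""
    refs = list(node_refs)
    return list(zip(refs, refs[1:]))


def collect_segments(pbf_path):
    """Walk an OSM file and return {way_id: [(from_node, to_node), ...]}.

    Only ways carrying a 'highway' tag are kept (an assumption made for
    this sketch; adjust the filter to your needs).
    """
    import osmium  # third-party, imported here so the helper above works without it

    class HighwayHandler(osmium.SimpleHandler):
        def __init__(self):
            super().__init__()
            self.segments = {}

        def way(self, w):
            # Copy plain ints out of the callback; pyosmium objects are
            # only valid while the callback is running.
            if "highway" in w.tags:
                self.segments[w.id] = consecutive_pairs(n.ref for n in w.nodes)

    handler = HighwayHandler()
    handler.apply_file(pbf_path)
    return handler.segments
```

Calling `collect_segments` on a small regional extract first is advisable: holding every highway segment of the planet file in a Python dict would need a great deal of memory.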

sivatharan commented 6 years ago

Thanks for your reply. I have another question. Let's say I have traffic data in my PostgreSQL database (a speed limit for each road location, given as lat and long). How can I incorporate that into OSRM traffic? Also, would it be possible to have a conference call with you? (We're really looking for an expert in OSRM for our application development.)

daniel-j-h commented 6 years ago

https://github.com/Project-OSRM/osrm-backend/wiki/Traffic
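For readers following that link: as far as I understand the Traffic wiki page, speed updates are supplied as a CSV of `from_osm_node_id,to_osm_node_id,speed_in_km_h` lines and passed to `osrm-contract` via `--segment-speed-file`. The sketch below writes such a file from rows that have already been matched to OSM node ID pairs; that matching step (from lat/long to node IDs) is assumed to be done elsewhere, and the function name and the rounding to whole km/h are choices made for this example.

```python
import csv
import io


def write_segment_speeds(rows, out):
    """Write (from_node, to_node, speed_kmh) rows as CSV lines to `out`.

    rows could come from a database query; speeds are rounded to whole
    km/h as a conservative choice for this sketch.
    Returns the number of lines written.
    """
    writer = csv.writer(out, lineterminator="\n")
    count = 0
    for from_node, to_node, speed_kmh in rows:
        if speed_kmh < 0:
            raise ValueError(f"negative speed for segment {from_node}->{to_node}")
        writer.writerow([from_node, to_node, int(round(speed_kmh))])
        count += 1
    return count


# Made-up node IDs and speeds, purely for illustration.
buffer = io.StringIO()
write_segment_speeds([(1001, 1002, 30.4), (1002, 1001, 27)], buffer)
print(buffer.getvalue(), end="")
```

Each direction of a segment gets its own line, so a two-way road needs both `(a, b)` and `(b, a)` entries if both directions should be updated.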

gopal-mp commented 1 month ago

Hi guys, I need help setting up the OSRM backend on Windows Server 2022. I ran the extraction with 64 GB RAM, an 8-core processor, and 260 GB of virtual memory, but in the contract step I get an "insufficient memory" error. I then scaled the system up (128 GB RAM, 16-core processor) but still get the same error. I am puzzled that the extraction succeeded but the contract process does not.

Your help would be really appreciated.