opentripplanner / OpenTripPlanner

An open source multi-modal trip planner
http://www.opentripplanner.org
Other
2.12k stars 1.01k forks source link

Batch Analyst consumes all available memory on certain origins #1623

Closed abyrd closed 3 years ago

abyrd commented 9 years ago

Reported by Manuel Timita on the users mailing list:

I just discovered by chance that only some origins (really tough to identify exactly which ones) cause OTP to simply start clogging the RAM until the process is killed.

Since using Batch Analyst as described above didn't work, I tried a different approach. I simply ran the otp JAR with --server and --analyst switches, then iterate through all the points and scrape GeoJSON isochrones via the API, which I then intend to use for spatial intersection in QGIS or PostGIS. It works for the majority of points, but every now and then it stops and I can see the memory usage growing really quickly until it gets killed - symptom observed first when I ran the Batch Analyst. Example: it runs the first 2,000 points very quickly, with minimal memory usage (besides the memory required to load the graph, of course), then it stops and grows until the maximum RAM available is reached, and dies (the largest instance I tried had 90GB RAM). Removed the first couple hundred points from the CSV after those first 2,000 (the ones that worked), then scraping works again, but only for a while: it clogs the RAM and gets killed a few hundred origins later.

It looks like something related the actual coordinates (and, perhaps, the OSM data available nearby) causes this.

abyrd commented 9 years ago

also see #1620, potentially related.

manueltimita commented 9 years ago

As I mentioned in my last reply to the original thread here, I switched to two different, smaller, datasets and a smaller Graph object, to increase the chance of isolating the issue.

Server specs:


OTP branch 0.11.x / Hardware: 32GB RAM, 20GB SSD, 12 Cores.

The settings in otp-batch-analyst:


java -Xmx31G -cp `dirname $0`/otp-core/target/otp.jar org.opentripplanner.analyst.batch.BatchProcessor $1

The settings in batch-context.xml: see the file as text here

The graph has been built using the London OSM extract on geofabrik: more info and download here

The origins CSV file contains 1000 coordinate pairs: download the file here

The destination CSV file contains 2000 coordinate pairs: download the file here

I started the Batch Analyst with the command ./otp-batch-analyst ~/BatchAnalyst/batch-context.xml. It didn't take long until the memory leak could be seen in htop. Here's the output in the console from the start, until OTP got killed by the OS:


14:17:08.990 INFO (MavenVersion.java:53) Parsed Maven artifact version: 0.11.1-SNAPSHOT => MavenVersion(0, 11, 1, SNAPSHOT, f361fa176a6d7d750052bb34d505a6de78a0f35f) UID=-11001
14:17:09.012 INFO (Graph.java:558) Graph version: MavenVersion(0, 11, 1, SNAPSHOT, f361fa176a6d7d750052bb34d505a6de78a0f35f)
14:17:09.012 INFO (Graph.java:559) OTP version:   MavenVersion(0, 11, 1, SNAPSHOT, f361fa176a6d7d750052bb34d505a6de78a0f35f)
14:17:09.013 INFO (Graph.java:576) This graph was built with the currently running version and commit of OTP.
14:17:58.415 INFO (Graph.java:528) Main graph read. |V|=283067 |E|=752365
14:18:00.408 INFO (GraphUpdaterConfigurator.java:69) Using configurations:
14:18:02.528 INFO (BatchProcessor.java:224) linking population org.opentripplanner.analyst.batch.CSVPopulation@4664a2bd to the graph...
14:18:03.157 INFO (BatchProcessor.java:233) successfully linked 2000 individuals out of 2000
14:18:03.157 INFO (BatchProcessor.java:116) Number of threads: 12
14:18:03.291 INFO (BatchProcessor.java:143) created 1000 tasks.
14:18:05.968 INFO (BatchProcessor.java:178) received 1 results out of 1000
14:18:05.968 INFO (BatchProcessor.java:179) running 0 min, 46 min remaining (projected)
14:18:09.970 INFO (BatchProcessor.java:178) received 67 results out of 1000
14:18:09.970 INFO (BatchProcessor.java:179) running 0 min, 1 min remaining (projected)
14:18:14.011 INFO (BatchProcessor.java:178) received 160 results out of 1000
14:18:14.011 INFO (BatchProcessor.java:179) running 0 min, 0 min remaining (projected)
14:18:18.043 INFO (BatchProcessor.java:178) received 258 results out of 1000
14:18:18.043 INFO (BatchProcessor.java:179) running 0 min, 0 min remaining (projected)
14:18:22.104 INFO (BatchProcessor.java:178) received 355 results out of 1000
14:18:22.104 INFO (BatchProcessor.java:179) running 0 min, 0 min remaining (projected)
14:18:27.359 INFO (BatchProcessor.java:178) received 448 results out of 1000
14:18:27.360 INFO (BatchProcessor.java:179) running 0 min, 0 min remaining (projected)
14:18:31.461 INFO (BatchProcessor.java:178) received 521 results out of 1000
14:18:31.462 INFO (BatchProcessor.java:179) running 0 min, 0 min remaining (projected)
14:18:35.506 INFO (BatchProcessor.java:178) received 602 results out of 1000
14:18:35.506 INFO (BatchProcessor.java:179) running 0 min, 0 min remaining (projected)
14:18:43.063 INFO (BatchProcessor.java:178) received 623 results out of 1000
14:18:43.063 INFO (BatchProcessor.java:179) running 0 min, 0 min remaining (projected)
14:18:47.107 INFO (BatchProcessor.java:178) received 671 results out of 1000
14:18:47.108 INFO (BatchProcessor.java:179) running 0 min, 0 min remaining (projected)
14:18:51.125 INFO (BatchProcessor.java:178) received 700 results out of 1000
14:18:51.125 INFO (BatchProcessor.java:179) running 0 min, 0 min remaining (projected)
14:19:01.997 INFO (BatchProcessor.java:178) received 710 results out of 1000
14:19:01.998 INFO (BatchProcessor.java:179) running 0 min, 0 min remaining (projected)
14:19:10.866 INFO (BatchProcessor.java:178) received 724 results out of 1000
14:19:10.866 INFO (BatchProcessor.java:179) running 1 min, 0 min remaining (projected)
14:19:14.899 INFO (BatchProcessor.java:178) received 773 results out of 1000
14:19:14.899 INFO (BatchProcessor.java:179) running 1 min, 0 min remaining (projected)
14:19:26.466 INFO (BatchProcessor.java:178) received 795 results out of 1000
14:19:26.466 INFO (BatchProcessor.java:179) running 1 min, 0 min remaining (projected)
14:19:30.604 INFO (BatchProcessor.java:178) received 839 results out of 1000
14:19:30.604 INFO (BatchProcessor.java:179) running 1 min, 0 min remaining (projected)
14:19:47.771 INFO (BatchProcessor.java:178) received 873 results out of 1000
14:19:47.771 INFO (BatchProcessor.java:179) running 1 min, 0 min remaining (projected)
14:20:07.720 INFO (BatchProcessor.java:178) received 900 results out of 1000
14:20:07.720 INFO (BatchProcessor.java:179) running 2 min, 0 min remaining (projected)
14:20:11.799 INFO (BatchProcessor.java:178) received 942 results out of 1000
14:20:11.799 INFO (BatchProcessor.java:179) running 2 min, 0 min remaining (projected)
14:20:16.756 INFO (BatchProcessor.java:178) received 985 results out of 1000
14:20:16.756 INFO (BatchProcessor.java:179) running 2 min, 0 min remaining (projected)
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00007fce61880000, 2317352960, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 2317352960 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /root/OpenTripPlanner/hs_err_pid3400.log

The hs_err_pid3400.log: see the file here

Here's a screencap from htop, taken seconds before OS kills OTP: otp-htop

However, there is a difference compared to the attempt described by Andrew Byrd, above. When I attempted to scrape the isos in GeoJSON format, after starting the server with ./otp --server --analyst, I encountered no issues. I tried both the origins and the destinations files and I got the results back within seconds, without abnormal memory consumption (it stayed under 2.5GB).

abyrd commented 3 years ago

Analysis functionality removed from OTP2 and now developed in https://github.com/conveyal/r5