opentripplanner / OpenTripPlanner

An open source multi-modal trip planner
http://www.opentripplanner.org

OTP 0.19.x: stop linking is much slower and more memory intensive than in 0.18.x #2170

Closed kalon33 closed 3 years ago

kalon33 commented 8 years ago

When building a graph with OTP 0.19.x instead of OTP 0.18.x, with the same data and parameters, OTP needs more memory (I have to allow at least 7 GB of RAM instead of 6 GB) and is much slower (from 120 minutes under 0.18.x to approximately 6 hours, since it also saturates my 8 GB of RAM). I also observe a bigger graph (it seems to double in size).

Is that expected?

As a workaround, I have enabled the option to use the transfers.txt file, but it is not available in all the GTFS datasets I use, and where it is available its coverage is not perfect, so I would prefer to use OTP's own linking system.

abyrd commented 8 years ago

Six hour build times are never expected. Even 120 minutes is extremely high. That kind of very slow behavior is usually due to memory pressure (the JVM spending too much time on garbage collection). It is possible (though not noticed until now) that 0.19 requires more memory than 0.18, and if it does not have the necessary amount of memory it will become very slow.
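One way to check whether memory pressure is the cause is to ask the JVM how much of its uptime has gone into garbage collection. The following is only a minimal sketch using the standard java.lang.management API; the class GcOverhead and its methods are hypothetical names and are not part of OTP.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of OTP: reports how much of the JVM's uptime
// has been spent in garbage collection so far.
class GcOverhead {

    // Fraction of elapsed time spent in GC; 0.0 when no time has elapsed.
    static double fraction(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / uptimeMillis;
    }

    // Sums the accumulated collection time over all collectors of this JVM.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if undefined for this collector
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.printf("max heap: %d MB, GC time: %d ms of %d ms uptime (%.1f%%)%n",
                maxHeapMb, gc, uptime, 100.0 * fraction(gc, uptime));
    }
}
```

Run standalone, this only describes its own JVM, so to observe a graph build the same calls would have to run inside the build process. Starting the build JVM with -verbose:gc is a lighter-weight alternative that prints each collection as it happens.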

If the graph is twice as big something seems to be amiss though. Have you done this comparison in a controlled experiment where all inputs are exactly the same? The only thing changing is the version of OTP?

kalon33 commented 8 years ago

Yes, the only thing that changes between the two conditions is the OTP version. 0.18.x doesn't seem to saturate memory at any time (it uses half of the 6 GB I allocate to it, but uses one of my 2.4 GHz cores at 100%), whereas 0.19.x does. Graph size is 2.8 GB on 0.19.x (for the Paris region in France), which is pretty memory intensive even when just using the graph (and too much for my server), whereas it is only 521 MB on 0.18.x.

I already had to disable matchBusRoutesToStreets in 0.18.x because it is pretty memory intensive too (graph building is also slower) and has a big impact on graph size, unfortunately.

On 0.18.x I tested again with my latest batch of data and here is what I get at linking:

13:09:37.975 INFO (DirectTransferGenerator.java:62) Creating direct transfer edges between stops using the street network from OSM...
Nov 24, 2015 1:57:47 PM org.geotools.referencing.factory.DeferredAuthorityFactory disposeBackingStore
INFO: Disposing class org.geotools.referencing.factory.epsg.ThreadedHsqlEpsgFactory backing store
16:50:14.244 INFO (DirectTransferGenerator.java:102) Done connecting stops to one another. Created a total of 355718 transfers from 41554 stops.
16:50:14.319 INFO (Graph.java:938) Summary (number of each type of annotation):
16:50:14.371 INFO (Graph.java:944) TurnRestrictionBad - 63
16:50:14.371 INFO (Graph.java:944) StopUnlinked - 685
16:50:14.372 INFO (Graph.java:944) TurnRestrictionException - 1
16:50:14.372 INFO (Graph.java:944) StopNotLinkedForTransfers - 1244
16:50:14.372 INFO (Graph.java:944) TurnRestrictionUnknown - 11
16:50:14.372 INFO (Graph.java:944) LevelAmbiguous - 804
16:50:14.372 INFO (Graph.java:944) ParkAndRideUnlinked - 8
16:50:14.372 INFO (Graph.java:944) RepeatedStops - 2559
16:50:14.372 INFO (Graph.java:944) GraphConnectivity - 5223
16:50:14.372 INFO (Graph.java:944) HopZeroTime - 293692
16:50:14.372 INFO (Graph.java:944) Graphwide - 1
16:50:14.372 INFO (Graph.java:944) HopSpeedFast - 984
16:50:14.372 INFO (Graph.java:944) HopSpeedSlow - 2517
16:50:14.637 INFO (Graph.java:794) Main graph size: |V|=1114132 |E|=2925689
16:50:14.637 INFO (Graph.java:795) Writing graph /var/www/transports/temp/OpenTripPlanner/pdx/paris/Graph.obj ...
16:51:01.858 INFO (Graph.java:833) Graph written.
16:51:01.901 INFO (GraphBuilder.java:174) Graph building took 235.4 minutes.
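The timestamps in the log above are enough to see how much of the build goes into the transfer stage. Below is a small sketch of that arithmetic. It assumes both timestamps fall on the same day, and the class TransferStageTiming is a hypothetical name used only for illustration.

```java
import java.time.Duration;
import java.time.LocalTime;

// Hypothetical helper, not part of OTP: measures the time between two
// log timestamps of the form HH:mm:ss.SSS taken on the same day.
class TransferStageTiming {

    static Duration between(String start, String end) {
        return Duration.between(LocalTime.parse(start), LocalTime.parse(end));
    }

    static double minutes(Duration d) {
        return d.toMillis() / 60000.0;
    }

    public static void main(String[] args) {
        // Values copied from the log: start and end of the transfer stage.
        Duration stage = between("13:09:37.975", "16:50:14.244");
        double stageMinutes = minutes(stage);
        double totalMinutes = 235.4; // "Graph building took 235.4 minutes."
        long transfers = 355718;
        long stops = 41554;

        System.out.printf("transfer stage: %.1f min (%.0f%% of the %.1f min build)%n",
                stageMinutes, 100.0 * stageMinutes / totalMinutes, totalMinutes);
        System.out.printf("transfers per stop: %.2f%n", (double) transfers / stops);
    }
}
```

With these values the transfer stage comes to roughly 3 hours 40 minutes, which is the large majority of the 235.4-minute build.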

Any ideas to solve these problems or improve the situation?

Thanks for your help.

Nicolas.


kalon33 commented 8 years ago

After another test with the latest data to confirm, I get a 578 MB graph with OTP 0.18.x and a 2.8 GB graph with OTP 0.19.x, with the same batch of data and parameters.

That's a bit weird to have such a difference with no parameter change, isn't it?

abyrd commented 8 years ago

Yes, it seems very suspect to me. Please let us know all the input files you are using so we can reproduce the situation. Is this the current STIF GTFS with Metro Extracts Ile-de-France for example?

kalon33 commented 8 years ago

@abyrd It is the Île-de-France PBF extract from http://download.geofabrik.de/europe/france/ile-de-france-latest.osm.pbf

With these GTFS files:

http://opendata.stif.info/explore/dataset/offre-horaires-tc-gtfs-idf/files/26ad996caedfa64338a59286d81ad797/download/

http://medias.sncf.com/sncfcom/open-data/gtfs/export-INTERCITES-GTFS-LAST.zip

http://medias.sncf.com/sncfcom/open-data/gtfs/export-TER-GTFS-LAST.zip

https://api.idbus.com/gtfs.zip

// build-config.json
{
// matchBusRoutesToStreets: true,
parentStopLinking: true,
subwayAccessTime: 3.0
// useTransfersTxt: true
}

Do you get the same result on your side? (I fixed a typo in my previous comment: graph size was 578 MB with 0.18 and 2.8 GB with 0.19, which makes it unusable on my server and painful to build.)

Thanks for your help.

abyrd commented 8 years ago

You're using the Intercités and TER feeds that cover all of France, so maybe the problem has to do with the handling of routes outside the PBF's geographic region. Can you try with just the STIF feed and see if the results are less crazy?

kalon33 commented 8 years ago

@abyrd Unfortunately, a 2.8 GB graph is not less crazy... with a nearly 10-hour build... :( (with only the PBF file and the STIF feed, under version 0.19)

abyrd commented 8 years ago

Even on 0.18.x you're reporting almost 4 hours finding transfers. This seems completely unacceptable to me. As is the 5x increase in serialized graph size. Even if it's storing more information about the transfers for instance, it only reports "355718 transfers from 41554 stops" which should not account for several gigabytes of additional storage. I agree, this needs to be addressed before the next release.
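A back-of-the-envelope check makes the same point with the reported numbers. The sketch below assumes that "Gb" and "Mb" in this thread mean binary gigabytes and megabytes, and that the whole size difference is attributed to transfers, which is a deliberately pessimistic assumption. The class GraphSizeEstimate is a hypothetical name.

```java
// Hypothetical helper, not part of OTP: rough arithmetic on the reported
// serialized graph sizes.
class GraphSizeEstimate {

    static final long BYTES_PER_MB = 1024L * 1024L;

    // How many times larger the big graph is than the small one.
    static double ratio(double largeMb, double smallMb) {
        return largeMb / smallMb;
    }

    // Extra bytes per transfer if the entire size difference were due to transfers.
    static double extraBytesPerTransfer(double largeMb, double smallMb, long transfers) {
        return (largeMb - smallMb) * BYTES_PER_MB / transfers;
    }

    public static void main(String[] args) {
        double largeMb = 2.8 * 1024; // 2.8 GB graph reported for 0.19.x
        double smallMb = 578;        // 578 MB graph reported for 0.18.x
        long transfers = 355718;     // from the 0.18.x build log

        System.out.printf("size ratio: %.2f%n", ratio(largeMb, smallMb));
        System.out.printf("extra bytes per transfer: %.0f%n",
                extraBytesPerTransfer(largeMb, smallMb, transfers));
    }
}
```

Under these assumptions each transfer would have to account for several kilobytes of additional serialized data, which is implausibly large for a single transfer.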

kalon33 commented 8 years ago

Please also note that I had a similar problem with the "matchBusRoutesToStreets: true" instruction. Thanks for working on this.

kalon33 commented 8 years ago

Hi @abyrd , do you have any news on this problem?

Thanks :)

abyrd commented 8 years ago

Hi @kalon33, no, I haven't worked on this one. I do consider it a potentially serious problem but I haven't gone through the process of replicating and observing the problem yet.

pieterbuzing commented 8 years ago

Hi @kalon33, I have tried to reproduce your results, but I failed. Just to verify that I'm doing everything correctly: the file ile-de-france-latest.osm.pbf is 226 MB and the STIF zip file is 53 MB. I've used your build-config file. When I go back to the tag "otp-0.18.0" it takes 6.6 minutes to create a graph of 511 MB. At the HEAD of master (but also at the 0.19 tag) it takes 6.7 minutes to create a graph of 544 MB. It even works when I run it with -Xmx4G (instead of 12G), though slightly slower.

So I'm not even close to your observations. Do you still see this behavior? Perhaps you can give more details about your system and maybe a profiler log.

Anyway, with a small code change in the SimpleStreetSplitter I managed to get the build time down to 4.0 minutes. I will look a bit further for more gains.

abyrd commented 3 years ago

The transit data model and router are completely rewritten in OTP2. Closing.