conveyal / otpa-cluster

Cluster backend for otpa many-to-many queries.

thin workers #44

Open mattwigway opened 9 years ago

mattwigway commented 9 years ago

If we are using repeated RAPTOR, we could have workers that just grab timetables from the graph, rather than building the entire graph. This would mean that we would not need to store state on the workers.

mattwigway commented 9 years ago

hat tip: @abyrd

abyrd commented 9 years ago

To elaborate a bit on the idea: if we're using RAPTOR and we assign integer IDs to all patterns and stops (rather than using a hybrid approach where the patterns and stops are object references) then the routing computation can be done "blindly" with no additional data about what the numbers being crunched actually mean. The workers wouldn't have a dependency on OTP at all, just a single class describing how to do the computation on big tables of numbers. This would greatly simplify distributed computation across many machines since we wouldn't need to send / build graphs on those machines, just some compressed representation of the relevant timetables.
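The "blind" computation described above can be sketched as a single RAPTOR round over pure integer tables. This is a minimal illustration only, not OTP's actual data layout: all class, field, and array names here are hypothetical, and real RAPTOR also needs transfers, multiple rounds, and per-round pruning.

```java
// Hedged sketch: one RAPTOR round over purely integer-indexed timetables.
// No object references, no OTP dependency -- just arrays of numbers.
import java.util.Arrays;

public class RaptorRoundSketch {
    static final int UNREACHED = Integer.MAX_VALUE;

    /**
     * One RAPTOR round. For each pattern, board the earliest feasible trip
     * and improve downstream arrival times.
     *
     * patternStops[p]    = stop IDs visited by pattern p, in order
     * timetable[p][t][s] = time (seconds) of trip t of pattern p at the
     *                      s-th stop in its stop sequence
     * bestTimes[stop]    = best known arrival time at each stop (in/out)
     */
    static void raptorRound(int[][] patternStops, int[][][] timetable, int[] bestTimes) {
        for (int p = 0; p < patternStops.length; p++) {
            int onTrip = -1; // trip we are currently riding, -1 = none
            for (int s = 0; s < patternStops[p].length; s++) {
                int stop = patternStops[p][s];
                // If riding, see if alighting here improves the arrival time.
                if (onTrip >= 0 && timetable[p][onTrip][s] < bestTimes[stop]) {
                    bestTimes[stop] = timetable[p][onTrip][s];
                }
                // Board (or switch to) the earliest trip departing at or
                // after our best arrival time at this stop.
                if (bestTimes[stop] != UNREACHED) {
                    for (int t = 0; t < timetable[p].length; t++) {
                        if (timetable[p][t][s] >= bestTimes[stop]
                                && (onTrip == -1 || t < onTrip)) {
                            onTrip = t;
                            break;
                        }
                    }
                }
            }
        }
    }

    public static void main(String[] args) {
        // One pattern visiting stops 0,1,2 with two trips at 100/150 s.
        int[][] patternStops = {{0, 1, 2}};
        int[][][] timetable = {{{100, 200, 300}, {150, 250, 350}}};
        int[] bestTimes = new int[3];
        Arrays.fill(bestTimes, UNREACHED);
        bestTimes[0] = 120; // reaching stop 0 at t=120 means only trip 1 is boardable
        raptorRound(patternStops, timetable, bestTimes);
        System.out.println(Arrays.toString(bestTimes)); // [120, 250, 350]
    }
}
```

Because the worker only touches `int` arrays, the entire routing kernel really is "a single class describing how to do the computation on big tables of numbers".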

abyrd commented 9 years ago

Initial implementation in https://github.com/opentripplanner/OpenTripPlanner/commit/eab34191f339620ae7a73e20ce579bfcaeb5003c. 60-minute time window true profile routing in midtown Manhattan takes ~2 sec. The serialized data chunk is 200 MB / 60 MB gzipped, which is strangely similar to the size of an entire Graph.
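The raw-vs-gzipped ratio mentioned here is easy to reproduce in miniature: timetable-like data is highly regular (e.g. fixed headways), so it compresses well. This is a hedged illustration with made-up names and data, not the actual serialization code from the commit.

```java
// Illustration: serialize a timetable-like int[] raw vs. gzipped and
// compare sizes. Regular departure times compress far better than random data.
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

public class TimetableSizeSketch {
    static byte[] serialize(Object o, boolean gzip) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        OutputStream out = gzip ? new GZIPOutputStream(bos) : bos;
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(o); // closing oos also finishes the gzip stream
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Departure times at 60 s headways: highly regular input.
        int[] times = new int[100_000];
        for (int i = 0; i < times.length; i++) times[i] = i * 60;
        int raw = serialize(times, false).length;
        int zipped = serialize(times, true).length;
        System.out.printf("raw: %d bytes, gzipped: %d bytes%n", raw, zipped);
    }
}
```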

mattwigway commented 9 years ago
How long does it take to load the data?


abyrd commented 9 years ago

Never tried. I just serialized it thinking it would be small and it isn't for some reason. There's a lot of redundant data though (bidirectional indexes etc.).

mattwigway commented 9 years ago

I'm not too concerned about file size, as we'll drop everything into S3 and retrieval is almost instantaneous. But if we can get serialization time down to be nearly negligible, we can rip out most of the executive (which manages which workers have which graph) and replace it with a standard queuing library, e.g. Amazon SQS.

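Once workers are stateless, "which worker has which graph" stops mattering and any idle worker can pull the next origin off a shared queue. The sketch below shows that pattern with an in-JVM `BlockingQueue` standing in for Amazon SQS; the task and result types are hypothetical placeholders.

```java
// Sketch of replacing the graph-affinity "executive" with a plain queue:
// any idle worker pulls the next origin task; no per-worker graph state.
// An in-memory BlockingQueue stands in for Amazon SQS here.
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class QueueWorkerSketch {
    public static Map<Integer, Integer> run(List<Integer> origins, int nWorkers)
            throws InterruptedException {
        BlockingQueue<Integer> tasks = new LinkedBlockingQueue<>(origins);
        Map<Integer, Integer> results = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(nWorkers);
        for (int w = 0; w < nWorkers; w++) {
            pool.submit(() -> {
                Integer origin;
                while ((origin = tasks.poll()) != null) {
                    // Stand-in for loading the timetable chunk and routing
                    // from this origin; a real worker would write to S3.
                    results.put(origin, origin * 2);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(Arrays.asList(1, 2, 3, 4), 2));
    }
}
```

With real SQS the loop shape is the same: receive a message, process it, delete it; visibility timeouts replace the in-process bookkeeping.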

abyrd commented 9 years ago

@mattwigway three seconds.

mattwigway commented 9 years ago
OK, this will facilitate a major rethinking and simplification of all this code. Awesome.


abyrd commented 9 years ago

It might get a little slower when some nuance is added to the routing (might), but we're starting at around 1-2.5 seconds for most of NY. This thing could run with no graph and no OTP jar, the data files are self-contained, and it seems to need only a few hundred MB of breathing room (RAM) when operating. It would be pretty straightforward to scale this to large numbers of workers. It would be realistic to get full multi-origin results for NY in, say, 6 minutes.

mattwigway commented 9 years ago

We could even use AWS Lambda.


abyrd commented 9 years ago

Not only could we, it looks perfect: it would eliminate a lot of special-purpose infrastructure, with no worker startup/shutdown to manage.

abyrd commented 9 years ago

The workers could just save their results and then we could do a gather phase at the end to turn it into one big result.
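The gather phase described here reduces to reassembling independently saved per-origin rows into one origins-by-destinations table. A minimal sketch, with a hypothetical keying scheme (in practice the rows would be objects pulled from S3, not an in-memory map):

```java
// Hedged sketch of the gather phase: each worker persists one travel-time
// row per origin; a final pass assembles them into a single table.
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class GatherSketch {
    /** Assemble per-origin rows (keyed by origin ID) into one table. */
    static int[][] gather(Map<Integer, int[]> perOriginRows, int nOrigins) {
        int[][] table = new int[nOrigins][];
        for (Map.Entry<Integer, int[]> e : perOriginRows.entrySet()) {
            table[e.getKey()] = e.getValue();
        }
        return table;
    }

    public static void main(String[] args) {
        // Two workers finished origins 1 and 0, in arbitrary order.
        Map<Integer, int[]> rows = new HashMap<>();
        rows.put(1, new int[]{30, 0});
        rows.put(0, new int[]{0, 30});
        int[][] table = gather(rows, 2);
        System.out.println(Arrays.deepToString(table)); // [[0, 30], [30, 0]]
    }
}
```

Because each row is written independently, workers never coordinate; ordering is imposed only once, at gather time.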

abyrd commented 9 years ago

Lambda seems to only support Node.js.

abyrd commented 9 years ago

I spoke too soon. After JIT kicks in the load time seems to be around a second.

mattwigway commented 9 years ago
The Lambda functions are written in JavaScript but can start processes in any programming language. The downside is that we can't cache the data between origins, but if loading is that fast maybe it doesn't matter.
