conveyal / analysis-backend

Server component of Conveyal Analysis
http://conveyal.com/analysis
MIT License
23 stars 12 forks source link

Track updated stops separately within and across departure minutes #276

Closed abyrd closed 3 years ago

abyrd commented 4 years ago

The scheduled search and upper bound search benefit from tracking which stops were updated within the current departure minute (or randomized schedule), rather than in the previously evaluated departure minutes. In fact, if we don't do this the range raptor optimization is ineffective. But frequency routes must look at the combined effects of all departure minutes - the frequency route's schedule will have changed in each search, but it may be reached by a scheduled ride that was found in a different departure minute.

This PR builds on #269, aiming to produce results identical to v5.10.1 (accounting for frequency search requirements described above) but regaining the range-raptor speedup. Limited speed testing was done on a medium-sized network, with a scenario adding frequency routes and 500 Monte Carlo draws. Enabling range raptor in this PR gives "only" a 30% speedup over using fresh state at each departure minute, but even without range raptor enabled this new branch is already 68% faster than v5.10.1. So total speedup over v5.10.1 is about 77%, while apparently maintaining identical results. Non-frequency results are obtained just as fast as v5.10.0, and high-draw frequency results are obtained in about 2.25x as long as v5.10.0 (but seem to match the results from v5.10.1, which takes over 4x as long as v5.10.0).

To confirm that results are equivalent, I have run regional analyses in large networks with many low frequency routes added in scenarios. Comparing single-point results between this PR and the tagged v5.10.1, there is practically no difference in travel times and accessibility curves, even at origin points that previously showed noticeable errors in v5.10.0.

Comparing regional analyses, the new version produces identical baseline results to v5.10.0 and v5.10.1. Regional scenario results with added frequency routes show differences in cumulative accessibility that are almost zero everywhere relative to v5.10.1, except in cells with only a few hundred accessible people, which are easily skewed by a few seconds' variation in randomized schedules.

abyrd commented 3 years ago

Merged since all recent changes were small refactors and documentation in response to @ansoncfit 's review.