Impact of walkReluctance on performance and destination pruning, general question on cost

slvlirnoff commented 2 years ago

I've been looking into adding a few performance test cases for Switzerland and I was investigating why one of my test case was significantly slower in a deployment compared to the speed test framework.

I've narrowed it down to a different walkReluctance value. I've seen that it has a lot of impact. See below search case 1-4 are almost not affected, these are rather direct searches with small agress/egress walking path. Search 5 necessitate a medium walk transfer between station in the middle and is relatively affected and search 6 has longer agress/egress edges and is strongly affected.

walkReluctance:
2:
Test case ids       : [  1   2   3   4    5    6]
Number of paths     : [  1   1   1   1    1    1]
Transit times(ms)   : [423 527 618 899 1388 2613]
Total times(ms)     : [429 533 631 902 1390 2616]
3
Test case ids       : [  1   2   3   4    5    6]
Number of paths     : [  1   1   1   1    1    1]
Transit times(ms)   : [416 537 623 908 1694 4870]
Total times(ms)     : [422 542 636 911 1696 4873]
4
Test case ids       : [  1   2   3   4    5    6]
Number of paths     : [  1   1   1   1    1    1]
Transit times(ms)   : [421 533 623 910 2050 6927]
Total times(ms)     : [427 538 636 913 2052 6931]
5
Test case ids       : [  1   2   3   4    5    6]
Number of paths     : [  1   1   1   1    1    1]
Transit times(ms)   : [414 524 619 905 2275 9982]
Total times(ms)     : [421 529 632 908 2277 9986]
6
Test case ids       : [  1   2   3   4    5     6]
Number of paths     : [  1   1   1   1    1     1]
Transit times(ms)   : [412 522 617 915 2498 13916]
Total times(ms)     : [418 527 630 918 2501 13922]
7
Test case ids       : [  1   2   3   4    5     6]
Number of paths     : [  1   1   1   1    1     1]
Transit times(ms)   : [411 530 617 923 2662 14104]
Total times(ms)     : [418 535 629 926 2665 14109]
8
Test case ids       : [  1   2   3   4    5     6]
Number of paths     : [  1   1   1   1    1     1]
Transit times(ms)   : [407 521 606 906 2873 15145]
Total times(ms)     : [413 526 618 909 2876 15149]
9
Test case ids       : [  1   2   3   4    5     6]
Number of paths     : [  1   1   1   1    1     1]
Transit times(ms)   : [414 535 621 955 3474 17129]
Total times(ms)     : [421 540 634 958 3477 17133]
10
Test case ids       : [  1   2   3   4    5     6]
Number of paths     : [  1   1   1   1    1     1]
Transit times(ms)   : [413 533 620 936 3485 18250]
Total times(ms)     : [419 538 633 939 3488 18256]
12
Test case ids       : [  1   2   3   4    5     6]
Number of paths     : [  1   1   1   1    1     1]
Transit times(ms)   : [418 526 615 938 3888 19430]
Total times(ms)     : [424 531 627 941 3892 19435]
14
Test case ids       : [  1   2   3   4    5     6]
Number of paths     : [  1   1   1   1    1     1]
Transit times(ms)   : [407 526 640 974 4489 23056]
Total times(ms)     : [413 531 656 977 4492 23061]
16
Test case ids       : [  1   2   3   4    5     6]
Number of paths     : [  1   1   1   1    1     1]
Transit times(ms)   : [407 533 631 973 4503 22852]
Total times(ms)     : [414 538 644 976 4506 22857]
18
Test case ids       : [  1   2   3    4    5     6]
Number of paths     : [  1   1   1    1    1     1]
Transit times(ms)   : [412 527 646 1021 4956 23769]
Total times(ms)     : [418 532 659 1025 4959 23774]
20
Test case ids       : [  1   2   3    4    5     6]
Number of paths     : [  1   1   1    1    1     1]
Transit times(ms)   : [408 528 641 1009 5271 26265]
Total times(ms)     : [415 534 654 1012 5274 26269]
24
Test case ids       : [  1   2   3    4    5     6]
Number of paths     : [  1   1   1    1    1     1]
Transit times(ms)   : [418 532 662 1035 5860 28914]
Total times(ms)     : [425 537 675 1038 5863 28919]
28
Test case ids       : [  1   2   3    4    5     6]
Number of paths     : [  1   1   1    1    1     1]
Transit times(ms)   : [418 547 700 1107 6469 38124]
Total times(ms)     : [424 552 713 1111 6472 38129]
32
Test case ids       : [  1   2   3    4    5     6]
Number of paths     : [  1   1   1    1    1     1]
Transit times(ms)   : [409 555 703 1115 6870 41832]
Total times(ms)     : [415 560 716 1119 6873 41837]
36
Test case ids       : [  1   2   3    4    5     6]
Number of paths     : [  1   1   1    1    1     1]
Transit times(ms)   : [404 548 703 1122 7140 46522]
Total times(ms)     : [410 553 716 1125 7143 46527]
40
Test case ids       : [  1   2   3    4    5     6]
Number of paths     : [  1   1   1    1    1     1]
Transit times(ms)   : [402 550 715 1131 8013 49220]
Total times(ms)     : [408 555 728 1134 8017 49225]

As you can see the duration explode quite quickly even for small values, this is a combination of a large search window (8h itinerary) and the following side effect: Heuristic is pruning paths based on duration, cost, etc. for the minimal potential cost, it uses the heuristic duration multiplied by the smallest factor (usually 1 for transit, potentially less for RAIL in some deployment). It's compared against the actual cost of the solutions found so far. As the ratio between minFactor and walkFactor increases the heuristic is more and more optimistic and eventually doesn't prune anything leading to very slow searches. (Interestingly enough increasing walk reluctance will make the algorithm explore potentially more walking transfers because of the heuristic "loosening" impact.)

I'm fine with that however the documentation roughly says: walkReluctance is what should be used to limit walking and experience show that 10/20 are good values. I feel that this should be revisited or a comment should be made on speed performance/heuristic impact.

I've always found strange to have such a high (suggested) factor for walkReluctance, that's seems an artificial way to fix something else (a lower boardCost and walkSpeed could work just as well with lower performance impact?). Also potentially a non-linear cost for walk could be interesting, it is increasingly frustrating/uncomfortable to walk over time, 1,5,10,15mn transfer would not be ranked by the users as 1,2,3,4 on a scale of "painfulness".

Quick fix

Add a comment in the documentation about performance and potentially revist proposed values for walkReluctance.

As other quick fixes: if it's hard to keep track of transfer/transit duration I feel that the initial walking access and egress cost could be added to the heuristic min cost with the correct factor to reduce their impact on the performance overall. Potentially at least as 'the minimum access and egress of all access and egress'. Also is it really relevant to consider the cost in the heuristic pruning? As the minCost from the heuristic is not really correct by default, this could be another potential quick fix.

Also walk reluctance could be more granular, i.e. you might want to avoid transfering by walking if there's a nice transit connection, however walking between platforms of the same parent station could have a different reluctance factor (also to avoid filtering artificial long walk due to poor OSM mapping/platform linking and/or waiting for a later train connection because one train platform is closer to you).

General question of the 'cost'

There is one other aspect behind the scene here that is interesting to discuss : as implemented now the cost is roughly a translation of the number of transfers and the duration and the bests solutions for these two criteria are already considered in the ParetoSet as well as in the algorithm (through round and tripSearch). I guess this comes from AStar routing.

Shouldn't the cost be independent of these values to reflect better penalties or preferred options? That could also be easier to implement a list of fixed costs based on condition: boardCost, changeInStationCost, changeOperatorCost, fastWalkTransfer, longWalkTransfer, badSurface, dangerousArea, lowFrequencyTripCost, highFrequencyTripCost, onDemandCost, etc

I feel that if the 'cost' was a more relatable criteria it would be easier to maintain, manipulate and visualize:

as stated above it could be interpreted as a list of penalty/bonus to situation we want to avoid/prefer
one approach could be to interpret the 'cost' as the time actually lost traveling, and transit leg have some fixed cost associated to the mode of transport, boarding, crowded factor but eventually you can sit and do something that you couldn't do while walking.
one approach could be to interpret the 'cost' as the actual physical effort, and transit leg have some fixed cost associated to the mode of transport, boarding, etc but in general less than for transfer legs
you have the actual monetary cost of the leg

All the above are not taken into consideration by the duration, number of transfers, etc.. and could be captured nicely into one cost-like critera and would be relevant for a user. Also these would be probably more easily explained in 'business' words and easier to justify and fine-tune.

t2gran commented 2 years ago

walkReluctance is what should be used to limit walking and experience show that 10/20 are good values.

This out of date, good values for walkReluctance is in the range 1.0 - 5.0.

Also potentially a non-linear cost for walk could be interesting, it is increasingly frustrating/uncomfortable to walk over time, 1,5,10,15mn transfer would not be ranked by the users as 1,2,3,4 on a scale of "painfulness".

All factors(also cost) must be linear increasing to be valid in Raptor search, exponential factors can not be used and would produce strange results.

t2gran commented 2 years ago

General question of the 'cost'

We plan to replace the existing cost calculation in the heuristics with a Raptor search with cost as a single criteria. This is described in Fast and Exact Public Transit Routing with Restricted Pareto Sets, by Delling ++.

hannesj commented 2 years ago

All factors(also cost) must be linear increasing to be valid in Raptor search, exponential factors can not be used and would produce strange results.

Yes, I was thinking that we could have exponential cost on the transfers, based on the duration, but this might produce strange results, where we ride one bus stop potentially the wrong way, in order to split a long transfer into two

slvlirnoff commented 2 years ago

General question of the 'cost'

We plan to replace the existing cost calculation in the heuristics with a Raptor search with cost as a single criteria. This is described in Fast and Exact Public Transit Routing with Restricted Pareto Sets, by Delling ++.

Thanks, I'll re-read this one!

slvlirnoff commented 2 years ago

All factors(also cost) must be linear increasing to be valid in Raptor search, exponential factors can not be used and would produce strange results.

Yes, I was thinking that we could have exponential cost on the transfers, based on the duration, but this might produce strange results, where we ride one bus stop potentially the wrong way, in order to split a long transfer into two

I see and you are right about exponential costs, however in this case if the results are otherwise similar (similar duration/arrival time) I'd prefer walking twice 10mn instead of once 20mn.

Outside exponential cost you could have fixed additional penalties if you walk more than 10mn. That would probably not be considered exponential as this would be similar to some boarding cost or trips having penalties or preferred costs and could help keeping a small walkReluctance factor and still favour transit routing.