conveyal / r5

Developed to power Conveyal's web-based interface for scenario planning and land-use/transport accessibility analysis, R5 is our routing engine for multimodal (transit/bike/walk/car) networks with a particular focus on public transit
https://conveyal.com/learn
MIT License

Select Link custom modification #924

Open abyrd opened 6 months ago

abyrd commented 6 months ago

This can be used as an experimental worker version together with a custom modification whose r5type is select-link. Addresses #913.

Modification format:

{
  "name": "Select Link SE Hawthorne (14)",
  "r5type": "select-link",
  "lat": 45.512003,
  "lon": -122.642333,
  "radiusMeters": 50
}

This will find all stop-to-stop segments on all routes passing through the area defined by lat/lon/radiusMeters. When a regional analysis is run with freeform point sets and "record paths" enabled, each path is then checked to see if it contains one of these segments, and only the ones passing through the zone are kept.
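The geometric test described above can be illustrated with a minimal sketch. This is not the R5 implementation: it assumes each stop-to-stop segment is a straight line between two stop coordinates and uses a local equirectangular projection, which is adequate at a radius of tens of meters. The function names `to_local_xy` and `segment_within_radius` are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def to_local_xy(lat, lon, lat0, lon0):
    """Project (lat, lon) to meters on a plane centered at (lat0, lon0)."""
    x = math.radians(lon - lon0) * math.cos(math.radians(lat0)) * EARTH_RADIUS_M
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y


def segment_within_radius(a, b, center, radius_m):
    """Return True if the straight segment a-b passes within radius_m of center.

    a, b and center are (lat, lon) tuples. Assumption: segments are straight
    lines between stops; real pattern geometries may follow the street network.
    """
    lat0, lon0 = center
    ax, ay = to_local_xy(a[0], a[1], lat0, lon0)
    bx, by = to_local_xy(b[0], b[1], lat0, lon0)
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        t = 0.0  # degenerate segment: both stops at the same location
    else:
        # Parameter of the point on the segment closest to the origin (the center).
        t = max(0.0, min(1.0, -(ax * dx + ay * dy) / length_sq))
    px, py = ax + t * dx, ay + t * dy
    return math.hypot(px, py) <= radius_m


# Selection area from the example modification.
center = (45.512003, -122.642333)
# Hypothetical stop coordinates, for illustration only.
through = segment_within_radius((45.512003, -122.645), (45.512003, -122.640), center, 50)
far_away = segment_within_radius((45.520, -122.645), (45.520, -122.640), center, 50)
```

A path would then be kept only if at least one of its ride segments satisfies this test.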

The selection point in the example modification is near a transit stop, so it grabs one segment of a few patterns in one direction, and two segments (on either side of the stop) in the other direction. This is indicated in the worker logs and returned as scenario application info for display.

2024-01-03 19:12:18,321 [pool-7-thread-1] INFO  c.c.r.a.s.Scenario - Applying modifications to TransportNetwork.
2024-01-03 19:12:18,321 [pool-7-thread-1] INFO  c.c.r.a.s.Scenario - Applying modification of type SelectLink
2024-01-03 19:12:18,909 [pool-7-thread-1] INFO  c.c.r.a.s.SelectLink - Selected links for CSV path output:
2024-01-03 19:12:18,910 [pool-7-thread-1] INFO  c.c.r.a.s.SelectLink - Route 14 direction 1 after stop SE Hawthorne & 27th
2024-01-03 19:12:18,910 [pool-7-thread-1] INFO  c.c.r.a.s.SelectLink - Route 14 direction 1 after stop SE Hawthorne & 27th
2024-01-03 19:12:18,910 [pool-7-thread-1] INFO  c.c.r.a.s.SelectLink - Route 14 direction 0 after stop SE Hawthorne & 20th, SE Hawthorne & 23rd

Below is an example of the output CSV. I used a set of 10 randomly placed points as both origins and destinations, so there are 100 origin-destination pairs in total. The output CSV has one line per unique O-D pair, and only these 15 O-D pairs have any itineraries passing through the selected area. All the separate paths for each O-D pair are condensed into one line (this is done in a self-contained post-processing step, and could conceivably be disabled).

The routes column shows all bus routes used in any itinerary that passes through the selected area. The totalTime column is the average door-to-door time of all itineraries passing through the selected area. The nIterations column is the number of times (out of 120 total) that an itinerary passed through the selected area.

origin,destination,routes,boardStops,alightStops,rideTimes,accessTime,egressTime,transferTime,waitTimes,totalTime,nIterations
ab532a5e,71ec93a1,48|14|100,ALL,ALL,ALL,ALL,ALL,ALL,ALL,77.1,7
57268260,71ec93a1,55|54|14,ALL,ALL,ALL,ALL,ALL,ALL,ALL,62.8,35
57268260,1e77cf67,55|14|72,ALL,ALL,ALL,ALL,ALL,ALL,ALL,91.8,14
a8dcb0fb,71ec93a1,55|54|14,ALL,ALL,ALL,ALL,ALL,ALL,ALL,57.4,38
a8dcb0fb,1e77cf67,55|14|72,ALL,ALL,ALL,ALL,ALL,ALL,ALL,86.3,9
d8932a64,71ec93a1,14|70,ALL,ALL,ALL,ALL,ALL,ALL,ALL,46.2,54
82eaeda4,71ec93a1,14|100,ALL,ALL,ALL,ALL,ALL,ALL,ALL,74.6,12
71ec93a1,396764f7,16|14,ALL,ALL,ALL,ALL,ALL,ALL,ALL,84.3,18
71ec93a1,82eaeda4,57|52|14|100|58,ALL,ALL,ALL,ALL,ALL,ALL,ALL,76.6,105
71ec93a1,57268260,54|14,ALL,ALL,ALL,ALL,ALL,ALL,ALL,66.3,72
71ec93a1,d8bb0bd6,14,ALL,ALL,ALL,ALL,ALL,ALL,ALL,24.5,5
71ec93a1,d8932a64,17|14|70|8|6,ALL,ALL,ALL,ALL,ALL,ALL,ALL,50.8,35
71ec93a1,ab532a5e,90|48|14|100,ALL,ALL,ALL,ALL,ALL,ALL,ALL,79.2,29
71ec93a1,a8dcb0fb,54|14,ALL,ALL,ALL,ALL,ALL,ALL,ALL,60.5,72
d8bb0bd6,71ec93a1,14,ALL,ALL,ALL,ALL,ALL,ALL,ALL,27.0,35

A known issue is that this single iteration count may not allow deducing the proportion of trips on this OD passing through the selected link, because some of the iterations may have no paths (the destination may be unreachable on some of the iterations and not others). One solution would be to include a second line for each OD present, where the routes field is also ALL, as a denominator.
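The denominator problem can be made concrete with a small arithmetic sketch. The value 35 and the total of 120 come from the example output above; the count of iterations that reached the destination (70) is purely hypothetical, since the CSV does not report it, and `select_link_share` is an illustrative helper, not part of R5.

```python
TOTAL_ITERATIONS = 120  # iterations per O-D pair, as stated above


def select_link_share(n_through, n_reached):
    """Share of successful iterations whose path used the selected link.

    Returns None when the destination was never reached, because the
    proportion is undefined in that case.
    """
    if n_reached == 0:
        return None
    return n_through / n_reached


n_through = 35  # nIterations value from one row of the example CSV

# Dividing by all iterations understates the share if some iterations
# found no path at all.
naive_share = n_through / TOTAL_ITERATIONS

# Hypothetical: suppose the destination was reachable in only 70 iterations.
adjusted_share = select_link_share(n_through, 70)

print(f"naive: {naive_share:.3f}, adjusted: {adjusted_share:.3f}")
```

With these assumed numbers the naive share is about 0.29 while the share among successful iterations is 0.50, which is why a denominator row (or an equivalent count) is needed.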

I have only tested on smaller data sets (Portland, Oregon) but it seems to work as expected.

I implemented this as a custom scenario modification type. It doesn't really modify the network so it's a bit odd to think of it as a modification, but it has the following good characteristics:
a) Can be used as a custom worker version on the standard production system (without re-deploying the whole system)
b) Retains and reuses geographic intersection calculations to make it faster (the link selection step is a bit slow, making it part of the scenario means it's reused)
c) Doesn't require rebuilding the transportation network files, can be used on existing networks and scenarios by just adding one modification
d) Doesn't change the CSV output format, just removes all rows that do not pass through the selected area

abyrd commented 6 months ago

In response to testing and feedback, the output format has been modified as follows:

origin,destination,routes,boardStops,alightStops,rideTimes,accessTime,egressTime,transferTime,waitTimes,totalTime,nIterations
ab532a5e,71ec93a1,90 (MAX Red Line)|ffa78db2-7175-426f-8a77-5841acf262b3 (add southeast)|48 (48)|14 (14)|38 (38)|10 (10)|100 (MAX Blue Line),ALL,ALL,ALL,ALL,ALL,ALL,ALL,82.0,0.100
ab532a5e,71ec93a1,48|100|14,1196|9969|3635,11939|8336|2615,7.3|18.0|9.5,15.9,4.9,6.3,1.9|5.3|5.1,74.1,0.058
ab532a5e,71ec93a1,48|90|38|ffa78db2-7175-426f-8a77-5841acf262b3,1196|9969|1108|[new],11939|10118|12795|[new],7.3|8.9|3.2|10.6,15.9,2.4,24.4,1.9|3.6|1.9|1.7,81.8,0.033
ab532a5e,71ec93a1,48|100|10|ffa78db2-7175-426f-8a77-5841acf262b3,1196|9969|3635|[new],11939|8336|2594|[new],6.6|18.0|3.7|8.4,15.9,2.4,8.3,6.6|2.6|4.7|1.0,78.0,0.008
82eaeda4,71ec93a1,57 (57)|ffa78db2-7175-426f-8a77-5841acf262b3 (add southeast)|52 (52)|90 (MAX Red Line)|45 (45)|14 (14)|10 (10)|100 (MAX Blue Line)|6 (6),ALL,ALL,ALL,ALL,ALL,ALL,ALL,77.1,0.146
82eaeda4,71ec93a1,100|14,9826|3635,8336|2615,27.7|9.5,15.4,4.9,4.6,1.9|5.1,69.1,0.100
82eaeda4,71ec93a1,57|90|45|ffa78db2-7175-426f-8a77-5841acf262b3,5600|9821|12792|[new],9654|10118|12795|[new],12.1|14.4|5.5|10.6,6.3,2.4,16.2,3.6|5.8|2.5|1.1,80.6,0.008
82eaeda4,71ec93a1,57|90|6|ffa78db2-7175-426f-8a77-5841acf262b3,5600|9821|1114|[new],9654|10118|2641|[new],13.4|13.9|10.6|10.6,6.3,2.4,6.6,1.3|6.3|7.0|3.4,81.9,0.017
82eaeda4,71ec93a1,100|10|ffa78db2-7175-426f-8a77-5841acf262b3,9826|3635|[new],8336|2594|[new],27.7|3.7|8.4,15.4,2.4,6.6,7.2|4.7|1.5,77.4,0.004
82eaeda4,71ec93a1,52|90|45|ffa78db2-7175-426f-8a77-5841acf262b3,1669|9821|12792|[new],9985|10118|12795|[new],8.5|14.4|5.5|10.6,11.2,2.4,18.2,1.3|5.8|2.5|1.5,82.0,0.008
82eaeda4,71ec93a1,57|90|6|ffa78db2-7175-426f-8a77-5841acf262b3,5600|9821|1114|[new],9654|10118|2170|[new],13.4|13.9|12.1|8.4,6.3,2.4,6.0,2.3|6.3|7.0|1.9,80.0,0.004
82eaeda4,71ec93a1,57|90|10|ffa78db2-7175-426f-8a77-5841acf262b3,5600|9821|3635|[new],9654|8336|2594|[new],14.7|23.5|3.7|8.4,6.3,2.4,7.5,2.0|5.8|6.7|2.1,83.1,0.004

There is still one summary line (indicated by ALL in several columns) for each unique O-D pair that has paths passing through the select link. But now, below that row are the rows that were combined to create it. In the summary rows, the lists of all routes now include the route name, not just the ID, as routes created by scenario modifications have random IDs. Perhaps most importantly, the nIterations column is now expressed as a decimal proportion of the total number of iterations that reached the destination from the origin. This resolves the known issue mentioned above. It would probably be preferable to show the number of iterations that successfully reached the destination, or to show the proportion of total iterations that reached or did not reach the destination. But this did not fit into the existing columns, and changing the column headings or column count would require greater changes to the backend. We can potentially make those changes later. For now this is just a custom worker that must adhere to the existing backend expectations.
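For anyone post-processing this output, here is a minimal sketch of separating summary rows from the detail rows beneath them. It assumes, based only on the sample above, that summary rows are the ones with ALL in the boardStops column, and that the detail proportions for an O-D pair should add up to roughly the summary proportion (allowing for rounding). The helper `split_rows` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# First O-D pair from the sample output above.
SAMPLE = """\
origin,destination,routes,boardStops,alightStops,rideTimes,accessTime,egressTime,transferTime,waitTimes,totalTime,nIterations
ab532a5e,71ec93a1,90 (MAX Red Line)|ffa78db2-7175-426f-8a77-5841acf262b3 (add southeast)|48 (48)|14 (14)|38 (38)|10 (10)|100 (MAX Blue Line),ALL,ALL,ALL,ALL,ALL,ALL,ALL,82.0,0.100
ab532a5e,71ec93a1,48|100|14,1196|9969|3635,11939|8336|2615,7.3|18.0|9.5,15.9,4.9,6.3,1.9|5.3|5.1,74.1,0.058
ab532a5e,71ec93a1,48|90|38|ffa78db2-7175-426f-8a77-5841acf262b3,1196|9969|1108|[new],11939|10118|12795|[new],7.3|8.9|3.2|10.6,15.9,2.4,24.4,1.9|3.6|1.9|1.7,81.8,0.033
ab532a5e,71ec93a1,48|100|10|ffa78db2-7175-426f-8a77-5841acf262b3,1196|9969|3635|[new],11939|8336|2594|[new],6.6|18.0|3.7|8.4,15.9,2.4,8.3,6.6|2.6|4.7|1.0,78.0,0.008
"""


def split_rows(text):
    """Group rows by O-D pair into one summary row and a list of detail rows."""
    summaries = {}
    details = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        key = (row["origin"], row["destination"])
        if row["boardStops"] == "ALL":  # assumed marker of a summary row
            summaries[key] = row
        else:
            details[key].append(row)
    return summaries, details


summaries, details = split_rows(SAMPLE)
for key, summary in summaries.items():
    detail_sum = sum(float(r["nIterations"]) for r in details[key])
    # Multi-leg fields are pipe-separated, one entry per transit ride.
    first_routes = details[key][0]["routes"].split("|")
    print(key, summary["nIterations"], round(detail_sum, 3), first_routes)
```

For this pair the detail rows sum to 0.099 against a summary value of 0.100, which is consistent with each value having been rounded to three decimals.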

abyrd commented 3 months ago

Next step: update this PR to place the selected-link label introduced in 3aaafb9 in the group column introduced in #936.