Options for transit routing

Hussein-Mahfouz commented 1 month ago

Some notes on available tools for transit routing. Related to #2 and #3

The approach in gtfsmulti is interesting. If I understand correctly, it allows you to do multimodal routing without having to do both street network routing and transit network routing. It edits the feed so that you no longer need street network routing for access/egress/transfer trips. From the documentation:

While trying to understand the internals of R5, I got the following idea. If we are already pre-calculating so many street network travel times, can we not simply include them in the transit network timetable, which can then be utilized by any GTFS-consuming routing engine? In the end, the grid points are fixed locations in space, just like transit stops. When we take the R5 approach a step further and also use the grid points as origins of trips, the access leg of a multi-modal route can be seen as a “transfer” from a grid point to a transit stop, and the egress leg as a “transfer” from a transit stop to a grid point. Using the same thought, a direct trip is a “transfer” from one grid point to another. I formalized this idea as an extension to the standard GTFS format: GTFS-Multi. It takes quite some pre-processing to create such a GTFS-Multi feed, but the benefit is that afterwards only schedule-based routing algorithms (i.e. those that scan a transit timetable) are sufficient to find optimal multi-modal routes between an origin and (multiple) destination(s).

gtfsrouter
Worth going through this list by MobllityData
r5r isochrone function which calculates travel time matrices from a specific point using raptor
Raptor in python
- Kuan butts gist and blogpost
- Link to a faster implementation (check comment at bottom): https://gist.github.com/kuanb/a45b65c3135dce717497643e7f35f0ab?permalink_comment_id=3823400#gistcomment-3823400

Hussein-Mahfouz commented 1 month ago

gtfs2nx:

Useful for converting a feed to a network x graph
Probably not useful for routing. nx.single_source_dijkstra_path_length() does do a floodfill, but it won't take number of transfers into account

dabreegster commented 1 month ago

Heh, I actually started playing around with this today over in https://github.com/a-b-street/15m/tree/gtfs. I haven't implemented to A-to-B routing or isochrone yet using PT, but per #3, I don't think it'll be conceptually hard to do. No comment about the performance yet (but for this app, it'll be fine). Reference implementations / ideas helpful, but to use directly in the app without deploying a backend, we need something that can compile to WASM. (Or is directly written in JS or something.) Python libraries might work with pyoxide (Stuart's using this for demoland), but I think they'll need to be pure Python all the way down -- if there are system dependencies on GDAL/PROJ/something, it's likely to be much harder / impossible

Hussein-Mahfouz commented 1 month ago

I played around with Dijkstra and have sort of wrapped my head around binary trees and priority queues (it's a fun learning curve). I'm trying to extend the algorithm by accounting for transfers and adding start and end times from stop_times.txt to the edges (as per your comment in #3). I'll push something once I have it working better.

My attempts may not turn out to be useful or efficient, but I'm enjoying it. If useful, some logic may be useful to port to rust / javascript

Dependancies:

gtfs2nx is currently useful for me as it creates a graph and adds walking transfers between nearby stops, so I don't have to worry about pre-processing the feed. I could go back and re-implement the gtfs pre-processing in pure python once I have the routing working properly.

networkx seems to be pure python (no dependancies listed). In the future, I could just turn the gtfs feed into a networkx graph without gtfs2nx

cmconlan commented 1 month ago

If its useful the code I added during the hackathon pre-processes the GTFS feed such that it can be used in a Dijkstra implementation such as this.

It creates "stop" objects (where a stop relates to a transit stop). Within each object there are nested route objects for each of the routes that serve that stop. You can efficiently search these objects to find out which are the next transit services for each routes that serves the stop and these can form edges for Dijkstra to explore.

The code runs in reasonable time, I ran it for the whole Leeds GTFS data in about 20-30 mins or something. But it can be way more efficient actually as the majority time (from my memory) was computing the transfer nodes e.g. simulating walking between all the stops, but if we use pure Dijkstra for everything we don't need to do this as the transfer nodes are naturally discovered as it fans out in search of POIs.

I can't see my script in the repo - perhaps my pull request failed? Did you get a notification or anything Hussein? I can re-upload.

cmconlan commented 1 month ago

I was also thinking about the problem of dynamic routing using a generalised access costs, and the paradoxes that can occur when we try to do this. It's quite an interesting problem really.

I think for what we're doing now we should consider some basic rules to make routing more realistic if possible, such as a max walking time between transit stops and perhaps a max number of transit changes. I think these basic rules can be implemented without violating Dijkstra search principles.

More broadly, I had an idea about this. There is a subset of the literature on routing which focuses on finding "diverse shortest paths" - these are paths which have similar costs but the actual routes provided are quite diverse from one another. This is normally used in context of alleviating congestion. But I can see an application here. We can't use generalised costs within the actual search procedure, we have to use time. But we can generate a set of diverse routes with similar travel times, and then compute their generalised access cost and select the minimum. In any case it would be interesting to adapt the Dijkstra flood fill to some of these methods. Just putting here to record my thoughts, don't think we should worry about this for now.

Some papers: https://ieeexplore.ieee.org/abstract/document/8107512 https://ojs.aaai.org/index.php/AAAI/article/view/20290

Hussein-Mahfouz commented 1 month ago

I can't see my script in the repo - perhaps my pull request failed? Did you get a notification or anything Hussein? I can re-upload.

I made a PR from your fork so it's in the repo now. I've gone through it quickly and will do so again next time I revisit this, but what you're explaining is what we need as far as I understand

The code runs in reasonable time, I ran it for the whole Leeds GTFS data in about 20-30 mins or something. But it can be way more efficient actually as the majority time (from my memory) was computing the transfer nodes e.g. simulating walking between all the stops, but if we use pure Dijkstra for everything we don't need to do this as the transfer nodes are naturally discovered as it fans out in search of POIs.

I think we still need to create walking edges between stops as a preprocessing step. I'm adding the implementation from gtfs2nx for reference in case we need to speed this up. It was only taking a few seconds for the Leeds gtfs feed

dabreegster commented 1 month ago

I've started a walking + transit pathfinding bit at https://github.com/a-b-street/15m/blob/1cc06b48ce4d2ce17419f84f79a2502b3c4869db/backend/src/transit_route.rs#L78, many issues, but seems to initially do something.

It sounds like it'd be quite helpful if we could glue the efforts you're doing to a web app, so you can see the output easily. At some point, I'll try to make time to play with pyoxide

Hussein-Mahfouz / meet-me-halfway

Options for transit routing #4