Urban-Analytics-Technology-Platform / acbm

activity-based modelling pipeline (for transport demand models)
https://hackmd.io/w-m_OKaDT3GGBfSqFPpBjA
Apache License 2.0

Travel time matrices for assigning activities to zones #20

Open Hussein-Mahfouz opened 2 months ago

Hussein-Mahfouz commented 2 months ago

The NTS data we are using only assigns individual activities to regions (e.g. "West Yorkshire", "North West"). We only have the home location (from the SPC) but we need to be able to determine feasible activity locations.

For example, to determine the location of an education facility, we use the home location (SPC), mode of travel (NTS), reported travel time (NTS), and travel time matrices by mode to identify which zones the education facility could be in. (current function here)
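As a rough illustration of the lookup described above, here is a minimal Python sketch. It is not the function linked in the issue: the name `feasible_zones`, the nested-dict matrix layout, the zone IDs, and the `tolerance` band around the reported travel time are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: mode -> {(origin_zone, destination_zone): minutes}.
TravelTimes = Dict[str, Dict[Tuple[str, str], float]]


def feasible_zones(
    home_zone: str,
    mode: str,
    reported_minutes: float,
    matrices: TravelTimes,
    tolerance: float = 0.2,  # assumed +/- 20% band around the survey time
) -> List[str]:
    """Return zones whose travel time from home is close to the reported time."""
    lower = reported_minutes * (1 - tolerance)
    upper = reported_minutes * (1 + tolerance)
    return sorted(
        dest
        for (origin, dest), minutes in matrices[mode].items()
        if origin == home_zone and lower <= minutes <= upper
    )


# Toy matrices with made-up zone IDs and travel times.
matrices: TravelTimes = {
    "walk": {("OA1", "OA2"): 11.0, ("OA1", "OA3"): 35.0, ("OA2", "OA1"): 11.0},
    "car": {("OA1", "OA2"): 4.0, ("OA1", "OA3"): 11.0},
}

walk_candidates = feasible_zones("OA1", "walk", 10, matrices)
car_candidates = feasible_zones("OA1", "car", 10, matrices)
print(walk_candidates, car_candidates)
```

The same reported time of 10 minutes selects different candidate zones depending on the mode, which is why a separate matrix per mode is needed.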

I am using travel time matrices (at OA level) that I calculated for another project, but it would be useful to have a pipeline to create these matrices for any study area.

Related to https://github.com/alan-turing-institute/uatk-admin/issues/9

dabreegster commented 2 months ago

Some questions...

dabreegster commented 2 months ago

If you're using travel time matrices, there are about 180k OAs, so that's around 32 billion entries per mode. Very conservatively assuming 8 bytes per entry, that's around 241GB for one mode's matrix. Seems quite extreme and wasteful, given so many OAs don't interact.
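The arithmetic behind that estimate can be checked with a few lines of Python. This is only a sketch of the calculation for a dense matrix; the helper name `dense_matrix_bytes` is made up for the example, and the 180k zone count and 8 bytes per entry are the figures quoted in the comment.

```python
def dense_matrix_bytes(n_zones: int, bytes_per_entry: int = 8) -> int:
    """Size of a dense n_zones x n_zones matrix in bytes."""
    return n_zones * n_zones * bytes_per_entry


n_oas = 180_000           # approximate number of OAs quoted above
entries = n_oas ** 2      # one entry per origin-destination pair
size = dense_matrix_bytes(n_oas)

print(f"{entries:,} entries")         # 32,400,000,000 entries
print(f"{size / 1e9:.1f} GB")         # 259.2 GB (decimal gigabytes)
print(f"{size / 2**30:.1f} GiB")      # 241.4 GiB (binary gigabytes)
```

The 241 figure corresponds to binary gigabytes (GiB); in decimal gigabytes the same matrix is about 259 GB.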

How high does travel time usually go -- not often over 2 hours, hopefully? Do you want to find all destinations within 2 hours of a start point, or stop when you find the closest one, or make some randomized decision about whether to keep searching as you encounter each one? Or since the travel time is from a survey, maybe ignore destinations closer and really insist on some that're about X minutes away?

Hussein-Mahfouz commented 2 months ago
  • What does the performance of the current travel time matrices approach look like, either for building the matrices or querying them? Does either feel like a limiting factor?

I am using r5r to build matrices for a specific city. In my case it was Leeds (~2600 OAs), and I created a matrix for each of car, walk, and cycle, plus 6 matrices for bus (morning_wkday, afternoon_wkday, evening_wkday, night_wkday, morning_wkend, night_wkend). I'm doing this on my laptop. It is very fast for PT (maybe 30 seconds per matrix) but very slow for car trips (could take 30 minutes for the same matrix). I think the difference in performance is because r5 was built for PT routing. The routing engine also takes another 30 seconds to start up.

The code for the routing wrappers is here and the code for running r5r is here

  • How detailed do modes of travel get -- just "cycling" or "cycling with e-bike so hills don't matter, and confident about stressful roads"? For PT, limits on money for tickets?

These are all the options you can pass (r5r::travel_time_matrix()). For hills, you can add an elevation file in the setup. If it's an e-bike, you can ignore the elevation and/or change the bike_speed parameter. For PT, you can add monetary limits through max_fare, but I haven't done that since you would need to add a fare_structure file to your GTFS feed. See this vignette for more details.

  • What's your source of network and destination data right now -- both OSM?

Hussein-Mahfouz commented 2 months ago

If you're using travel time matrices, there are about 180k OAs, so that's around 32 billion entries per mode. Very conservatively assuming 8 bytes per entry, that's around 241GB for one mode's matrix. Seems quite extreme and wasteful, given so many OAs don't interact.

Yeah I'm definitely not running this on a national level. I'm currently constraining it to OAs within a specific city, and limiting the travel time to 2 hours.

How high does travel time usually go -- not often over 2 hours, hopefully?

I need to check the NTS to see the travel time distribution. One option could be a design decision to only include intracity trips and limit the travel time to 2 hours.

Do you want to find all destinations within 2 hours of a start point, or stop when you find the closest one, or make some randomized decision about whether to keep searching as you encounter each one? Or since the travel time is from a survey, maybe ignore destinations closer and really insist on some that're about X minutes away?

There are normally a bunch of different people in each OA, and each one will have a different reported travel time from the NTS, so I don't think we could insist on a single travel time in the routing phase. It makes sense to me to create the matrix once and then, for each individual, use it to determine the zones they can reach given their travel time from the NTS. This is what I was doing in this function.