Open EwoutH opened 2 weeks ago
Let's describe the problem properly.
Currently, travel times between the population-weighted centroids of each mrdh65 region have been collected with the Google Maps API. This basically went as follows:
This resulted in 20 weighted centroids. Between each pair, the travel time was looked up with the Google Maps Distance Matrix API for both cycling and transit (on a weekday morning). This resulted in 19x20x2 = 760 lookups, costing around $4. The travel times can be seen below:
Now, it would be interesting to and/or:
There's a $200 monthly free budget available, which means 40,000 lookups can be done, or 20,000 per mode considering both cycling and transit. A matrix of 141x141 should be possible, which:
The latter seems the best option, since it increases cycling and transit resolution, while not further expanding the scope of the project.
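The lookup arithmetic above can be checked with a small sketch. The per-lookup price is not taken from a price list; it is only what the $200 / 40,000 figures in this issue imply, and the function names are made up for illustration.

```python
# Rough cost arithmetic for the lookup matrix; all names are illustrative.
BUDGET_USD = 200.0       # monthly free budget mentioned in this issue
TOTAL_LOOKUPS = 40_000   # lookups that budget is said to cover
PRICE_PER_LOOKUP = BUDGET_USD / TOTAL_LOOKUPS  # implied price, an assumption
MODES = 2                # cycling and transit


def lookups_needed(n_areas: int, modes: int = MODES) -> int:
    """Ordered origin-destination pairs, excluding the diagonal, per mode."""
    return n_areas * (n_areas - 1) * modes


def max_areas(budget_lookups: int, modes: int = MODES) -> int:
    """Largest number of areas whose full matrix still fits the budget."""
    n = 1
    while lookups_needed(n + 1, modes) <= budget_lookups:
        n += 1
    return n


current_lookups = lookups_needed(20)
current_cost = current_lookups * PRICE_PER_LOOKUP
largest_matrix = max_areas(TOTAL_LOOKUPS)

print(current_lookups)           # 760
print(f"{current_cost:.2f}")     # 3.80
print(largest_matrix)            # 141
```

With these assumptions the 20-area run lands at the ~$4 reported above, and 141 areas is the largest full matrix that fits in the free budget for two modes.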
To do this robustly, a pc4 lookup could be added, which is tried first, with a fallback to the existing mrdh65 lookup.
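A minimal sketch of what such a two-level lookup could look like. The table layout, the `pc4_to_mrdh65` mapping and all values are hypothetical placeholders, not the model's actual data structures.

```python
from typing import Optional

# Hypothetical lookup tables keyed by (origin, destination) area codes.
# Values are travel times in seconds; the numbers are made up.
pc4_times = {(3011, 3012): 420.0}
mrdh65_times = {(1, 2): 900.0, (1, 1): 300.0}

# Hypothetical mapping from a pc4 code to its enclosing mrdh65 area.
pc4_to_mrdh65 = {3011: 1, 3012: 1, 2611: 2}


def travel_time(origin_pc4: int, dest_pc4: int) -> Optional[float]:
    """Try the fine pc4 table first, then fall back to the mrdh65 table."""
    t = pc4_times.get((origin_pc4, dest_pc4))
    if t is not None:
        return t
    origin_area = pc4_to_mrdh65.get(origin_pc4)
    dest_area = pc4_to_mrdh65.get(dest_pc4)
    if origin_area is None or dest_area is None:
        return None  # area unknown at both resolutions
    return mrdh65_times.get((origin_area, dest_area))


print(travel_time(3011, 3012))  # found in the pc4 table
print(travel_time(3011, 2611))  # falls back to the mrdh65 table
```

Returning `None` instead of raising keeps missing pairs visible to the caller, which makes it easier to count how often the fallback (or no value at all) is hit.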
There was an indexing issue in the travel time lookups that was previously patched incorrectly, with some aggressive data filtering and a faulty bugfix. That's now resolved (in b8995158235afab01854e7a9894e2d12ced3c836, aafc5c7e297c822e954587bbd6db5aa31eee2f1c and 483f797831c957895d38ff77ff747776341faf44), so now we're actually using 118 pc4 areas over 21 mrdh65 areas.
That's just under our theoretical limit of 141x141 lookups, but by correctly adding the other areas back we already got a "relative" resolution increase.
A small shift towards bikes and transit can be noticed, notably taking a large part of the av share.
# Before fixing lookups + expanded area
Mode shares: ['car: 15.88%', 'bike: 72.96%', 'transit: 7.93%', 'av: 3.23%']
Hour 7: Mode shares: ['car: 17.08%', 'bike: 70.96%', 'transit: 8.09%', 'av: 3.87%']
Hour 8: Mode shares: ['car: 15.20%', 'bike: 74.10%', 'transit: 7.84%', 'av: 2.86%']
# After
Mode shares: ['car: 11.60%', 'bike: 78.81%', 'transit: 9.35%', 'av: 0.24%']
Hour 7: Mode shares: ['car: 13.05%', 'bike: 77.50%', 'transit: 9.17%', 'av: 0.27%']
Hour 8: Mode shares: ['car: 10.76%', 'bike: 79.57%', 'transit: 9.46%', 'av: 0.22%']
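To make the shift easier to read, the overall shares from the two runs can be diffed in percentage points. This is only arithmetic on the numbers printed above; the dictionary names are illustrative.

```python
# Overall mode shares (%) copied from the two runs above.
before = {"car": 15.88, "bike": 72.96, "transit": 7.93, "av": 3.23}
after = {"car": 11.60, "bike": 78.81, "transit": 9.35, "av": 0.24}

# Percentage-point change per mode, rounded to the precision of the input.
shift = {mode: round(after[mode] - before[mode], 2) for mode in before}

for mode, delta in shift.items():
    print(f"{mode}: {delta:+.2f} pp")
```

On these numbers bike gains 5.85 pp and transit 1.42 pp, while av drops 2.99 pp and car 4.28 pp.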
Some initial journey data:
Note how the "car" curves are very smooth, but the transit and bike curves aren't. This is due to this issue: there isn't enough spatial resolution to properly estimate distance and travel time from the Google Maps API lookup tables.
So I was considering how to fix this with minimal effort. Here I plotted the travel speed for all connections from the Google Maps lookup tables:
Note how transit is widely spread, while cycling is quite narrow. This may allow simply assuming a fixed speed for cyclists, since the route doesn't seem to matter that much, and reusing the car distances we already have in the network.
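A fixed-speed assumption would reduce the cycling travel time to a one-liner on top of the network distance. A minimal sketch, where the 15 km/h value is a placeholder and not a number read from the lookup tables:

```python
# Assumed constant cycling speed; a placeholder, to be replaced by e.g.
# the median speed observed in the Google Maps lookup tables.
CYCLING_SPEED_KMH = 15.0


def cycling_time_s(distance_m: float, speed_kmh: float = CYCLING_SPEED_KMH) -> float:
    """Travel time in seconds for a network distance in meters."""
    speed_ms = speed_kmh / 3.6  # km/h to m/s
    return distance_m / speed_ms


print(f"{cycling_time_s(5000):.0f} s")  # 5 km at the assumed speed
```

This ignores route differences between car and bike networks, which is exactly the simplification the narrow speed distribution would have to justify.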
So for now I see three options:
mrdh65 area codes to pc4 (4 digit postal code) ones. Needs to be done anyways. Double check API prices, test small and don't fuck up the big run.
@quaquel bit tired, but I'm going to try to ask a coherent question about this tomorrow. Basically all other stuff is done model wise, now it's experimental design and how to aggregate data nicely. Mode choice is a bit oversimplified but quite happy about everything else. I will mail a detailed update with some proper questions tomorrow.
An old issue where not all nodes had a distance value to each other really became a big factor when moving from mrdh65 to pc4 regions. This was fixed in https://github.com/EwoutH/urban-self-driving-effects/commit/d394ca15f8b733d5bacaacaf4d3f5f010aae6e03.
Edit: 301b48390e760f59c2530bc3d2460f31b7490a48 will further ease the migration from mrdh65 to pc4.
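Gaps like this are cheap to detect up front. The sketch below is not what the linked commits do; it is just one way to list node pairs without a distance, assuming a hypothetical dictionary keyed by `(origin, destination)`.

```python
from itertools import permutations


def missing_pairs(nodes, distances):
    """Return all ordered node pairs that have no distance entry."""
    return [(a, b) for a, b in permutations(nodes, 2) if (a, b) not in distances]


# Toy example with made-up nodes and distances (meters).
nodes = ["a", "b", "c"]
distances = {("a", "b"): 100.0, ("b", "a"): 100.0, ("a", "c"): 250.0}

gaps = missing_pairs(nodes, distances)
print(gaps)  # [('b', 'c'), ('c', 'a'), ('c', 'b')]
```

Running a check like this after each change of area resolution would show whether the number of uncovered pairs grows with the finer pc4 zoning.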
We're going with option 3.
Expand the bicycle and transit lookups, which are now at a very coarse 20-area mrdh65 resolution. While that's fine for the outer area, the inner area should have a somewhat higher resolution to properly be able to say something about a city center. Preferably pc4.
The limit is Google Maps API costs.