Model validity (base case)

EwoutH commented 1 month ago

There are a few "problems" with the model that ideally should be addressed. Either this happens in this research, or is documented properly as a limitation.

1. Mode choice behavior

Model behavior

Currently the mode choice distribution - in the base case, so without any AVs - looks like this:

journeys_data_base

Which shows that transit is preferred for the longer distances, while cycling is from the shorter ones.

And the numbers are as follows.

Area	Metric	Bike	Car	Transit
All 21 areas	Mode choice	82.33%	11.20%	6.48%
All 21 areas	Distance weighted	78.77%	10.32%	10.91%
Inner center	Mode choice	86.62%	11.50%	1.88%
Inner center	Distance weighted	87.67%	10.68%	1.65%

Note: The inner center includes the districts Noord, Kralingen, Rotterdam Centrum, Feyenoord, and Delfshaven. The 21 areas the whole area in the city polygon.

Validation data

As for validation data, we know the following:

In V-MRDH 3.0 data, the mode choice distribution should be as follows:

Area	Car	Bicycle	Public Transport
All 21 areas	37.71%	48.99%	13.30%
Inner center	13.40%	69.92%	16.68%

Based on ODiN 2023 (2022 looks the same):

Alignment:

Bike (and foot) dominate shorter travel times, car the middle travel time, public transport the longer durations
Bike (and foot) dominate shorter travel distances, cars and public transport the longer distances

Non-alignment:

In general, public transport and car use are on the low side, while bicycle use is quite high.
There are only small differences in mode shares between the larger Rotterdam Area and the inner center, while there are far larger ones in the MRDH data.

Potential solutions

Add some penalty to bicycles a. Increase the VoT b. Add a fixed penalty for each trip c. Add a distance penalty (possibly non-linear)
(non-feasible) build a more extensive mode choice model based on Stated Preference data.

2. Lack of actual traffic jams

In general, the average speed stays very high, indicating a lack of traffic jams in the model (bottom left plot).

uxsim_data_base

This could be caused by either/or:

The lack of car as a chosen modality
The lack of cars from outside areas
Road capacity input values (model calibration)

It's difficult to find data on how slow traffic should be going in peak traffic, but certainly faster than this. TomTom Move might have this data (available).

Potential solutions:

Add a fixed amount of outside traffic
Lower road capacity to compensate for the lack of outside traffic
(not feasible) simulate a larger area. Both OD API lookups and the network are insufficient for that.

Synthesis

The lack of traffic jams will worsen the lack of cars being chosen as mode, so this is a problem where there's interaction, so 2. should probably be solved first.
A semi-accurate traffic pressure is needed to properly study the effects from ingesting

EwoutH commented 1 month ago

Lack of actual traffic jams

Added traffic from outside the city in 3a951f16a01c401b84ae9f5ab3b52728613055d1, 4dfb7a360ed0bf7cd49a5ce7b4e448a7e7f40161 and 1639c64a97bd17f72b16b199a422a48669f1187e. There's definitely traffic now. This is with ext_vehicle_load=0.75:

uxsim_data_base

What we're basically doing is assuming the amount of current traffic between the red and blue areas, as based on V-MRDH 3.0, distributed over the day based on ODiN.

internal_external_areas

With ext_vehicle_load=1 the whole city basically comes to a massive gridlock. We don't include all the small roads, which also have some capacity, and there probably some inaccuracies in the density values and the simulation itself.

0.75 is close enough to see some real effects.

The big problem is the internal mode choice now:

Mode	Bike	Car	Public Transport
Reference data	48.99%	37.71%	13.30%
Current model data	85.40%	7.80%	6.80%

So let's introduce a "comfort factor" with some negative values for bikes and positive for cars. It applies directly on the perceived costs.

CC @quaquel

quaquel commented 1 month ago

This looks good.

Now a slightly evil question that you should not think about too long: how plausible is this 0.75 number?

EwoutH commented 1 month ago

Implementing a comfort factor was completely trivial (428b2fb61d737a45bb4156af1669935633e7b1c5). Wasn't that happy about the state of my codebase, but this is a nice testament to it.

Simulation time is though the roof with that many cars.

Now a slightly evil question that you should not think about too long: how plausible is this 0.75 number?

Ï think the better question are:

How accurate is UXsim as a real-world traffic?
How accurate are the road density numbers defined here?
How accurate is the road network which was extracted with OSMnx?
What's the effect of not including smaller streets, and only OSM tertiary and larger (docs)?

The fact that 0.75 (now also testing 0.6) is within the right order of magnitude (and even withing a 2x range) I'm actually surprised about.

quaquel commented 1 month ago

So, how should I read this number: does it mean that 76% of the traffic comes from outside the red area?

EwoutH commented 1 month ago

No, sorry it's something else: It's just a scalar that scales the numbers from the existing OD-matrices by some factor. So 1 means exactly those numbers. It's a multiplier for external traffic.

The ratio between internal and external traffic is an interesting number to keep in check, since internal is adapting but external is static.

EwoutH commented 1 month ago

With an car_comfort=0.5, bike_comfort=1.2 we see some better balances, more car and transit use, and serious traffic jams:

mode_distribution_base

journeys_data_base

uxsim_data_base

parked_data_base

Let's throw in some AVs tomorrow.

EwoutH commented 1 month ago

One small issue I still had is that we didn't allow trips within the same MRDH65 area. Within some areas, this number is really big. This caused a major overrepresentation in trips to areas further away, increasing the average distance and travel time spend.

This had two effects on model validity:

Mode choice is biased towards modalities better fitted for longer and further away modes.
The networks in strained out of proportionally large, because trips cover a longer distance, including cars/avs over the network.

The resolution bump for bike/transit travel time and distance lookups from mrdh65 to pc4 resolution in 96f826e95947cc47ed3319be771cc57ecaba3def allowed for trips within mrdh65 regions to take place, as long as there is more than one PC4 area in that MRDH65 (there isn't a travel time to travel to your own PC4 area. That was the case for all MRDH65 regions except three:

MRDH region 11 (IJsselmonde (bui)) only contains one pc4: [3077]
MRDH region 12 (Hoek van Holland) only contains one pc4: [3151]
MRDH region 14 (Rozenburg) only contains one pc4: [3181]

0a501be5e9211f395743bb82854f3ee2f9ff976a updated the destination lookups to only exclude identical MRDH65 regions if they had only one PC4 region, instead of all identical MRDH65 regions. So this now allows trips to take place within the remaining 18 MRDH65 regions, which is often done: ~202 of ~378 thousand trips are taken within the same mrdh65 region (~53.5%).

This corrects the bias described above, and makes the model more accurate and representative for how people actually travel.

@quaquel coming back to your question on ext_vehicle_load factor being lower than 1.0 (now 0.6), since this will reduce at least the distance cars within the network travel, and maybe also affect the mode choices, this might allow raising the ext_vehicle_load factor to 1.0 or at least closer to it. Points about the network being from OSMnx data and junctions not being represented properly still stand.

EwoutH commented 1 month ago

This is a major shift in behavior, as expected:

First of all, it can be seen that journeys cover shorter distances and also take way less time: In the bottom row (with the new behavior) the graphs are skewed way more to the left. Cost and perceived cost follow the same patterns.

journeys_data_own_area

First thing to notice is a significant decrease in transit rides. Which is plausible, since those are often more suitable for bike or car rides. Bike use has increased in general, but since the network is less busy, the dip in car used around peak times disappeared.

mode_distribution_own_area

The travel speed remains high and vehicles remaining in each area simultaneously is lower.

uxsim_data_own_area

The per-area heatmaps also show that areas like Rotterdam Centrum and Rotterdam Alexander, which were at peak congestion, are now a lot less congested:

uxsim_heatmaps_own_area