CityofToronto / bdit_traffic_prophet

Suite of algorithms for predicting average daily traffic on Toronto streets
GNU General Public License v3.0

Minimum mean-square error "fitter" for count association #14

Closed · cczhu closed this 4 years ago

cczhu commented 5 years ago

The final step of PRTCS is to determine, for each short-term count location, the permanent count location with the closest DoMADT pattern. The steps to Pythonize this in CountMatch are:

For now, we will only associate each short-term count location with a single permanent count location. Some means of creating a weighted average might be worth investigating, though it might also produce degenerate results, with multiple weighting solutions producing roughly the same (and possibly bad) minimum MSE. We're also allowing traffic from either direction to be matched.
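As a rough sketch, minimum-MSE association in Python could look something like the following (hypothetical function and container names, not the actual CountMatch interface):

```python
import numpy as np

def match_sttc_to_ptc(sttc_ratios, ptc_ratios):
    """Associate an STTC with the PTC whose monthly MADT/AADT
    pattern minimizes the mean-square error.

    sttc_ratios : dict mapping month -> estimated MADT/AADT at the STTC.
    ptc_ratios : dict mapping PTC ID -> (dict of month -> MADT/AADT).
    """
    best_id, best_mse = None, np.inf
    for ptc_id, ratios in ptc_ratios.items():
        # Compare only months where both stations have estimates.
        months = sorted(set(sttc_ratios) & set(ratios))
        if not months:
            continue
        err = np.array([sttc_ratios[m] - ratios[m] for m in months])
        mse = np.mean(err ** 2)
        if mse < best_mse:
            best_id, best_mse = ptc_id, mse
    return best_id, best_mse
```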

cczhu commented 4 years ago

A major concern of mine: the median distance between an STTC and its nearest PTC is 2.27 km, and the mean is 2.52 km. The median distance to the second-nearest PTC is 2.82 km, and the mean is 3.24 km. If correlation between roadways drops with distance, this could greatly reduce the predictive accuracy of CountMatch. (Toronto is roughly 20 x 40 km in size.)
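For reference, distance statistics like these can be reproduced with a k-d tree; a minimal sketch with stand-in coordinates (real code would use projected station locations in metres):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
# Stand-in coordinates in a metric projection (metres).
sttc_xy = rng.uniform(0., 20000., size=(500, 2))
ptc_xy = rng.uniform(0., 20000., size=(20, 2))

tree = cKDTree(ptc_xy)
# dists[:, 0] is the distance to the nearest PTC; dists[:, 1] to the
# second nearest.
dists, _ = tree.query(sttc_xy, k=2)
print(np.median(dists[:, 0]) / 1e3, np.mean(dists[:, 0]) / 1e3)  # km
```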

Not to mention most of our PTCs are on highways.

The solution is to find new sources of data, to be discussed further in #19. This post just stresses the importance of that project.

Distribution of distance between STTC and nearest PTC:

[image]

cczhu commented 4 years ago

Created a lit_review branch to include raw notes of papers. Recorded some notes on the minimum MSE method for assigning PTCs to STTCs here.

(Didn't do this in the Wiki or Issues because of a lack of MathTeX support.)

cczhu commented 4 years ago

In DoMSTTC.m, Arman averages growth rates over all of Toronto (to get a year-on-year multiplicative factor of ~1.02). This discards a lot of spatial resolution. Created an issue (#25) to continue recording my concerns.
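For concreteness, here's a toy comparison of the two approaches (hypothetical table layout, and a simplified two-point growth estimate rather than a proper fit):

```python
import pandas as pd

# Hypothetical layout: one row per (station, year) of PTC AADT.
aadt = pd.DataFrame({
    'station': [1, 1, 2, 2],
    'year': [2016, 2017, 2016, 2017],
    'aadt': [10000., 10400., 5000., 5050.],
})

# Per-station year-on-year growth factor.
aadt = aadt.sort_values(['station', 'year'])
growth = aadt.groupby('station')['aadt'].apply(
    lambda x: (x.iloc[-1] / x.iloc[0]) ** (1. / (len(x) - 1)))

# Citywide average growth factor, TEPs-style: a single number (~1.02)
# applied to every station, discarding all spatial variation.
citywide_growth = growth.mean()
```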

cczhu commented 4 years ago

Another issue detailed in the known issues:

Line 95: this improperly sums all short-term counts for the location and year, regardless of whether they fall on the correct day of week. Meanwhile, base_year is the year we're interested in calculating annual patterns for, and sel_year is the closest year for which we have PTC data (ttc_year is the year for which we have STTC data). Preliminary AADT is calculated using GR_STTC^(base_year - sel_year), but it would be more reasonable to use GR_STTC^(base_year - ttc_year), since it's the absolute counts from 2006 that need to be scaled by the multi-year growth rate, not the day-to-year pattern. Not sure if this is truly a bug or a deliberate choice I disagree with.
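In code terms, the discrepancy looks like this (hypothetical numbers):

```python
# Hypothetical values: a 2006 STTC count, ~2%/yr growth, nearest PTC
# data from 2016, predicting for base year 2018.
count, gr_sttc = 10000., 1.02
base_year, sel_year, ttc_year = 2018, 2016, 2006

# DoMSTTC.m: scale by the years elapsed since the nearest PTC year.
aadt_domsttc = count * gr_sttc ** (base_year - sel_year)   # ~10404
# Proposed: scale by the years elapsed since the STTC count year,
# since the raw 2006 counts are what need multi-year scaling.
aadt_proposed = count * gr_sttc ** (base_year - ttc_year)  # ~12682
```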

cczhu commented 4 years ago

MVP on fitter is working. Here's an output plot of predicted AADT at all STTCs in 2018:

[image]

An interactive map can be found in MatcherDev.ipynb, though it sadly doesn't work online.

Outstanding issues:

cczhu commented 4 years ago

Attempted to measure the error in CountMatch AADT predictions against TEPs's ground truth. Found >20% fractional errors, which led to an investigation detailed in CountMatchDev2-ReproducingArmanMAE.ipynb. Conclusions:

cczhu commented 4 years ago

While productionizing the fitter, I noticed that the way we estimate the MADT-to-AADT ratio for short-term counts is:
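Reconstructing the estimate from the ratios discussed below (with DoM_dij = MADT_j / DoMADT_dij and D_dij = AADT / DoMADT_dij taken from the matched PTC), it is roughly:

```latex
\widehat{\mathrm{MADT}}_j = \mathrm{Count}_{dij}\,\mathrm{DoM}_{dij}, \qquad
\widehat{\mathrm{AADT}} = \mathrm{Count}_{dij}\,D_{dij}, \qquad
\frac{\widehat{\mathrm{MADT}}_j}{\widehat{\mathrm{AADT}}} = \frac{\mathrm{DoM}_{dij}}{D_{dij}}
```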

If we imagine a situation where we have an STTC with only one month j of data, and an associated PTC with only one year's worth, then DoM_dij / D_dij (which, recall, are estimated from the PTC) is just the PTC MADT / AADT ratio for month j. Thus, when we calculate MADT / AADT for the STTC, it will exactly equal the PTC ratio, giving zero MSE (the full derivation is in CountmatchDev3-SensibleMatcherPrototype.ipynb).

This breaks the minimum-MSE matching, since it's not unrealistic for several nearby PTCs to have only one year's worth of data. The algorithm does fail somewhat gracefully, though: it will pick the closest PTC with zero error, rather than some random PTC. Also note that this is not a bug, but a limitation of how the estimation process works - it occurs because there's far too little data for comparing normalized monthly patterns.

Two ways to resolve this:

cczhu commented 4 years ago

We now have functionalized versions of all the CountMatch algorithms! I created:

The last three algorithms all allow overriding the PTC growth factor (as in Bagheri) with the citywide PTC average growth factor (as in TEPs). Doing this reduces the predictive accuracy of the model under ideal circumstances but should improve accuracy when PTC data is too sparse for proper growth rates to be calculated (which I suspect is why Arman averages the PTC growth rates in TEPs).

To validate, I'm generating fake STTC data from the PTC stations, as in Bagheri et al. For each PTC station, I draw a random STTC station, and use the months and years for which it has data to select a small subset of the PTC data. This allows me to reproduce the annual and sub-annual patterns of STTC counts when generating fake data. Bagheri suggests generating 100+ iterations, but I'm impatient and only generated 10.
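A minimal sketch of this resampling scheme (hypothetical data structures; the actual generator lives in the dev notebooks):

```python
import numpy as np

def make_fake_sttcs(ptc_counts, sttc_schedules, rng=None):
    """Subsample PTC daily counts to mimic STTC observation patterns.

    ptc_counts : dict of PTC ID -> (dict of (year, month) -> daily counts).
    sttc_schedules : list of sets of (year, month) pairs, one per real
        STTC, recording when that station actually has data.
    """
    if rng is None:
        rng = np.random.default_rng()
    fake = {}
    for ptc_id, counts in ptc_counts.items():
        # Draw a random real STTC and copy its observation schedule.
        schedule = sttc_schedules[rng.integers(len(sttc_schedules))]
        # Keep only the PTC data falling within that schedule.
        fake[ptc_id] = {ym: c for ym, c in counts.items() if ym in schedule}
    return fake
```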

Full results of the algorithm shootout are in CountmatchDev6-Shootout. Summary:

Sensitivity Test

This test checks how much estimates differ between different draws of fake data. An excellent predictive algorithm should minimize this variation.

We can measure this variation by calculating the standard deviation divided by the mean, i.e. the coefficient of variation (COV), for each short-term count location. We'll take 2018 as a representative year.
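In pandas terms (hypothetical column names), the per-station COV is something like:

```python
import pandas as pd

# Hypothetical layout: one 2018 AADT estimate per (draw, station).
preds = pd.DataFrame({
    'draw': [0, 1, 0, 1],
    'station': [1, 1, 2, 2],
    'aadt_2018': [10100., 9900., 5200., 4800.],
})

grouped = preds.groupby('station')['aadt_2018']
cov = grouped.std() / grouped.mean()  # coefficient of variation per station
print(cov.median())                   # the summary statistic quoted below
```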

[image: Shootout_COV]

The TEPs method has the lowest COV, as measured by the median, and the CountMatch default method the highest. This is largely due to using individual station growth rates - if we force the use of the global growth rate, we can reduce the variation to near-TEPs levels.

Check Against Ground Truth

We need to examine how accurate the models are, but we can only do that when predicting for sites where we know the AADT, and which years it is known varies from station to station. I therefore rigged up a version of the fake data generator that cycles through the years, predicting AADT for each station and year where a ground truth value is known.

This is computationally intensive, so only 10 sets of fake data are generated per experiment.

Here are the results for CountMatch using the TEPs algorithm:

[image: Shootout_PredvsGrd_TEPs]

for CountMatch:

[image: Shootout_PredvsGrd_CM]

and for CountMatch with the global growth rate:

[image: Shootout_PredvsGrd_CMGGF]

All plots share a common scale, which crops out the extreme outliers produced by CountMatch without a global growth rate.

Observations:

For a more thorough analysis, see the ipynb.

cczhu commented 4 years ago

To do:

cczhu commented 4 years ago

Currently testing the newest CountMatch PR; making predictions for one year using the entire FLOW database takes around 15 minutes on a Core i5 workstation. That's pretty good!