CityofToronto / bdit_traffic_prophet

Suite of algorithms for predicting average daily traffic on Toronto streets
GNU General Public License v3.0

Pythonize PRTCS #8

Closed cczhu closed 4 years ago

cczhu commented 4 years ago

Given the limited number of weeks we have to get some version of the volume model running, our priority must now be to create Python versions of PRTCS, KCOUNT and LSVR. Of these, PRTCS is the only one that uses entirely novel algorithms.

To create a Python version of PRTCS, we need to:

cczhu commented 4 years ago

PRTCS will now be called countmatch.

Created a small function to read 15_min_counts_<YEAR>.zip into pandas tables. Experimented a bit with optimizing timestamp conversion times with Pendulum and CISO8601, and found that pd.read_csv with infer_datetime_format=True is as fast or faster than using either package with pd.DataFrame.apply. Will write up a short notebook about this at some point.

Using infer_datetime_format=True should speed up zip file reading by >20x.
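For anyone following along, here is a minimal sketch of what such a reader might look like. It is not the function from this repo: the name `read_zip_counts`, the `Timestamp` column name, and the assumption that the archive holds plain CSV members are all hypothetical. Note that `infer_datetime_format` was deprecated in pandas 2.0, where consistent-format inference became the default, so the sketch only passes it on older versions.

```python
import zipfile

import pandas as pd


def read_zip_counts(zip_path, timestamp_col="Timestamp"):
    """Read each CSV member of a zip archive into a pandas DataFrame.

    `timestamp_col` is a hypothetical column name; adjust it to the
    actual header used in the count files.
    """
    # infer_datetime_format only matters on pandas < 2.0; newer versions
    # infer a single consistent format by default and deprecate the flag.
    extra = {}
    if int(pd.__version__.split(".")[0]) < 2:
        extra["infer_datetime_format"] = True

    tables = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Skip anything in the archive that is not a CSV file.
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as fh:
                tables[name] = pd.read_csv(
                    fh, parse_dates=[timestamp_col], **extra
                )
    return tables
```

Calling `read_zip_counts("15_min_counts_<YEAR>.zip")` with a real year substituted would return a dict mapping each member file name to its table, with the timestamp column already parsed to `datetime64`.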

cczhu commented 4 years ago

This is far too large for a single issue, so, per @aharpalaniTO's suggestion, I have split the remaining work into issues #11 onward.