cczhu closed this issue 4 years ago
PRTCS will now be called `countmatch`.
Created a small function to read `15_min_counts_<YEAR>.zip` into pandas tables. Experimented a bit with optimizing timestamp conversion times with Pendulum and CISO8601, and found that `pd.read_csv` with `infer_datetime_format=True` is as fast as or faster than using either package with `pd.DataFrame.apply`. Will write up a short notebook about this at some point.
Using `infer_datetime_format=True` should speed up zip file reading by >20x.
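A minimal sketch of the kind of reader described above, not the actual countmatch code. It assumes each archive holds one or more CSV files with a timestamp column; the function name `read_counts_zip`, the `timestamp` column name and the CSV layout are all hypothetical. Note that `infer_datetime_format` was deprecated in pandas 2.0, where a strict version of the same inference became the default, so the sketch only passes the flag on older pandas.

```python
import io
import zipfile

import pandas as pd


def read_counts_zip(zip_source, datetime_col="timestamp"):
    """Read every CSV member of a zip archive into one DataFrame.

    `zip_source` is a path or a binary file-like object. The column name
    and the CSV layout are assumptions made for illustration.
    """
    # Let read_csv parse the timestamp column itself rather than
    # converting row by row with DataFrame.apply.
    kwargs = {"parse_dates": [datetime_col]}
    # Deprecated from pandas 2.0 onward, so only pass it to pandas 1.x.
    if int(pd.__version__.split(".")[0]) < 2:
        kwargs["infer_datetime_format"] = True

    frames = []
    with zipfile.ZipFile(zip_source) as zf:
        for name in sorted(zf.namelist()):
            if not name.lower().endswith(".csv"):
                continue  # skip any non-CSV members
            with zf.open(name) as fh:
                frames.append(pd.read_csv(fh, **kwargs))
    if not frames:
        raise ValueError("no CSV files found in archive")
    return pd.concat(frames, ignore_index=True)
```

Calling `read_counts_zip("15_min_counts_2010.zip")` (a made-up year) would return a single table with a `datetime64` timestamp column, provided the archive matches the assumed layout.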
This is far too large for a single issue, so per @aharpalaniTO's suggestion I have split up the remaining work into issues #11 onward.
Given the rather limited number of weeks we have to get some version of the volume model running, our priority must now be to create Python versions of PRTCS, KCOUNT and LSVR. Of these, PRTCS is the only one that uses entirely novel algorithms.
To create a Python version of PRTCS, we need Python equivalents of:
- `STTC_estimate3.m` preprocessing.
- `PTCWEEK.m` and `PTCYEAR.m`.
- `nearestneighbour.m` for linking short-term and permanent count stations.
- `main_DoM_new_2012.m` code.