Improving Assign Incomplete Day Count to Clusters Algorithm

Current Algorithm: For each cluster centre profile, calculate a hypothetical daily total by assuming the day follows that profile, then add up the deviations of the observed counts from that profile. Assign the day/location to the profile with the smallest total deviation.

Locations with full-day counts do not go through this process: they take the cluster determined from their full-day counts, and that cluster is used to fill in missing values. Only locations where nothing but incomplete data is available go through this process.
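A minimal sketch of the current algorithm as described above, for discussion only. It is not the repository's implementation. The function name `assign_incomplete_day`, the representation of each cluster centre as fractions of the daily total, the use of `NaN` for missing time bins, and the choice of summed absolute deviation as the distance are all assumptions made for illustration.

```python
import numpy as np


def assign_incomplete_day(counts, profiles):
    """Assign one incomplete day of counts to the closest cluster centre.

    counts   : 1-D sequence of bin counts, np.nan where a bin is missing.
    profiles : (n_clusters, n_bins) array; each row is assumed to hold the
               fraction of the daily total in each time bin (rows sum to 1).

    Returns (best_cluster_index, hypothetical_daily_total, deviations).
    """
    counts = np.asarray(counts, dtype=float)
    profiles = np.asarray(profiles, dtype=float)

    observed = ~np.isnan(counts)
    if not observed.any():
        raise ValueError("no observed time bins for this day")

    obs_counts = counts[observed]
    obs_profiles = profiles[:, observed]

    # Share of the daily total that each profile places in the observed bins.
    share = obs_profiles.sum(axis=1)
    valid = share > 0

    # Hypothetical daily total if the day followed each profile.
    totals = np.full(profiles.shape[0], np.nan)
    totals[valid] = obs_counts.sum() / share[valid]

    # Expected counts in the observed bins under each profile, and the
    # summed absolute deviation from what was actually observed
    # (absolute deviation is an assumed choice of distance).
    expected = obs_profiles * totals[:, None]
    deviations = np.abs(expected - obs_counts).sum(axis=1)
    deviations[~valid] = np.inf  # profile has no volume in observed bins

    best = int(np.argmin(deviations))
    return best, float(totals[best]), deviations


# Toy example: two 4-bin profiles, only the middle two bins observed.
profiles = [[0.1, 0.2, 0.3, 0.4],
            [0.4, 0.3, 0.2, 0.1]]
best, total, deviations = assign_incomplete_day(
    [np.nan, 20, 30, np.nan], profiles
)
```

In the toy example the observed bins rise from 20 to 30, which matches the shape of the first profile, so the first profile has the smaller deviation.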

Caveat: when fewer than 50% of the time bins in a day are filled, the shape of the counts could be misleading. For example, the PM peak could be lower than, higher than, or comparable to the AM peak, but without AM counts the day will be assigned to the cluster centre with the most similar PM volume profile shape.
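One way to make this caveat visible is to flag days that fall below the 50% threshold before they are assigned. The helper names `filled_fraction` and `is_low_coverage` below are hypothetical, and missing bins are assumed to be stored as `NaN`; this is a sketch, not existing code from the repository.

```python
import numpy as np


def filled_fraction(counts):
    """Fraction of time bins in a day that have a count (non-NaN)."""
    counts = np.asarray(counts, dtype=float)
    if counts.size == 0:
        raise ValueError("day has no time bins")
    return float(np.count_nonzero(~np.isnan(counts))) / counts.size


def is_low_coverage(counts, min_fraction=0.5):
    """True when fewer than min_fraction of the bins are filled."""
    return filled_fraction(counts) < min_fraction


# A day with only one of four bins filled falls under the caveat.
pm_only_day = [np.nan, np.nan, np.nan, 5]
flagged = is_low_coverage(pm_only_day)
```

Flagged days could still be assigned to a cluster, but their assignment would be known to rest on a partial profile shape.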

This algorithm could potentially be improved by adding spatial interpolation.

CityofToronto / bdit_volumes, issue #39