CityofToronto / bdit_flashcrow

Working repository for MOVE, a project to modernize transportation data systems at the City of Toronto.
MIT License
10 stars 2 forks source link

Multi-day counts show up as separate single-day counts #201

Closed candu closed 4 years ago

candu commented 5 years ago

What Happened When viewing any location with a multi-day ATR count, the count shows up as separate single-day counts, one per day.

What Should Happen ATR counts on consecutive days with the same arterycode should be treated as a single multi-count.

To Reproduce Steps to reproduce the bug:

  1. Go to "View Map";
  2. Select any location for which there is a multi-day ATR count;
  3. In the counts table, hover over the date dropdown.

At this point, you'll see separate counts listed for each day of the multi-day ATR count.

Additional Notes It's unclear what to do for RESCU or other permanent station counts, where locations can have over 10k counts. We may need to continue treating those as individual single-day counts.

This may involve some ETL pipeline work as well, e.g. if we want to provide a MATERIALIZED VIEW that solves this problem. In that case, we'd also need to modify existing pipelines to TRUNCATE.

candu commented 5 years ago

Realized an additional wrinkle here: we need some way to fetch all data for a multi-day count. This also implies reworking our report logic to handle multi-day data, especially where that's intended to be broken down into daily tables.

That probably means having some kind of relation between COUNT_INFO_ID and a "multi-day count ID". However, updating that relation as new counts come in is non-trivial, especially considering multi-day counts in progress.

For this reason, this is a much larger task than originally thought.

candu commented 5 years ago

After further discussion (thanks @aharpalaniTO !) there's a possible way forward in the interim: we ignore RESCU / permanent station counts for now, perform the grouping as previously described (either in application or in database), and return a list of the COUNT_INFO_IDs related to the multi-day count from the REST API.

@aharpalaniTO also pointed out that, if we do create a view of multi-day counts, we could use the MOVE request ID as the "multi-day count ID" (or otherwise link those requests to the multi-day counts).

As discussed, one possible way to build the multi-day count view:

candu commented 4 years ago

Starting spike on multiday count MATERIALIZED VIEW implementation; see PostgreSQL Capabilities on Notion for notes there.