Open kengodleskidot opened 4 months ago
@mmmiah and I will review the 'int_performance__detector_metrics_agg_five_minutes' model to see how we can incorporate a check for high flow values and how to handle this scenario with imputed data. @thehanggit will perform some statistical analysis on the flow data so we can come up with a high flow threshold to use in the logic added to that model.
Marking this as paused for now. Work will continue after #302 is completed.
@thehanggit is going to talk to @ZhenyuZhu-Caltrans about this one. It may make more sense to move this into the project going on with the AAE team, since they may want to use machine learning techniques for this.
When reviewing aggregations of the 30-second (raw) data to 5-minute intervals, we have come across observed flow values that are extremely high (500+ vehicles / lane / 5 minutes). Below is an analysis of the aggregated 5-minute data for detectors in lane 1 over a one-month time period as an example:
I then reviewed the data (raw and 5-minute) in current PeMS to see how high flow values are being handled and observed the following:
We do not currently have a diagnostic test for mainline and HOV stations that determines whether high flow values at the detector level should result in a Bad detector status. To help identify issues associated with excessively high flow values, I will create a new diagnostic test that counts high flow values for mainline and HOV lanes, similar to the existing High Occupancy diagnostic test. We will also want to add logic to the final 5-minute aggregate table that checks the observed flow value and replaces excessively high values with either the imputed value or the lane-level max capacity value.
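To make the two proposed pieces concrete, here is a rough Python sketch of the logic (the real implementation would presumably live in the SQL models). Everything named here is hypothetical: the threshold of 250 vehicles / lane / 5 minutes, the limit of 20 high-flow samples, the `FiveMinuteRow` fields, and the choice to prefer the imputed value over the capacity cap are placeholders until the statistical analysis settles on real numbers.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Placeholder values for illustration only; the actual threshold is still
# to be determined by the statistical analysis discussed in this issue.
HIGH_FLOW_THRESHOLD = 250.0   # vehicles / lane / 5 minutes (assumed)
MAX_HIGH_FLOW_SAMPLES = 20    # high-flow samples allowed before flagging (assumed)


@dataclass
class FiveMinuteRow:
    """One 5-minute aggregate for a single detector (hypothetical shape)."""
    detector_id: str
    observed_flow: Optional[float]  # vehicles / lane / 5 minutes
    imputed_flow: Optional[float]   # None when no imputed value is available


def is_high_flow_detector(rows: Iterable[FiveMinuteRow]) -> bool:
    """Diagnostic: flag a detector when too many samples exceed the threshold."""
    high_samples = sum(
        1
        for row in rows
        if row.observed_flow is not None
        and row.observed_flow > HIGH_FLOW_THRESHOLD
    )
    return high_samples > MAX_HIGH_FLOW_SAMPLES


def corrected_flow(
    row: FiveMinuteRow, lane_capacity: float = HIGH_FLOW_THRESHOLD
) -> Optional[float]:
    """Replace an excessively high observed flow.

    Assumed preference order: imputed value first, lane capacity as fallback.
    """
    if row.observed_flow is None or row.observed_flow <= HIGH_FLOW_THRESHOLD:
        return row.observed_flow  # value is plausible (or missing); keep as-is
    if row.imputed_flow is not None:
        return row.imputed_flow
    return lane_capacity
```

With these placeholder values, a row with an observed flow of 500 and an imputed flow of 120 would be corrected to 120, and a detector with more than 20 such rows in the evaluated window would be flagged by the diagnostic.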