cagov / caldata-mdsa-caltrans-pems

CalData's MDSA project with Caltrans on Performance Measurement System (PeMS) data
https://cagov.github.io/caldata-mdsa-caltrans-pems/
MIT License

Meta Issue: List of PeMS Data QC Visualization Tasks #413

Open mmmiah opened 1 month ago

mmmiah commented 1 month ago

We have created several datasets along with a QC data table. However, it is difficult to understand what this QC looks like, so we need to monitor data QC through visualization. I am proposing the list of QC visualizations below to ensure high-quality data production in PeMS.

  1. A dashboard showing the daily number of detectors side by side (possibly as a bar chart) from the metadata and from the five-minute detector models with and without imputation. The bar value can be the absolute number of detectors or the percentage difference between the counts. The dashboard can be built dynamically from the last seven days of data relative to the current date, which will provide temporal data quality assurance. It can be further broken down by district, route, county, and other metadata categories to give a spatial understanding of data quality. Issue #425
  2. Similar to the detector-level QC dashboard, the station-level daily number of detectors for the last seven days (rolling) can be visualized to ensure data quality at the station level. Issue #427
  3. Compare the number of good and bad (by detailed category) detectors reported in INT_DIAGNOSTICS__REAL_DETECTOR_STATUS over consecutive time periods. For example, show the number of detectors for each error type over seven days to spot any unusual detector status. Issue #428
  4. Comparing only the number of detectors or stations over space and time does not fully guarantee good data quality. To ensure good data quality, we need a long-term visualization plan for some of the metrics, both spatially and temporally, to understand their patterns and outliers. The subcategories for some of those visualizations are:
    a. HOV vs. mainline performance evaluation in terms of volume, speed, and occupancy as a time series for the most recent week. Issue #429
    b. Detector/station-level imputed vs. observed volume, occupancy, VMT, and VHT line charts for a week to spot any unusual observed or imputed values. Issue #430
    c. Detector/station-level AADT for the last two to three years to check whether there is an abnormal drop in AADT between consecutive years for the same station or detector. This AADT check will cover the other micro-level QC between the annual and daily levels. If the bigger picture looks good, then the smaller pictures should be fine. Issue #431
    d. Compare the 5-minute weekday and weekend volume, speed, and occupancy between the old PeMS and the new PeMS. #464
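To make items 1 and 4c more concrete, here is a rough Python sketch of the two checks. Everything in it is an assumption for illustration: the function names, the dict-based inputs (date or year mapped to a value), the choice to end the seven-day window on the day before the current date, and the 30% drop threshold are all hypothetical and are not taken from the actual PeMS models.

```python
from datetime import date, timedelta


def last_seven_days(today):
    """Return the seven dates ending the day before `today`, oldest first.

    Ending on yesterday is an assumption (today's data may be incomplete).
    """
    return [today - timedelta(days=n) for n in range(7, 0, -1)]


def compare_daily_counts(metadata_counts, model_counts, today):
    """Compare daily detector counts from two sources over the last seven days.

    Both inputs are hypothetical dicts mapping a date to a detector count.
    Returns one row per day with the absolute and percentage difference.
    """
    rows = []
    for day in last_seven_days(today):
        expected = metadata_counts.get(day, 0)
        observed = model_counts.get(day, 0)
        diff = expected - observed
        # The percentage is undefined when the metadata reports no detectors.
        pct = round(100.0 * diff / expected, 2) if expected else None
        rows.append({"date": day, "metadata": expected, "model": observed,
                     "diff": diff, "pct_diff": pct})
    return rows


def flag_aadt_drops(aadt_by_year, max_drop_pct=30.0):
    """Flag consecutive-year AADT drops larger than `max_drop_pct` percent.

    `aadt_by_year` is a hypothetical dict mapping a year to the AADT of one
    station or detector. Returns a list of (prev_year, year, drop_pct).
    """
    flagged = []
    years = sorted(aadt_by_year)
    for prev, curr in zip(years, years[1:]):
        if curr - prev != 1:
            continue  # only compare truly consecutive years
        before, after = aadt_by_year[prev], aadt_by_year[curr]
        if before > 0:
            drop = 100.0 * (before - after) / before
            if drop > max_drop_pct:
                flagged.append((prev, curr, round(drop, 2)))
    return flagged
```

The rows from `compare_daily_counts` could feed the side-by-side bar chart directly, with either `diff` or `pct_diff` as the bar value, and the same function could be run per district, route, or county for the spatial breakdown.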

I will add more later. Feel free to add any other QC visualization ideas you can think of.

jkarpen commented 1 month ago

@mmmiah will create separate issues for these tasks for tracking purposes.