cagov / caldata-mdsa-caltrans-pems

CalData's MDSA project with Caltrans on Performance Measurement System (PeMS) data
https://cagov.github.io/caldata-mdsa-caltrans-pems/
MIT License

Meta Issue: QA/QC Checks on performance metrics #393

Open jkarpen opened 2 months ago

jkarpen commented 2 months ago

As part of the data quality measures implemented for this project, we have developed SQL code that performs row counts and other data quality checks. These checks confirm that each model contains the correct number of records and that the values in each model meet the relevant data quality measures. While a number of tests are implemented in the yml files, we have also developed SQL worksheets in Snowflake that perform checks not covered by the yml files. So far, worksheets have been developed for the intermediate diagnostic, clearinghouse, imputation, and performance schemas.
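For readers unfamiliar with this kind of check, here is a minimal sketch of a row count comparison and a null-value check. It is not the project's actual worksheet code: it uses Python's built-in `sqlite3` as a stand-in for Snowflake, and the table names (`upstream_model`, `downstream_model`) and columns (`station_id`, `sample_date`, `volume`) are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Snowflake (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical upstream and downstream models; names are illustrative only.
cur.execute(
    "CREATE TABLE upstream_model (station_id INTEGER, sample_date TEXT, volume REAL)"
)
cur.execute(
    "CREATE TABLE downstream_model (station_id INTEGER, sample_date TEXT, volume REAL)"
)

rows = [
    (1, "2024-01-01", 120.0),
    (1, "2024-01-02", 98.0),
    (2, "2024-01-01", None),  # missing volume, caught by the null check
]
cur.executemany("INSERT INTO upstream_model VALUES (?, ?, ?)", rows)
# Load only two of the three rows downstream to simulate a dropped record.
cur.executemany("INSERT INTO downstream_model VALUES (?, ?, ?)", rows[:2])

# Row count check: both models are expected to hold the same number of records.
ROW_COUNT_CHECK = """
SELECT
    (SELECT COUNT(*) FROM upstream_model)   AS upstream_rows,
    (SELECT COUNT(*) FROM downstream_model) AS downstream_rows
"""
upstream_rows, downstream_rows = cur.execute(ROW_COUNT_CHECK).fetchone()
row_count_matches = upstream_rows == downstream_rows

# Value check: count records whose volume is missing.
NULL_CHECK = "SELECT COUNT(*) FROM upstream_model WHERE volume IS NULL"
null_volume_rows = cur.execute(NULL_CHECK).fetchone()[0]

print(f"upstream={upstream_rows} downstream={downstream_rows} match={row_count_matches}")
print(f"rows with null volume: {null_volume_rows}")
```

A check like this that proves stable in a worksheet is a natural candidate for promotion to a test in the yml files.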

The results of these data quality checks can be used to implement additional tests in the yml files and to compare results against other data sources. Below are some tests that need to be implemented and verified. Each will be tracked as its own issue, with additional details as needed:

jkarpen commented 2 months ago

Putting in Sprint 2024-19 for now but may get pushed to 2024-20 depending on how long the speed calculation takes to complete.