grgmiller closed this 2 years ago
In addition to adding BA-level data quality metrics, I also added a notebook that can be used to explore stats for generators that do not report to CEMS and how they might differ from those that do. A lot of this work is based on feedback I got during peer review of my dissertation chapter.
This PR also fixes several other bugs that were introduced in other PRs that have been merged into the v0.1.2 branch:
The shaped_eia_data df only contains fleet-level data, not subplant data, so this validation check won't work. I set the validation parameter back to False when combining the hourly data and added a note explaining why. (Closes https://github.com/singularity-energy/open-grid-emissions/issues/231)
output_data.write_plant_metadata()
Previously, all of our data quality metrics only provided information about the quality of data for the entire country. However, the quality of data in individual BAs could vary quite widely.
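To illustrate how a national figure can hide variation between BAs, here is a minimal sketch in plain Python. The record layout, the BA codes, the "cems"/"eia" source labels, and the helper `share_from_source` are all hypothetical and are not taken from the project's code; the metric (share of generation backed by CEMS data) is just one example of a data quality metric.

```python
from collections import defaultdict

# Hypothetical records: (ba_code, data_source, net_generation_mwh).
# These values are made up purely for illustration.
records = [
    ("BA_A", "cems", 90.0),
    ("BA_A", "eia", 10.0),
    ("BA_B", "cems", 20.0),
    ("BA_B", "eia", 80.0),
]


def share_from_source(rows, source):
    """Return the fraction of total generation that comes from `source`."""
    total = sum(mwh for _, _, mwh in rows)
    matched = sum(mwh for _, src, mwh in rows if src == source)
    return matched / total if total else 0.0


# National metric: one number for every record.
national_share = share_from_source(records, "cems")

# BA-level metric: the same calculation, applied to each BA's records.
rows_by_ba = defaultdict(list)
for row in records:
    rows_by_ba[row[0]].append(row)

ba_shares = {
    ba: share_from_source(rows, "cems") for ba, rows in rows_by_ba.items()
}
```

In this made-up data the national share sits between the two BA-level shares, so the national number alone would not reveal that one BA relies far more heavily on non-CEMS data than the other.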
This PR is a work in progress, but will add BA-level metrics (in addition to the national metrics) to:
While working on this, I also discovered that some plants report data at multiple frequencies (https://github.com/singularity-energy/open-grid-emissions/issues/232), which was causing some of the metrics to not sum to 100% because there were missing frequency codes. This PR patches that by filling missing frequency codes with "multiple" (as opposed to "annual" or "monthly").
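The effect of the missing codes on the percentages can be sketched as follows. This is a simplified illustration in plain Python, not the project's implementation: the plant records and the helper `percent_by_frequency` are hypothetical, and `None` stands in for a missing frequency code.

```python
from collections import Counter

# Hypothetical plant records; None marks a plant with no single frequency code.
plants = [
    {"plant_id": 1, "frequency": "annual"},
    {"plant_id": 2, "frequency": "monthly"},
    {"plant_id": 3, "frequency": None},
    {"plant_id": 4, "frequency": "monthly"},
]


def percent_by_frequency(rows, fill_value=None):
    """Percent of plants per frequency code.

    Rows with a missing code are counted under `fill_value` when one is
    given, and are otherwise left out of every category.
    """
    counts = Counter()
    for row in rows:
        code = row["frequency"] if row["frequency"] is not None else fill_value
        if code is not None:
            counts[code] += 1
    return {code: 100.0 * n / len(rows) for code, n in counts.items()}


# Without a fill value, the plant with a missing code falls out of the totals.
before = percent_by_frequency(plants)

# Filling missing codes with "multiple" gives that plant a category.
after = percent_by_frequency(plants, fill_value="multiple")
```

Here `sum(before.values())` falls short of 100 because one plant belongs to no category, while `sum(after.values())` accounts for every plant.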