This PR merges several months of work to expand the historical coverage of the data back to 2005. All of this work has been previously reviewed.
NOTE: I am proposing that we merge directly into main and then rebase development to match main once v0.5.0 is released.
Draft release notes
Expands historical coverage of OGE to include monthly and annual data for 2005-2018 (#295 and #362)
In addition to the new data, users should expect changes to the existing 2019-2022 data: NOx and SO2 totals may change for some plants, net generation totals may change for some plants, data may change for CHP plants (see the "methodological updates" section for more details)
Inputs
Updates to use the most recent data version of PUDL (v2024.5.0). This includes a re-release of the 2022 EIA-923 data, which may change some of the 2022 results.
Updates reference tables, including the energy_source_groups file, the utility_name_ba_code_map file (#374), epa_eia_crosswalk_manual (#372), and emission_factors_for_co2_ch4_n2o (#377)
Outputs
All output files (those in the outputs/ directory) are now saved as compressed .csv.zip files instead of .csv files. This reduces the disk space used by the outputs folder from approximately 16GB to 2.5GB. (#366)
Expands the data in the plant_static_attributes table to include location data (lat/long, address) and nameplate capacity (#364, #382, #385), as well as commercial operation dates and retirement dates (#367). We also screen for and correct erroneous lat/long data (#368)
Fixes a bug where the "total" values in the outputs/annual_generation_averages_by_fuel file were not being calculated correctly
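For anyone updating downstream scripts for the new .csv.zip format, here is a minimal sketch of a round trip using only the standard library. The file name, the columns, and the assumption that each archive holds exactly one CSV member are all hypothetical and are not taken from the OGE codebase.

```python
import csv
import io
import tempfile
import zipfile
from pathlib import Path


def write_csv_zip(path: Path, rows: list[dict]) -> None:
    """Write rows to a single-member zip archive (stand-in for an output file)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    member_name = path.name.removesuffix(".zip")  # "x.csv.zip" -> "x.csv"
    with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(member_name, buffer.getvalue())


def read_csv_zip(path: Path) -> list[dict]:
    """Read the first (assumed only) CSV member of a .csv.zip archive."""
    with zipfile.ZipFile(path) as archive:
        member_name = archive.namelist()[0]
        with archive.open(member_name) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


# Hypothetical file name and columns, for illustration only.
with tempfile.TemporaryDirectory() as tmp:
    example_path = Path(tmp) / "example_table.csv.zip"
    write_csv_zip(example_path, [{"plant_id": "1", "net_generation_mwh": "10.5"}])
    loaded = read_csv_zip(example_path)
```

If pandas is already in use, `pandas.read_csv` should also be able to open these files directly, since it infers zip compression from the `.zip` suffix when the archive contains a single file.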
Methodological updates
We previously calculated the electric allocation factors for combined heat and power (CHP) plants at the generator level. This introduced bugs for certain combined cycle units where fuel and generation are reported for different generators at the same subplant. We now calculate this factor at the subplant level (#363)
Fixes several bugs with the gross-to-net generation (GTN) conversions where anomalous fleet-average ratios were being introduced, and default factors were not being mapped to certain generators. Also fixes a bug where GTN ratios were being calculated when gross generation or net generation data was missing. (#370, #375, #383)
Updates uncontrolled NOx and SO2 factors to align assumptions with those used by the EIA Electric Power Annual, and to fix a bug where we were adjusting the SO2 values for fluidized bed boilers, even though the control efficiencies are already incorporated into the uncontrolled emission factors (#373). In addition, because fuel sulfur content data is not available pre-2008, we use sulfur content values averaged from 2008-2012 to backfill the missing data. When calculating backstop values for missing values in any year, we now use state-specific values (rather than national-average) to reflect differences in the sulfur contents of fuels being delivered in specific parts of the country (#376)
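The sulfur content backfill described above can be sketched roughly as follows. This is an illustration in plain Python, not the OGE implementation: the record layout, the function name, and the values are hypothetical, and both the state-level backfill for pre-2008 years and the fallback to a national mean for states with no data in the window are assumptions made for the example.

```python
from statistics import mean

# Hypothetical records: (year, state, sulfur_content_pct); None marks missing data.
records = [
    (2008, "TX", 1.0),
    (2010, "TX", 2.0),
    (2012, "TX", 3.0),
    (2009, "WY", 0.4),
    (2011, "WY", 0.6),
    (2015, "TX", 9.0),   # outside the 2008-2012 window, so not used in the means
    (2006, "TX", None),  # pre-2008: no sulfur content data available
    (2006, "WY", None),
    (2006, "OH", None),  # state with no data in the window (assumed fallback case)
]

BACKFILL_YEARS = range(2008, 2013)  # 2008-2012 inclusive


def backfill_sulfur(records):
    """Fill missing values with the state's 2008-2012 mean, else the national mean."""
    window = [(s, v) for (y, s, v) in records if v is not None and y in BACKFILL_YEARS]
    national_mean = mean(v for _, v in window)
    state_values = {}
    for state, value in window:
        state_values.setdefault(state, []).append(value)
    state_means = {state: mean(vals) for state, vals in state_values.items()}
    return [
        (y, s, v if v is not None else state_means.get(s, national_mean))
        for (y, s, v) in records
    ]


filled = backfill_sulfur(records)
```

Reported values are left untouched; only the `None` entries are replaced, and the state-specific mean takes priority over the national one.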
Other fixes
Removes the option to run the EIA-923 allocation at the plant level. This was an artifact that was no longer used (#361)
Cleans up function typehints and continues converting docstrings to Google format
Updates where files are stored in and accessed from S3 (#384)