singularity-energy / open-grid-emissions

Tools for producing high-quality hourly generation and emissions data for U.S. electric grids
MIT License

Fix function calculating averages of the fuel types #371

Closed rouille closed 5 days ago

rouille commented 6 days ago

Purpose

Fix function that calculates the averages of the fuel types. Closes CAR-4205.

What the code is doing

Add a new row that summarizes all fuel categories as follows:

Testing

Ran the 2005 pipeline

Where to look

Relevant changes are in the write_generated_averages function. Other edits add type hints and docstrings to functions in the oge.output_data module.

Usage Example/Visuals

N/A

Review estimate

5min

Future work

N/A

Checklist

grgmiller commented 5 days ago

@rouille could you please post some example screenshots of the new outputs with this fix just to double check that everything is being calculated correctly?

rouille commented 5 days ago

> @rouille could you please post some example screenshots of the new outputs with this fix just to double check that everything is being calculated correctly?

Here it is for 2005. There is a nan row coming from plant 13213 in Mississippi; see the plant attributes below:

>>> psa = pd.read_csv("plant_static_attributes_2005.csv.zip")
>>> psa[psa["fuel_category"].isna()]
      plant_id_eia       plant_name_eia  capacity_mw plant_primary_fuel fuel_category fuel_category_eia930 state county city ba_code ba_code_physical  latitude  longitude plant_operating_date plant_retirement_date  distribution_flag         timezone data_availability  shaped_plant_id
3389         13213  BTEC New Albany LLC          NaN                NaN           NaN                  NaN    MS    NaN  NaN     NaN              NaN   34.5411   -88.9422                  NaN                   NaN               True  America/Chicago         cems_only              NaN

This is fixed in #368 and will be taken care of once we rebase this PR. Nevertheless, the calculation is correct. Since the data frame is too wide for a screenshot, I am attaching the file instead: annual_generation_averages_by_fuel_2005.csv
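
For readers following along, here is a minimal sketch of how a plant with a missing fuel_category can surface as a nan row in a per-fuel aggregation. It assumes the grouping keeps missing keys (pandas `dropna=False`), which this thread does not confirm. Plant id 13213 is taken from the output above; the other plant ids, the column name net_generation_mwh, and all values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical plant-level data: only plant 13213 and its missing
# fuel_category come from the thread; everything else is illustrative.
plants = pd.DataFrame(
    {
        "plant_id_eia": [1, 2, 13213],
        "fuel_category": ["coal", "natural_gas", np.nan],
        "net_generation_mwh": [100.0, 200.0, 50.0],
    }
)

# Default behavior: rows whose group key is NaN are silently dropped.
default = plants.groupby("fuel_category")["net_generation_mwh"].sum()

# With dropna=False the NaN key is kept as its own group,
# which shows up as a nan row in the output.
with_nan = plants.groupby("fuel_category", dropna=False)["net_generation_mwh"].sum()

print(default)
print(with_nan)
```

Either way, the plant's generation is not attributed to any real fuel category until its fuel_category is filled in.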

grgmiller commented 5 days ago

The "generated" rate columns are now correct, but we just want to also fix the numerical columns

rouille commented 5 days ago

> The "generated" rate columns are now correct, but we just want to also fix the numerical columns

Should we change the file name then? annual_generation_averages_by_fuel_2005.csv is confusing in my opinion: I would expect to find average values for all columns, including the absolute ones.

grgmiller commented 5 days ago

> Should we change the file name then

I think "averages" here refers to fuel-average emissions factors rather than averages across fuels. The total row is a sum, just as each fuel row is a sum over specific plants (we are not calculating the average net generation across all coal plants), and it is the generated_rate columns that are the averages, computed from those sums.

If we change the file name, we would just need to track down where we are using this and change the file name there as well.
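
To make the distinction above concrete, here is a minimal sketch, not the actual write_generated_averages implementation. The column names (net_generation_mwh, co2_mass_lb, generated_co2_rate_lb_per_mwh) and all values are assumptions chosen for illustration. It shows the absolute columns being summed, the total row being a sum of the fuel rows, and the rate being derived from those sums instead of averaged across rows.

```python
import pandas as pd

# Hypothetical plant-level annual totals; names and values are illustrative only.
plants = pd.DataFrame(
    {
        "fuel_category": ["coal", "coal", "natural_gas", "natural_gas"],
        "net_generation_mwh": [100.0, 300.0, 200.0, 400.0],
        "co2_mass_lb": [200_000.0, 660_000.0, 180_000.0, 320_000.0],
    }
)

# Each fuel row is a sum over the plants in that category.
by_fuel = plants.groupby("fuel_category")[["net_generation_mwh", "co2_mass_lb"]].sum()

# Averaging the per-fuel rates would give a different (unweighted) number.
mean_of_rates = (by_fuel["co2_mass_lb"] / by_fuel["net_generation_mwh"]).mean()

# The total row is likewise a sum, here over all fuel rows.
by_fuel.loc["total"] = by_fuel.sum()

# The rate is computed from the summed columns, row by row.
by_fuel["generated_co2_rate_lb_per_mwh"] = (
    by_fuel["co2_mass_lb"] / by_fuel["net_generation_mwh"]
)

print(by_fuel)
print(mean_of_rates)
```

In this toy data the total-row rate is the generation-weighted emission factor and differs from the plain mean of the two fuel rates, which is why the total row should be built from sums.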

rouille commented 5 days ago

> The "generated" rate columns are now correct, but we just want to also fix the numerical columns

Done. See the attached file: annual_generation_averages_by_fuel_2005.csv