Open gschivley opened 1 year ago
Forgot to include above. This is using v0.5.0 of the software and database.
❯ conda list pudl
# packages in environment at /opt/miniconda3/envs/powergenome:
#
# Name Version Build Channel
catalystcoop.pudl 0.5.0 pyhd8ed1ab_3 conda-forge
Describe the bug
A number of small plants appear to report all of their generation/fuel consumption for at least some units in a single month of the year (e.g., 10805, 50748, 50850, 56356, 57666, and 58161). When calculating unit heat rates via pudl_out, the generation and boiler fuel have the function pudl.helpers.sum_na applied. This function returns N/A when any of the values in a year are N/A. With all fuel/generation reported in a single month, it isn't possible to get valid data for these plants.
Bug Severity
Medium: The units tend to be small and I can generally work around it. But it was a pain to figure out why some units don't have a valid heat rate in any year of the data set.
To Reproduce
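A minimal synthetic sketch of the behavior described under "Describe the bug", not the actual pudl_out code path. The column names, the sample values, and the helper `sum_keep_na` are hypothetical. The helper only mimics the described behavior of pudl.helpers.sum_na (any missing month makes the annual total missing) by using pandas' `skipna=False`.

```python
import numpy as np
import pandas as pd

# Hypothetical unit that reports all generation and fuel in December only.
monthly = pd.DataFrame(
    {
        "report_date": pd.date_range("2019-01-01", periods=12, freq="MS"),
        "net_generation_mwh": [np.nan] * 11 + [1200.0],
        "fuel_consumed_mmbtu": [np.nan] * 11 + [12600.0],
    }
)


def sum_keep_na(values: pd.Series) -> float:
    """Stand-in for the described behavior: one missing value makes the sum missing."""
    return values.sum(skipna=False)


# Aggregate the monthly records to annual totals.
annual = monthly.groupby(monthly["report_date"].dt.year)[
    ["net_generation_mwh", "fuel_consumed_mmbtu"]
].agg(sum_keep_na)

# Both annual totals are missing, so the heat rate is missing as well.
heat_rate = annual["fuel_consumed_mmbtu"] / annual["net_generation_mwh"]
print(annual)
print(heat_rate)
```

Even though December holds a complete record for the year, the eleven empty months propagate into the annual totals and no heat rate can be computed.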
Expected behavior
I'm not sure how to best generalize, but if both the generation and boiler fuel are not reported in a month (or in the first 11 months of a year?) then drop them and only keep the valid months of data before aggregating.
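One possible reading of this suggestion, sketched on synthetic data: drop only the months in which both generation and fuel are missing, then aggregate what remains with the same strict sum. The function name `annual_heat_rate`, the column names, and the sample values are hypothetical, and this is not a proposed patch to PUDL itself.

```python
import numpy as np
import pandas as pd

VALUE_COLS = ["net_generation_mwh", "fuel_consumed_mmbtu"]


def strict_sum(values: pd.Series) -> float:
    """Sum that stays missing if any remaining value is missing."""
    return values.sum(skipna=False)


def annual_heat_rate(monthly: pd.DataFrame) -> pd.DataFrame:
    """Drop months where BOTH values are missing, then aggregate by year."""
    both_missing = monthly[VALUE_COLS].isna().all(axis=1)
    valid = monthly.loc[~both_missing]
    annual = valid.groupby(valid["report_date"].dt.year)[VALUE_COLS].agg(strict_sum)
    annual["heat_rate_mmbtu_mwh"] = (
        annual["fuel_consumed_mmbtu"] / annual["net_generation_mwh"]
    )
    return annual


# Hypothetical unit that reports everything in December.
monthly = pd.DataFrame(
    {
        "report_date": pd.date_range("2019-01-01", periods=12, freq="MS"),
        "net_generation_mwh": [np.nan] * 11 + [1200.0],
        "fuel_consumed_mmbtu": [np.nan] * 11 + [12600.0],
    }
)

result = annual_heat_rate(monthly)
print(result)
```

Because the strict sum is kept for the remaining rows, a month that reports generation but not fuel (or the reverse) still yields a missing annual value, so partially reported months are not silently treated as zero.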