catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License
456 stars, 106 forks

Filled vs. unfilled pudl_out.gen_eia923() have different columns #860

Closed: zaneselvans closed this issue 2 years ago

zaneselvans commented 3 years ago

We have the option of filling in missing net_generation_mwh values in the pudl_out.gen_eia923() method, which is enabled by setting fill_net_gen=True when creating the pudl_out object.

The dataframe returned by pudl_out.gen_eia923() currently has a different shape and different columns depending on whether filling is enabled. If the idea is to provide two versions of the same output, one with the data as reported and another with missing values filled in, both should have the same form so they can be used interchangeably.
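A quick way to check for this kind of mismatch is to compare the column names of the two outputs. The sketch below is only an illustration: column_diff is a hypothetical helper, not part of PUDL, and the column lists are made up for the example rather than copied from the real gen_eia923() output. With real dataframes you would pass unfilled.columns and filled.columns instead.

```python
def column_diff(left_cols, right_cols):
    """Report column names that appear in only one of two tables.

    Hypothetical helper for illustration; not part of PUDL.
    Accepts any iterable of column names (e.g. a pandas Index).
    """
    left, right = list(left_cols), list(right_cols)
    return {
        "only_left": [c for c in left if c not in right],
        "only_right": [c for c in right if c not in left],
        # True only if both tables have the same columns in the same order.
        "same_order": left == right,
    }


# Illustrative column lists only, NOT the actual gen_eia923() columns.
unfilled_cols = ["report_date", "plant_id_eia", "generator_id", "net_generation_mwh"]
filled_cols = unfilled_cols + ["fuel_consumed_mmbtu"]

diff = column_diff(unfilled_cols, filled_cols)
print(diff)
```

If the two outputs are meant to be interchangeable, both only_left and only_right should come back empty.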

Here's what they look like right now:

Unfilled:

Filled:

cmgosnell commented 3 years ago

haha i forgot a gen = here.

But I didn't know what to do about the fuel_consumed_mmbtu column. It is useful to have this allocated gen-based consumption but it normally lives in the generation fuel table.

cmgosnell commented 2 years ago

done