catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Transform FERC-714 load forecast table #3519

Open zaneselvans opened 3 months ago

zaneselvans commented 3 months ago

Transform the existing FERC-714 raw planning area demand forecast table into a new core asset.

With the development environment set up and the raw_ferc714 asset group materialized, you should be able to access the raw data in a notebook with:

from pudl.etl import defs
from dagster import AssetKey
raw_forecast = defs.load_asset_value(AssetKey("raw_ferc714__yearly_planning_area_forecast_demand"))

Other relevant tables:

The PUDL Data Dictionary contains table and column level documentation.

Why would this be useful?

This table contains the data necessary to reproduce and automate the kind of load growth forecast analysis that RMI highlighted in this 2017 post. That would be especially useful for casting doubt on the wild load growth forecasts that many utilities are using to justify building tons of new gas capacity. Yes, utilities are forecasting high load growth. But they're always forecasting high load growth, and they are systematically biased toward predicting load growth that is much higher than what actually materializes.

After this is done...

Once the old 2006-2020 DBF/CSV-derived data for this table has been integrated, we can look at pulling in the 2021 and later XBRL data for FERC-714 to extend the time series to the present.

seeess1 commented 1 month ago

I'd like to take this one! @catalyst-cooperative/com-dev

zaneselvans commented 1 month ago

Okay @seeess1 I've assigned it to you. Let us know if anything gets weird in there or doesn't make sense! We usually like to make draft PRs early so others can see how changes are evolving and offer help if something seems to be going in an unexpected direction.

zaneselvans commented 4 weeks ago

The best place to look for an analogous integration of a FERC-714 asset is probably pudl.transform.ferc714.out_ferc714__hourly_planning_area_demand() -- and that module is where this demand projection asset should also be defined.

seeess1 commented 1 week ago

@zaneselvans any recommendations on the desired output format that we want for the yearly demand forecast data? Right now I'm just removing some footnote columns, renaming columns, and dropping bad respondents (using the same method utilized by the hourly demand transformation logic). So at the moment the output data just looks like:

respondent_id_ferc714,report_year,plan_year,summer_forecast,winter_forecast,net_energy_forecast
284,2019,2020,559,492,3074969
284,2019,2021,572,544,3175042
284,2019,2022,580,552,3219823

Do we want a different format besides this?
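For anyone following along, the three cleaning steps described above (remove footnote columns, rename columns, drop bad respondents) could be sketched roughly like this. It is only an illustration in plain Python rather than the actual PUDL transform: the raw column names, the `_f` footnote suffix, the `RENAME` map, the `BAD_RESPONDENTS` set, and the respondent 999 row are all made up for the example.

```python
import csv
import io

# Hypothetical raw extract. The raw column names and the "_f" footnote
# suffix are assumptions for illustration, not the real raw schema.
# Respondent 999 is a made-up placeholder row for a "bad" respondent.
RAW_CSV = """respondent_id,report_yr,plan_year,summer_forecast,summer_forecast_f,winter_forecast,net_energy_forecast
284,2019,2020,559,0,492,3074969
999,2019,2020,1,0,1,1
284,2019,2021,572,0,544,3175042
"""

# Hypothetical rename map and bad-respondent set.
RENAME = {"respondent_id": "respondent_id_ferc714", "report_yr": "report_year"}
BAD_RESPONDENTS = {999}


def clean_forecast(rows):
    """Drop footnote columns, rename columns, and drop bad respondents."""
    cleaned = []
    for row in rows:
        # 1. Remove footnote columns.
        kept = {k: v for k, v in row.items() if not k.endswith("_f")}
        # 2. Rename the remaining columns to the target names.
        renamed = {RENAME.get(k, k): v for k, v in kept.items()}
        # 3. Skip respondents flagged as bad.
        if int(renamed["respondent_id_ferc714"]) in BAD_RESPONDENTS:
            continue
        cleaned.append(renamed)
    return cleaned


cleaned = clean_forecast(csv.DictReader(io.StringIO(RAW_CSV)))
for row in cleaned:
    print(row)
```

In the real module the same steps would presumably operate on a DataFrame and reuse whatever respondent filtering the hourly demand transform already applies.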

zaneselvans commented 1 week ago

That seems like a natural structure, with a primary key (PK) of (respondent_id_ferc714, report_year, plan_year).
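A quick way to sanity-check that proposed key is to count how often each (respondent_id_ferc714, report_year, plan_year) combination occurs and flag anything that appears more than once. The sketch below does that on the three sample rows posted above, using only the standard library. The helper name `duplicate_keys` is made up for this example and is not part of PUDL.

```python
import csv
import io
from collections import Counter

# Proposed primary key columns from the discussion above.
PK = ("respondent_id_ferc714", "report_year", "plan_year")

# The three sample rows posted earlier in this thread.
SAMPLE = """respondent_id_ferc714,report_year,plan_year,summer_forecast,winter_forecast,net_energy_forecast
284,2019,2020,559,492,3074969
284,2019,2021,572,544,3175042
284,2019,2022,580,552,3219823
"""


def duplicate_keys(rows, key_cols=PK):
    """Return the key tuples that occur more than once in rows."""
    counts = Counter(tuple(row[col] for col in key_cols) for row in rows)
    return [key for key, n in counts.items() if n > 1]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
dupes = duplicate_keys(rows)
print("duplicate keys:", dupes)
```

An empty list means the key is unique on this sample. The full table would need the same check, plus a check for nulls in the key columns.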

Looking at pudl.metadata.fields and our naming conventions, for consistency I think we might want to change some of these column names:

zaneselvans commented 1 week ago

The FERC-714 instructions also provide some context for the meaning of the table & columns, which will be useful for populating their descriptions.

seeess1 commented 1 week ago

Ohh! Thank you! Will make these updates and start working on updating the documentation for the new asset.