catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Expand ID Mapping to data from 2006-2016 for FERC1 & EIA923 #26

Closed by zaneselvans 6 years ago

zaneselvans commented 7 years ago

We can pull in FERC Form 1 data from 2004 onward. However, the id mapping "glue" tables that we have right now only encompass information from the 2015 FERC database. If we want to be able to work with multiple years of FERC data in the PUDL database, we need glue for those other years. Thus, the id mapping exercise needs to be expanded to pull in new associations between plants & utilities from prior FERC years.

If @cmgosnell & @swinter2011 can indicate which fields they need from the various plants tables to do the mapping, @zaneselvans can pull them from the 2004-2015 Form 1 DB for matching. If we can come up with a common set of columns for all the different types of plants, we could do it all in one go maybe?

swinter2011 commented 7 years ago

Here's what I've got for this - @cmgosnell can you confirm?

Large plants:

Small plants:

Hydro and pumped storage:

zaneselvans commented 7 years ago

But you're going to need the respondent_id so that it can be in the output. It's part of the primary key for plants, and it is the primary key for the ferc1 utilities.

cmgosnell commented 7 years ago

From FERC for Utilities: f1_respondent_id, respondent_name

From FERC for Plants: respondent_id, respondent_name, plant_name, report_year, (optional) kind_of_fuel, (optional) capacity_rating, (optional) plant_kind, (optional) tot_capacity
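
For illustration, here is a rough sketch of how the fields above could be collapsed into one row per plant across several report years, keyed on respondent_id plus plant_name as discussed earlier in the thread. The `plants` table, the `distinct_plants` helper, and the sample rows are all hypothetical stand-ins, not the actual FERC Form 1 schema or PUDL code, and only the non-optional columns are included.

```python
import sqlite3


def distinct_plants(conn, table, first_year, last_year):
    """Return one row per (respondent_id, plant_name) seen in the year range.

    `table` is interpolated into the SQL text, so pass trusted names only.
    """
    query = f"""
        SELECT respondent_id,
               MAX(respondent_name) AS respondent_name,
               plant_name,
               MIN(report_year) AS first_year,
               MAX(report_year) AS last_year
        FROM {table}
        WHERE report_year BETWEEN ? AND ?
        GROUP BY respondent_id, plant_name
        ORDER BY respondent_id, plant_name
    """
    return conn.execute(query, (first_year, last_year)).fetchall()


# Hypothetical in-memory table standing in for a FERC Form 1 plants table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE plants ("
    "respondent_id INTEGER, respondent_name TEXT, "
    "plant_name TEXT, report_year INTEGER)"
)
rows = [
    (1, "Utility A", "Plant X", 2014),
    (1, "Utility A", "Plant X", 2015),
    (1, "Utility A", "Plant Y", 2015),
    (2, "Utility B", "Plant X", 2003),  # outside the requested year range
]
conn.executemany("INSERT INTO plants VALUES (?, ?, ?, ?)", rows)

result = distinct_plants(conn, "plants", 2004, 2015)
for row in result:
    print(row)
```

Grouping on both respondent_id and plant_name keeps plants with the same name at different utilities apart, which is why the respondent_id has to travel with every plant record handed off for mapping.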

zaneselvans commented 7 years ago

This task is intimately linked with the EIA923 mapping, and we need the full list of plants and their mappings across all the years in order to be able to pull in and marry the two datasets... so I've updated the title and tags to reflect that.

zaneselvans commented 7 years ago

@cmgosnell @swinter2011 before the mapping for 2016 is finished we should pull the most recent 2016 923 data. I just downloaded it and it was last updated on Feb 23rd. There are also a few months of 2017 data, but maybe let's save that for v0.2, unless y'all think it would be really easy to integrate. What all will it take to bring a new year of data in? The most recent FERC Form 1 is still 2015.

zaneselvans commented 6 years ago

We decided to just use 2009-2016, since that's what works for EIA923. Done!