catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Integrate 2013 - present FERC EQR data #2607

Open bendnorman opened 1 year ago

bendnorman commented 1 year ago

What great mysteries does EQR hold? Let's liberate it and find out!

Definition of Done

Achieving level 1 would be hugely valuable, as the data is currently locked away as thousands of CSVs in nested zip files. Distributing a Parquet file for each entity (identities, contracts, and transactions) would be the first public consolidated version of the data.
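To make the "nested zip files" part concrete, here is a minimal sketch of walking an archive that contains further archives and streaming out the rows of every CSV inside. It is only an illustration under stated assumptions: the function name `iter_csv_rows` is hypothetical, UTF-8 encoding is assumed, and the real layout of the EQR archives may differ. Writing the collected rows out to Parquet is left out of the sketch.

```python
import csv
import io
import zipfile
from typing import BinaryIO, Iterator


def iter_csv_rows(
    archive: BinaryIO, path: str = ""
) -> Iterator[tuple[str, dict[str, str]]]:
    """Yield (member_path, row) for every CSV in a possibly nested zip.

    Hypothetical helper for illustration; not part of PUDL.
    """
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            member_path = f"{path}/{name}" if path else name
            lower = name.lower()
            if lower.endswith(".zip"):
                # Read the inner archive into memory so it is seekable,
                # then recurse into it.
                inner = io.BytesIO(zf.read(name))
                yield from iter_csv_rows(inner, member_path)
            elif lower.endswith(".csv"):
                with zf.open(name) as raw:
                    # Assumption: the CSVs are UTF-8 encoded.
                    text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                    for row in csv.DictReader(text):
                        yield member_path, row
```

Each yielded row carries the path of the CSV it came from, so rows could later be grouped by entity (identities, contracts, transactions) before being written out.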

It would be great to achieve level 2, but it involves an uncertain amount of data exploration and cleaning work. Levels 3 and 4 will likely require substantially more time and exploration.

This issue can be closed once level 1 is achieved.

- [ ] https://github.com/catalyst-cooperative/pudl-archiver/issues/31
- [ ] https://github.com/catalyst-cooperative/pudl/issues/2608
- [ ] https://github.com/catalyst-cooperative/pudl/issues/2609

zaneselvans commented 1 year ago

I think we'll need to get the data types settled in order to write it to Parquet files, so we might want to move the data type requirement into Level 1.
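As a rough illustration of why the types come first: Parquet stores a declared type for each column, while every value read from a CSV starts out as a string. A minimal sketch of casting raw rows against an agreed schema might look like the following. The column names and types here are hypothetical, picked only for the example; the real EQR columns are exactly what would need settling.

```python
from datetime import date
from typing import Callable

# Hypothetical column names and converters, for illustration only.
SCHEMA: dict[str, Callable[[str], object]] = {
    "seller_name": str,
    "trade_date": date.fromisoformat,
    "quantity": float,
}


def cast_row(row: dict[str, str]) -> dict[str, object]:
    """Convert raw CSV strings to typed values; blanks become None."""
    typed: dict[str, object] = {}
    for column, convert in SCHEMA.items():
        # Treat missing, None, and whitespace-only values as nulls.
        raw = (row.get(column) or "").strip()
        typed[column] = convert(raw) if raw else None
    return typed
```

A value that does not fit its declared type raises an error here rather than passing through silently, which is the kind of surprise that is better found before the files are published.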

zaneselvans commented 8 months ago

Note that FERC is looking to migrate the EQR to XBRL as well. Here's some news about their NOPR.