catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Make sure new XBRL extractor works with FERC explosion #2852

Closed. jdangerx closed this issue 1 year ago

jdangerx commented 1 year ago

There are a few changes for the ETL to handle the new, improved 2021 FERC 1 data - see #2810 for details.

However, there is at least one change that only applies to the FERC explosion branch - namely, with the new FERC data, we see real live facts which are totals across multiple dimensions.

@cmgosnell and I found a bug where our "make the implicit sums defined by total values explicit" logic wasn't quite doing what we wanted: we were getting duplicate rows, one for each dimension that has a total.

Concretely, that looks like:

A fact that is a total for both utility_type and plant_status ends up with two versions: one with all the utility types as components, and one with all the plant statuses as components.

We need to enshrine the correct behavior in tests and fix the bug.
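
A rough, standalone sketch of the duplication, in case it helps when writing the tests. This is not the PUDL code. The dimension values, the `example_factoid` name, and both helper functions are hypothetical, and the "joint" expansion is only one assumed reading of what the correct behavior might be (a single set of components covering every total dimension at once).

```python
from itertools import product

# Hypothetical dimension values; "total" marks an aggregate across a dimension.
DIMENSIONS = {
    "utility_type": ["electric", "gas"],
    "plant_status": ["in_service", "future"],
}


def expand_per_dimension(fact):
    """Mimic the reported bug: build one component set per total dimension."""
    versions = []
    for dim, values in DIMENSIONS.items():
        if fact[dim] == "total":
            versions.append([{**fact, dim: value} for value in values])
    return versions


def expand_jointly(fact):
    """Assumed fix: build a single component set across all total dimensions."""
    total_dims = [dim for dim in DIMENSIONS if fact[dim] == "total"]
    if not total_dims:
        return []
    combos = product(*(DIMENSIONS[dim] for dim in total_dims))
    return [[{**fact, **dict(zip(total_dims, combo))} for combo in combos]]


# A made-up fact that is a total across both dimensions.
fact = {
    "xbrl_factoid": "example_factoid",
    "utility_type": "total",
    "plant_status": "total",
}

per_dimension = expand_per_dimension(fact)  # two competing versions of one sum
joint = expand_jointly(fact)  # one version with four components
```

A test along these lines could assert that each total fact maps to exactly one set of components, which the per-dimension expansion fails for any fact with more than one total dimension.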

e-belfer commented 1 year ago

Done. The rest of the issues are addressed in #2810 now that explode_ferc1 has been merged into dev.