catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License
470 stars 108 forks

Standardize EIA column names for harvesting #1327

Open zaneselvans opened 2 years ago

zaneselvans commented 2 years ago

There are a number of column names which aren't currently standardized across the various EIA spreadsheet maps because they're preemptively dropped during the transform step. We need to stop doing that (see #509) and at the same time give them standard names that are part of the PUDL DB schema so they can be appropriately harvested.
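To make the idea concrete, here is a minimal sketch of renaming raw columns to standard names instead of dropping them. It is not PUDL's actual implementation: `COLUMN_MAP`, `standardize_columns`, `unmapped_columns`, and the raw column names are all hypothetical, and a record is modeled as a plain dict rather than a dataframe.

```python
# Hypothetical mapping: standard column name -> raw spreadsheet name per year.
# These entries are illustrative only, not copied from the real spreadsheet maps.
COLUMN_MAP = {
    "plant_name_eia": {2019: "Plant Name", 2020: "plant name"},
    "utility_id_eia": {2019: "Utility ID", 2020: "utility id"},
}


def _raw_to_standard(year, column_map):
    """Invert the map for one year: raw column name -> standard name."""
    return {
        raw_by_year[year]: standard
        for standard, raw_by_year in column_map.items()
        if year in raw_by_year
    }


def standardize_columns(record, year, column_map=COLUMN_MAP):
    """Rename known raw columns to standard names; keep everything else."""
    lookup = _raw_to_standard(year, column_map)
    return {lookup.get(col, col): value for col, value in record.items()}


def unmapped_columns(record, year, column_map=COLUMN_MAP):
    """List columns that still need a standard name, rather than dropping them."""
    lookup = _raw_to_standard(year, column_map)
    return sorted(col for col in record if col not in lookup)


raw_2019 = {"Plant Name": "Example Plant", "Utility ID": 123, "Extra": "x"}
standardized = standardize_columns(raw_2019, 2019)
still_unnamed = unmapped_columns(raw_2019, 2019)
```

Reporting the unmapped columns, rather than silently discarding them, gives a running list of names that still need to be added to the schema.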

Here's a partial list of columns that need to be fixed. It also includes some other column name standardizations beyond what's needed to clean up harvested columns. This should probably be done in conjunction with #509 and #1250:

zaneselvans commented 1 year ago

@knordback this issue might be helpful in the context of #509