catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License
456 stars 106 forks

Fix EIA year constraints #197

Closed cmgosnell closed 5 years ago

cmgosnell commented 5 years ago

You have to ingest a year before 2012 and 2016 because columns that show up in those years, but don't exist in other years, are manipulated later in the process. The DataFrame created by reading in the Excel file only contains the columns in the column maps. We need to force every DataFrame we read in to have the same set of columns. Those columns are already sitting in the column headers in the column maps.
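Here is a rough sketch of what forcing a common set of columns could look like with pandas. The `COLUMN_MAP` layout, the column names, and the `standardize_columns` helper are all made up for illustration and are not the actual PUDL column maps or functions. The idea is just to rename each year's headers to the canonical names and then reindex against the full list, so that columns missing in a given year come back filled with NaN.

```python
import pandas as pd

# Hypothetical column map (NOT the real PUDL structure):
# canonical column name -> {year: header used in that year's spreadsheet}.
COLUMN_MAP = {
    "plant_id": {2011: "Plant ID", 2012: "Plant Id"},
    "net_generation_mwh": {2011: "Net Gen", 2012: "Net Generation (MWh)"},
    "balancing_authority": {2012: "BA Code"},  # no such column in 2011
}


def standardize_columns(raw_df, year, column_map=COLUMN_MAP):
    """Rename one year's headers and force the full canonical column set."""
    # Build {spreadsheet header: canonical name} for the requested year only.
    rename = {
        by_year[year]: canonical
        for canonical, by_year in column_map.items()
        if year in by_year
    }
    df = raw_df.rename(columns=rename)
    # reindex adds any canonical column missing from this year (as NaN)
    # and drops anything that is not in the column map.
    return df.reindex(columns=list(column_map))


# A 2011 file that lacks the balancing authority column.
raw_2011 = pd.DataFrame({"Plant ID": [1, 2], "Net Gen": [10.0, 20.0]})
df_2011 = standardize_columns(raw_2011, 2011)
```

With something like this, later steps could reference `balancing_authority` regardless of which years were ingested, since the column would always exist even when it is entirely null.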

zaneselvans commented 5 years ago

I am taking this on because fixing it will simplify and speed up our post-ETL Travis testing, which I am creating in connection with integrating the FERC Plant IDs into the overall system.