deployment-gap-model-education-fund / deployment-gap-model

ETL code for the Deployment Gap Model Education Fund
https://www.deploymentgap.fund/
MIT License

Update PUDL ETL to pull new data directly from database #307

Closed bendnorman closed 2 months ago

bendnorman commented 5 months ago

Still getting a mysterious error that 'battery_storage_existing_co2e_tonnes_per_year' does not exist in a data mart dataframe. Not sure why a PUDL update would cause this.

TrentonBush commented 5 months ago

PUDL may have renamed the battery storage label, or the battery storage records may have been filtered out of MCOE. That column is created basically like a pivot table, so if there are no battery storage values, the column may never get created. I can follow up in this upcoming sprint if you'd like.
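
To illustrate the pivot behavior described above, here is a minimal pandas sketch. It is not the project's actual code: the function name `pivot_co2e`, the input columns `county_id_fips`, `resource`, and `co2e_tonnes_per_year`, and the sample values are all assumptions made for the example. It only shows that a pivot creates one output column per category present in the input, so a category with no rows produces no column.

```python
import pandas as pd


def pivot_co2e(long_df: pd.DataFrame) -> pd.DataFrame:
    """Pivot long-format emissions into one column per resource (hypothetical helper)."""
    wide = long_df.pivot_table(
        index="county_id_fips",
        columns="resource",
        values="co2e_tonnes_per_year",
        aggfunc="sum",
    )
    # Column names are derived from whatever resource labels exist in the data.
    wide.columns = [f"{c}_existing_co2e_tonnes_per_year" for c in wide.columns]
    return wide


# Made-up sample data, for illustration only.
with_battery = pd.DataFrame(
    {
        "county_id_fips": ["01001", "01001", "01003"],
        "resource": ["natural_gas", "battery_storage", "natural_gas"],
        "co2e_tonnes_per_year": [100.0, 5.0, 200.0],
    }
)
# Simulate an upstream change that filters out (or renames) battery storage rows.
without_battery = with_battery[with_battery["resource"] != "battery_storage"]

col = "battery_storage_existing_co2e_tonnes_per_year"
print(col in pivot_co2e(with_battery).columns)
print(col in pivot_co2e(without_battery).columns)
```

If downstream code needs the column to exist regardless of the input, one option is to call `reindex(columns=expected_columns)` on the pivoted frame, which adds any missing columns filled with NaN.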

bendnorman commented 4 months ago

On dev, battery_storage_existing_co2e_tonnes_per_year had only one non-null county. In the updated PUDL data the column is all nulls, so it is dropped during this step. I therefore removed it from the list of columns to drop.
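
A rough sketch of how this kind of failure can arise in pandas, under assumptions: the dataframe `mart`, its values, and the use of `dropna(axis="columns", how="all")` followed by an explicit `drop` are hypothetical stand-ins, not the repository's actual ETL step. Once an all-null column has been removed, a later `drop` that still names it raises a `KeyError`.

```python
import pandas as pd

# Made-up data mart frame; the battery storage column is entirely null.
mart = pd.DataFrame(
    {
        "county_id_fips": ["01001", "01003"],
        "battery_storage_existing_co2e_tonnes_per_year": [float("nan"), float("nan")],
        "natural_gas_existing_co2e_tonnes_per_year": [100.0, 200.0],
    }
)

# Assumed cleanup step: remove columns that contain only nulls.
pruned = mart.dropna(axis="columns", how="all")

# A hard-coded drop list that still names the now-missing column.
cols_to_drop = ["battery_storage_existing_co2e_tonnes_per_year"]

try:
    pruned.drop(columns=cols_to_drop)
    raised = False
except KeyError:
    # pandas raises KeyError when a label to drop is not present.
    raised = True

# Alternative to editing the list: tolerate missing labels.
safe = pruned.drop(columns=cols_to_drop, errors="ignore")
print(raised, list(safe.columns))
```

Removing the name from the drop list, as done here, keeps the failure loud if other expected columns go missing later. `errors="ignore"` is more forgiving but can hide that kind of upstream change.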

With the postgres load improvement and loading PUDL tables from the DB, the full ETL takes 3 min to run on my computer.