singularity-energy / open-grid-emissions

Tools for producing high-quality hourly generation and emissions data for U.S. electric grids
MIT License

Use physics-cleaned 930 data in `impute_missing_hourly_profiles` #104

Open gailin-p opened 2 years ago

gailin-p commented 2 years ago

Currently, impute_hourly_profiles.impute_missing_hourly_profiles uses EIA-930 balance files, which don't have physics-based cleaning. We should use the cleanest available data instead.

grgmiller commented 2 years ago

Where is this happening? It looks like that function is only using the residual profiles, which I believe are calculated from the physics-reconciled 930 data.

We do load raw 930 data to identify directly interconnected BAs, but we are only using the metadata from these files, not any columns that would have been cleaned.
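As a rough illustration of using only the metadata, the sketch below derives directly interconnected BAs from column names alone, without reading any values. The column-name pattern (`EBA.<BA>-<other BA>.ID.H`) and the helper name `directly_interconnected_bas` are assumptions made for this example, not the actual code in `load_diba_data`.

```python
import re

# ASSUMPTION: interchange columns are named like "EBA.CISO-BPAT.ID.H".
# This pattern is illustrative and is not confirmed anywhere in this thread.
INTERCHANGE_COL = re.compile(
    r"^EBA\.(?P<ba>[A-Z0-9]+)-(?P<diba>[A-Z0-9]+)\.ID\.H$"
)


def directly_interconnected_bas(columns):
    """Map each BA to a sorted list of its neighbors, using column names only."""
    neighbors = {}
    for col in columns:
        match = INTERCHANGE_COL.match(col)
        if match is None:
            # Not an interchange column (e.g. demand or generation), so skip it.
            continue
        neighbors.setdefault(match["ba"], set()).add(match["diba"])
    return {ba: sorted(dibas) for ba, dibas in neighbors.items()}


# Hypothetical column names, used only to exercise the function.
example_columns = [
    "EBA.CISO-ALL.D.H",
    "EBA.CISO-BPAT.ID.H",
    "EBA.CISO-AZPS.ID.H",
    "EBA.BPAT-CISO.ID.H",
]
diba_map = directly_interconnected_bas(example_columns)
```

Because only the header is inspected, the result would be the same whether the raw or the cleaned file is read, as long as both carry the same columns.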

grgmiller commented 2 years ago

One thing that would be helpful is if we could update the file names from the EIA-930 process to be more descriptive. For example, I would assume that EBA_raw is the same as the balance files, but I think some cleaning has taken place. It is also not clear just from reading the file names which file is the physics-reconciled data. I am also not sure what EBA means - it might be better to rename it to eia930.

Also, it looks like the output files are currently being saved to downloads/eia930/chalendar, but it might make more sense to have these saved to outputs/.

gailin-p commented 2 years ago

The raw balance files are used in load_diba_data, called in impute_missing_hourly_profiles. The cleaned data includes interchange, and may be higher quality.

On naming: I'll fix the prefix (EBA) and location, but the suffixes (raw, elec, etc.) are hardcoded in gridemissions, so they are less convenient to change and, I think, not a high priority.
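A minimal sketch of what the prefix and location change could look like, leaving the hardcoded suffixes untouched. The function name `relocate_eia930_outputs` and the `eia930_` prefix are hypothetical, and the directories in the usage comment are taken from the discussion above rather than from the actual pipeline code.

```python
from pathlib import Path
import shutil


def relocate_eia930_outputs(src_dir, dest_dir, old_prefix="EBA_", new_prefix="eia930_"):
    """Copy files starting with old_prefix into dest_dir, swapping only the prefix.

    The suffix (raw, elec, ...) and the file extension are kept as-is.
    Returns the new file names in sorted order of the original names.
    """
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src_dir.glob(f"{old_prefix}*")):
        if not path.is_file():
            continue
        target = dest_dir / (new_prefix + path.name[len(old_prefix):])
        shutil.copy2(path, target)  # copy rather than move, so the source stays intact
        copied.append(target.name)
    return copied


# Hypothetical usage with the directories mentioned in this thread:
# relocate_eia930_outputs("downloads/eia930/chalendar", "outputs/eia930")
```

Copying instead of moving is a deliberate choice here so that anything still reading the old location keeps working until it is updated.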

grgmiller commented 2 years ago

Makes sense - maybe we can add to our code documentation / data dictionary a description of what each of these output files is.