catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Correct CEMS for net vs. gross generation #245

Open zaneselvans opened 5 years ago

zaneselvans commented 5 years ago

The generation numbers which can be calculated from the EPA CEMS data need to be clearly identified as either net generation or gross generation, and potentially standardized. The EIA 923 generation table (not generation fuel) has information about net vs. gross electricity generation that should be helpful. This is reportedly a big pain in the ass, according to other people who have worked on it.

gschivley commented 5 years ago

I've seen a working paper that adjusted gross generation to an estimate of net and even estimated hourly hydro generation from stream flow data. Can't find it at the moment but I'll check around. It's also the paper that pointed out issues with generation from some combined cycle units (no generation reported to CEMS for the steam turbine IIRC).

grgmiller commented 4 years ago

This has probably been resolved at this point, but based on a conversation with the EPA CAMD folks yesterday, the data in EPA CEMS is gross generation, including all generation before subtracting out house loads.

One of my research goals is to actually try and estimate hourly data about generators not included in CEMS (<25MW) by trying to derive a relationship between gross and net generation and converting net generation data from EIA-923 and EIA-930 to gross generation and emissions.

zaneselvans commented 4 years ago

Definitely not yet resolved! That's why the issue is still open :)

But also definitely something we want to get done. IIRC the 923 data has both net generation (in the generation_eia923 table) and gross generation (in the generation_fuel_eia923 table) on a monthly basis. Though we may have mislabeled the gross generation as net generation, now that I think about it. Need to look at that table definition more closely. The net generation in the generation table is by generator, but the generation fuel table is only plant-level data. But using the ratio between the net and gross generation and the per-generation unit heat rates that the MCOE routines use, it ought to be possible to estimate the net-to-gross ratios. But it might be pretty dependent on the capacity factor or duty cycle of the generators. We could make the estimate on a monthly basis maybe. Or we could try and regress out the relationship between capacity factor (or number of startup/shutdown events) and the net-to-gross ratio.
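A minimal sketch of the monthly estimate, assuming hourly gross generation (for example from CEMS) and monthly net generation (for example from EIA-923) can already be matched on a plant and month. The record layout, the function name monthly_net_to_gross, and all numbers are hypothetical and only illustrate the arithmetic; this is not PUDL code.

```python
from collections import defaultdict

# Hypothetical hourly gross generation records: (plant_id, "YYYY-MM", MWh).
hourly_gross = [
    (1, "2019-01", 400.0),
    (1, "2019-01", 600.0),
    (1, "2019-02", 500.0),
]

# Hypothetical monthly net generation: (plant_id, "YYYY-MM") -> MWh.
monthly_net = {(1, "2019-01"): 930.0, (1, "2019-02"): 470.0}


def monthly_net_to_gross(hourly_gross, monthly_net):
    """Return net-to-gross ratios keyed by (plant_id, month)."""
    # Sum the hourly gross generation up to plant-month totals.
    gross_totals = defaultdict(float)
    for plant_id, month, mwh in hourly_gross:
        gross_totals[(plant_id, month)] += mwh

    ratios = {}
    for key, net_mwh in monthly_net.items():
        gross_mwh = gross_totals.get(key, 0.0)
        if gross_mwh > 0:  # skip months with no reported gross generation
            ratios[key] = net_mwh / gross_mwh
    return ratios


ratios = monthly_net_to_gross(hourly_gross, monthly_net)
for key in sorted(ratios):
    print(key, round(ratios[key], 3))
```

A ratio computed this way is an average over the month, so it hides any dependence on capacity factor or startup/shutdown events within that month.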

grgmiller commented 4 years ago

From what I've seen in the raw 923 files, it looks like all of the reported MWh data is net generation (or is at least labeled as such in the column headers). The only plant level gross generation data I've been able to find is in CEMS.

zaneselvans commented 4 years ago

Hmm, okay if that's the case then probably you'll need to use the fuel heat content -- it's reported in both CEMS, and in the generation_fuel_eia923 table, broken down into "fuel for electricity" and... other fuel. Which I assume (?) indicates how much fuel is going to parasitic loads, if the generation listed there really is net.

gschivley commented 4 years ago

The EIA923 Schedules_6_7 file has annual gross generation, station use, direct use, incoming electricity, etc. Might be helpful for estimating average plant-level ratios of gross to net generation.

gschivley commented 4 years ago

This NBER paper is also a good read for method ideas https://www.nber.org/papers/w23053.pdf

zaneselvans commented 4 years ago

Aaaaah, that's right there are other files! We're generalizing the spreadsheet extraction process and will map all of the files to get at this data.

grgmiller commented 4 years ago

Reading the user manual for EPA's AVERT tool, they state: "Gross generation [from CAMD] is converted to net generation within the preprocessing engine using unit-specific parasitic loss factors. These factors were calculated based on a comparison of by-plant gross generation [as reported in CAMD] and by-plant net generation [from EIA-923] using 2015 data. Different loss factors are used for coal-fired steam units with and without sulfur controls (8.3% and 6.9%, respectively); natural gas-fired combined cycle units (3.3%) and combustion turbines (2.2%); and natural gas- or oil-fired steam units (7.7%). For example, a sulfur-controlled coal steam unit with an annual gross generation of 100 GWh is assumed to export a total of 91.7 GWh to the grid, while a natural gas-fired combined cycle unit with the same gross generation is assumed to export 96.7 GWh."
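The arithmetic in the quoted passage can be sketched as follows. The loss factors are the ones quoted above; the dictionary keys and the function name gross_to_net are hypothetical labels chosen for this example, not identifiers from AVERT or PUDL.

```python
# Parasitic loss factors as quoted from the AVERT user manual (2015 data).
# The keys are hypothetical labels, not AVERT or PUDL identifiers.
AVERT_LOSS_FACTORS = {
    "coal_steam_sulfur_controlled": 0.083,
    "coal_steam_uncontrolled": 0.069,
    "gas_combined_cycle": 0.033,
    "gas_combustion_turbine": 0.022,
    "gas_or_oil_steam": 0.077,
}


def gross_to_net(gross: float, unit_type: str) -> float:
    """Scale gross generation by a fixed loss factor (any energy unit)."""
    return gross * (1.0 - AVERT_LOSS_FACTORS[unit_type])


# Reproduce the manual's example: 100 GWh of gross generation.
coal_net_gwh = gross_to_net(100.0, "coal_steam_sulfur_controlled")
ngcc_net_gwh = gross_to_net(100.0, "gas_combined_cycle")
print(round(coal_net_gwh, 1), round(ngcc_net_gwh, 1))  # 91.7 96.7
```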

It seems like here EPA may simply be calculating a ratio between the two numbers, but it would probably make sense to perform a regression that takes into account the weighted capacity factor. That said, it might be hard to apply any regression to interpolate an hour-specific parasitic loss factor if the hourly capacity factor falls outside the range of monthly-weighted capacity factors in the regression.
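A minimal sketch of such a regression, assuming a simple linear relationship between monthly capacity factor and parasitic loss factor. The data points are made up for illustration, and clamping the hourly capacity factor to the fitted range is only one possible way to avoid extrapolating; neither is taken from AVERT or from any existing implementation.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_loss(cf, slope, intercept, cf_min, cf_max):
    """Predict a loss factor, clamping cf to the range seen in the fit."""
    clamped = min(max(cf, cf_min), cf_max)
    return intercept + slope * clamped


# Illustrative (made-up) monthly capacity factors and loss factors.
monthly_cf = [0.3, 0.5, 0.7, 0.9]
monthly_loss = [0.10, 0.08, 0.06, 0.04]

slope, intercept = fit_line(monthly_cf, monthly_loss)
cf_min, cf_max = min(monthly_cf), max(monthly_cf)

# An hourly capacity factor of 0.1 is below the fitted range, so it is
# treated as 0.3 rather than extrapolated.
for hourly_cf in (0.1, 0.6, 1.0):
    print(hourly_cf, round(predict_loss(hourly_cf, slope, intercept, cf_min, cf_max), 3))
```

Clamping trades extrapolation error for bias at the extremes: hours at very low output, where parasitic loads are likely a larger share of gross generation, would get the loss factor of the lowest observed month.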

karldw commented 4 years ago

A non-paywalled version of the Cicala paper is here, with appendix.

grgmiller commented 4 years ago

Interesting. It looks like the data in EIA Schedules 6 and 7 would be quite useful, although the set of plants in Schedules 6 and 7 (n=5215) is smaller than the set of plants in the other schedules (n=8714). Still, this would be a good starting point. I'll take a look at the Cicala paper and his method for what he calls "net-to-gross ratios", and report back about conversion factors.

grgmiller commented 4 years ago

While working on this, I just wanted to highlight an observation about the data included in CEMS (which perhaps was already obvious to others looking at this data, but just wanted to be sure to post):

The gross_load_mw data is not generation (MWh) but load (MW), so to get gross generation numbers that can be converted to net generation, you first have to multiply gross_load_mw by operating_time_hours to create a new column, which I call gross_generation_mwh. If you look at the data, a generator that has the same gross load in two hours but operates for only 0.5 hours in one of them will have about half the heat input in that hour.
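A minimal sketch of that multiplication in plain Python. The column names gross_load_mw, operating_time_hours, and gross_generation_mwh come from the comment above; the two hourly records are made-up values mirroring the half-hour example. With the data in a pandas DataFrame, the same step would be a single column multiplication.

```python
def gross_generation_mwh(gross_load_mw: float, operating_time_hours: float) -> float:
    """Convert average gross load (MW) over the operating time to energy (MWh)."""
    return gross_load_mw * operating_time_hours


# Two made-up hourly records with the same gross load; the second unit
# ran for only half of the hour.
hourly_records = [
    {"gross_load_mw": 200.0, "operating_time_hours": 1.0},
    {"gross_load_mw": 200.0, "operating_time_hours": 0.5},
]

for record in hourly_records:
    record["gross_generation_mwh"] = gross_generation_mwh(
        record["gross_load_mw"], record["operating_time_hours"]
    )
    print(record["gross_generation_mwh"])  # 200.0, then 100.0
```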

grgmiller commented 1 year ago

We have now implemented this gross to net generation conversion as part of the open grid emissions initiative.

This uses the code here: https://github.com/singularity-energy/open-grid-emissions/blob/main/src/gross_to_net_generation.py

And is documented here: https://docs.singularity.energy/docs/open-grid-emissions-docs/gross_to_net-background-of-gross-generation-and-net-generation.

There are still some outstanding issues with the methodology as implemented, documented here:

A lot to digest here, but would appreciate any thoughts any of you all have on improving these methods @gschivley @karldw

karldw commented 1 year ago

Congrats on implementing this! It's been a while since I've thought about gross-to-net conversion, so I'm not sure I have anything helpful to add, but your write-up is great.