catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

FERC regressions #204

Closed alanawlsn closed 3 years ago

alanawlsn commented 6 years ago

Implement regressions within FERC O&M dataset that will allow us to attribute fixed vs. variable cost components at the (FERC) plant level.

zaneselvans commented 6 years ago

Also, the scikit-learn Python library has a nice page describing its collection of linear models, with some background on each one.

The end goal of looking at these costs is really to build some kind of model of the cost per MWh on a per plant basis, that depends on... what? Like, if we wanted to predict the marginal cost of electricity for a given plant, what would we need? I think the relevant inputs (independent variables, X) end up being something like:

And the output we're trying to obtain is just the cost of a marginal unit of electricity production ($/MWh).

Especially given how little information we have about what goes into all of those different FERC cost categories, and whether or not the utilities are reporting those cost categories in a really standard way, I wonder if we might have better luck just using these simpler inputs / output in the regression?

These inputs would allow terms that are a function of plant capacity (i.e., $/MW installed), as well as capacity factor, and of course also true fixed costs. The regression wouldn't be confounded by fuel cost volatility, which is probably one of the larger sources of variance, since it would have all the information required to get the fuel cost component right (heat rate, fuel price, fuel heat content). We could even leave in all the different fuels, with their different prices and heat contents kept separate, if we wanted to (since some coal plants use a non-trivial fraction of gas or oil).
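As a rough illustration of why the fuel component can be computed directly rather than fitted, here is a minimal sketch. The function names and the sample numbers are made up for illustration and are not part of PUDL; the only relationships used are fuel price divided by heat content ($/mmBTU) and heat rate times fuel price ($/MWh).

```python
def price_per_mmbtu(price_per_unit, mmbtu_per_unit):
    """Convert a delivered fuel price ($/ton, $/mcf, ...) to $/mmBTU."""
    return price_per_unit / mmbtu_per_unit


def fuel_cost_per_mwh(heat_rate_mmbtu_per_mwh, fuel_price_per_mmbtu):
    """Fuel cost of one MWh: heat rate (mmBTU/MWh) times fuel price ($/mmBTU)."""
    return heat_rate_mmbtu_per_mwh * fuel_price_per_mmbtu


def blended_fuel_cost_per_mwh(fuels, net_generation_mwh):
    """Fuel cost per MWh for a plant burning several fuels.

    `fuels` is a list of (total_mmbtu_consumed, price_per_mmbtu) pairs,
    one pair per fuel, so each fuel keeps its own price.
    """
    total_fuel_cost = sum(mmbtu * price for mmbtu, price in fuels)
    return total_fuel_cost / net_generation_mwh


# Illustrative numbers only: coal at $40/ton with 20 mmBTU/ton.
coal_price = price_per_mmbtu(40.0, 20.0)
single_fuel = fuel_cost_per_mwh(10.0, coal_price)

# A mostly-coal plant that also burns some gas at $4/mmBTU.
blended = blended_fuel_cost_per_mwh([(9000.0, 2.0), (1000.0, 4.0)], 1000.0)
```

Keeping each fuel's mmBTU and price separate, as in `blended_fuel_cost_per_mwh`, is what lets a coal plant's gas or oil use be priced correctly instead of being averaged away.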

Does that seem sensible?

zaneselvans commented 6 years ago

Thinking about this some more... don't we really just need to figure out a model for the non-fuel costs since we know pretty much exactly how the fuel costs contribute to the overall cost per MWh? Then we just need to fit the remaining non-fuel costs per MWh to a function of the plant capacity, capacity factor, plant & fuel type, and year/decade of construction. We can leave out the fuel price, heat rate, heat content, etc. since we can assemble that function from scratch, and (I think) have little reason to believe that those variables would have much influence on the other plant costs. Would we want to include an interaction term to find a dependence on (capacity * capacity factor) (aka net generation) in addition to capacity and capacity factor independently? Or are we interested in just the capacity & net generation terms?
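To make the interaction-term question concrete, here is a minimal sketch of an ordinary least squares fit with an intercept, capacity, capacity factor, and their product. It uses plain NumPy on synthetic, noise-free data; the coefficients, variable names, and functional form are assumptions for illustration, and the categorical inputs (plant and fuel type, construction year) are left out.

```python
import numpy as np


def design_matrix(capacity_mw, capacity_factor):
    """Columns: intercept, capacity, capacity factor, capacity * capacity factor."""
    capacity_mw = np.asarray(capacity_mw, dtype=float)
    capacity_factor = np.asarray(capacity_factor, dtype=float)
    return np.column_stack([
        np.ones_like(capacity_mw),
        capacity_mw,
        capacity_factor,
        capacity_mw * capacity_factor,  # interaction term
    ])


def fit_nonfuel_cost(capacity_mw, capacity_factor, nonfuel_cost_per_mwh):
    """Least-squares coefficients for the four columns of design_matrix."""
    X = design_matrix(capacity_mw, capacity_factor)
    y = np.asarray(nonfuel_cost_per_mwh, dtype=float)
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs


# Synthetic plants with made-up "true" coefficients (not real FERC values).
rng = np.random.default_rng(0)
capacity_mw = rng.uniform(50, 1000, size=200)
capacity_factor = rng.uniform(0.1, 0.9, size=200)
true_coefs = np.array([5.0, 0.002, -3.0, 0.001])
nonfuel_cost = design_matrix(capacity_mw, capacity_factor) @ true_coefs

fitted = fit_nonfuel_cost(capacity_mw, capacity_factor, nonfuel_cost)
```

Since capacity * capacity factor is proportional to net generation, dropping the third column and keeping the fourth corresponds to the "just capacity and net generation" version of the model.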

michaelpburt commented 6 years ago

Hi Zane, this may not be what you are looking for, but many ISOs publish guidance on what the VOM (variable operations and maintenance) costs are on a per-technology basis (e.g., supercritical coal, subcritical coal, CT, CC, hydro, etc.). This guidance is widely used as an input in marginal cost models. See page 22 of this PJM manual > https://www.pjm.com/-/media/documents/manuals/archive/m15/m15v28-cost-development-guidelines-10-18-2016.ashx

In my experience, VOM is usually sufficient to encompass all costs beyond fuel input and carbon & MATS compliance related costs. Those costs include things like fly ash, urea, chlorine, or other inputs into scrubbers and such. I am not sure what the magnitude of those costs is, but I bet there is some pretty good documentation out there. Off the top of my head, I think they are around $1-$3 per MWh for big nasty coal plants.

zaneselvans commented 5 years ago

@michaelpburt I definitely don't completely understand the calculations that PJM is describing in that document, but the sense I got was that there's an acceptable VOM number that generators can include in their prices based on the technology of the generator, and that this number may be different from the actual variable expenses they've experienced. Is that right? Is that to compensate for typical expenses that just haven't been experienced by a generator yet? Like how you know the cost of maintenance on a new car isn't $0/mi even if it might look that way for the first few years of operation? Are the expenses small enough (relative to fuel) and/or uniform enough across different plants of a given technology that it's not really worth trying to extract the particular per-plant expenses? Is the effect of expected but as yet unrealized O&M large enough that these categorical estimates are more useful than real per-plant expenditures?

gschivley commented 5 years ago

@alanawlsn and @zaneselvans Let me know if there's anything I can do to help with the cost calculations.

zaneselvans commented 5 years ago

Okay, I've merged together annualized records from FERC and EIA on the basis of their report_year, plant_id_pudl and primary fuel type, and plotted some of the more interesting values which are available in both datasets (capacity, fuel cost, total heat content of fuel consumed, net generation), as well as some derived values (heat rate, fuel cost per MWh and mmBTU, capacity factor) against each other, separated out for the coal and gas portions of each of the power plants. The results are below.
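For reference, the derived values can be sketched roughly as follows. The column names are borrowed from the plot names in this thread and are assumed to exist as columns; the sample row is invented, and this is not the code that produced the plots.

```python
import pandas as pd

HOURS_PER_YEAR = 8760  # simplification: ignores leap years


def add_derived_values(df):
    """Add heat rate, fuel cost ratios, and capacity factor to annual plant records."""
    out = df.copy()
    out["heat_rate_mmbtu_mwh"] = out["total_mmbtu"] / out["net_generation_mwh"]
    out["fuel_cost_per_mwh"] = out["opex_fuel"] / out["net_generation_mwh"]
    out["fuel_cost_per_mmbtu"] = out["opex_fuel"] / out["total_mmbtu"]
    out["capacity_factor"] = out["net_generation_mwh"] / (
        out["capacity_mw"] * HOURS_PER_YEAR
    )
    return out


# One invented annual record: a 100 MW plant running half the year.
plants = pd.DataFrame({
    "capacity_mw": [100.0],
    "net_generation_mwh": [438000.0],
    "total_mmbtu": [4380000.0],
    "opex_fuel": [8760000.0],
})
derived = add_derived_values(plants)
```

Because every derived value is a ratio, a missing numerator or denominator in either dataset yields NaN for that record.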

One thing that I noted: there were only about 1450 records shared between the two datasets, which seems kind of small (this data is for 2009-2017, the years which we have for both of them). Now in retrospect I realize this is probably (yet again) an artifact of the NA values that are common in the EIA data wiping out a bunch of the aggregated values.
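One way to check where records drop out is an outer merge with pandas' `indicator=True`, which labels each row as present in both tables or in only one. This is a sketch on toy data: `report_year` and `plant_id_pudl` come from the description above, while the `primary_fuel` column name and the sample rows are assumptions.

```python
import pandas as pd

KEYS = ["report_year", "plant_id_pudl", "primary_fuel"]  # primary_fuel is a hypothetical name


def match_counts(ferc, eia):
    """Count records found in both tables, only in FERC (left), or only in EIA (right)."""
    merged = ferc.merge(
        eia, on=KEYS, how="outer", suffixes=("_ferc", "_eia"), indicator=True
    )
    return merged["_merge"].value_counts().to_dict()


# Toy annual records, invented for illustration.
ferc = pd.DataFrame({
    "report_year": [2015, 2015, 2016],
    "plant_id_pudl": [1, 2, 1],
    "primary_fuel": ["coal", "gas", "coal"],
    "capacity_mw": [500.0, 200.0, 500.0],
})
eia = pd.DataFrame({
    "report_year": [2015, 2016, 2016],
    "plant_id_pudl": [1, 1, 3],
    "primary_fuel": ["coal", "coal", "gas"],
    "capacity_mw": [510.0, 510.0, 150.0],
})
counts = match_counts(ferc, eia)
```

Comparing the `both` count with `left_only` and `right_only` shows whether the small overlap comes from unmatched keys, as opposed to values lost to NA during aggregation.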

Generally it looks better than I expected it would. Thoughts? @alanawlsn @gschivley @cmgosnell

[Plots: eia_vs_ferc_capacity_mw, eia_vs_ferc_net_generation_mwh, eia_vs_ferc_total_mmbtu, eia_vs_ferc_opex_fuel, eia_vs_ferc_capacity_factor, eia_vs_ferc_fuel_cost_per_mmbtu, eia_vs_ferc_fuel_cost_per_mwh, eia_vs_ferc_heat_rate_mmbtu_mwh]

gschivley commented 5 years ago

I'm not as familiar with FERC data - who is required to report to them? The plots do show a nice agreement.

cmgosnell commented 3 years ago

Closing because this is no longer relevant. We've generally learned that it is difficult to impossible to categorize the specific O&M lines in FERC as fixed and variable O&M. We have been employing NEMS' breakdown of fixed and variable O&M. See example here.