catalyst-cooperative / pudl

The Public Utility Data Liberation Project provides analysis-ready energy system data to climate advocates, researchers, policymakers, and journalists.
https://catalyst.coop/pudl
MIT License

Integrate GridPath RA Toolkit hourly renewable generation profiles #3467

Closed zaneselvans closed 4 days ago

zaneselvans commented 3 months ago
- [ ] https://github.com/catalyst-cooperative/pudl/issues/3490
- [ ] https://github.com/catalyst-cooperative/pudl-archiver/issues/296
- [ ] https://github.com/catalyst-cooperative/pudl/pull/3489
- [ ] https://github.com/catalyst-cooperative/pudl/issues/3523
- [ ] https://github.com/catalyst-cooperative/pudl/issues/3509
- [ ] https://github.com/catalyst-cooperative/pudl/issues/3510
- [ ] https://github.com/catalyst-cooperative/pudl/issues/3511
- [ ] https://github.com/catalyst-cooperative/pudl/issues/3515
- [x] Visualize wind and solar plots to sanity check
- [x] Add GridPath RA Toolkit ETL Settings
- [x] Add `gridpathratoolkit` data source page to docs
- [ ] https://github.com/catalyst-cooperative/pudl/pull/3514
- [ ] Revise issue notes below
- [ ] https://github.com/catalyst-cooperative/pudl/issues/3508

Design Considerations

Wind Profiles

Solar Profiles

Overall

Questions

Notes from README

Appendices refer to the GridPath RA Toolkit report

Hourly Wind Profiles

HourlyWind_byProject.zip: contains hourly simulated wind capacity factor data by project between 2007 and 2014, based on wind speed data from NREL's Wind Toolkit and empirically-derived power curves. Each file corresponds to a project from EIA Form 860: [Plant ID]_capfactor.csv. Note that the hour ending or "HE" time stamp column is missing, but the 24 hours of data corresponding to each day represent HE 1 through HE 24 of that day in Pacific Standard Time. For more information about how this data was developed and used in the study, see Appendix A.4.
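Since the wind files have no timestamp column, the timestamps have to be reconstructed from row order. Below is a minimal sketch of one way to do that, assuming the rows are consecutive hours starting at HE 1 of the first day and that PST means a fixed UTC-8 offset with no daylight saving shift. The names `PST`, `hour_ending_to_utc`, and `implied_hour_endings` are hypothetical and are not part of PUDL or the GridPath RA Toolkit. Whether the files include leap days is not stated in the README, so the row count should be checked against the real data.

```python
from datetime import date, datetime, timedelta, timezone

# Assumption: "Pacific Standard Time" is a fixed UTC-8 offset all year.
PST = timezone(timedelta(hours=-8), name="PST")


def hour_ending_to_utc(day: date, hour_ending: int) -> datetime:
    """Return the UTC instant at which the given PST hour ending (1-24) ends."""
    if not 1 <= hour_ending <= 24:
        raise ValueError("hour_ending must be between 1 and 24")
    local_midnight = datetime(day.year, day.month, day.day, tzinfo=PST)
    # HE 24 ends at midnight of the following day, so offset from midnight.
    return (local_midnight + timedelta(hours=hour_ending)).astimezone(timezone.utc)


def implied_hour_endings(first_day: date, n_rows: int) -> list[datetime]:
    """UTC end-of-hour timestamps for n_rows consecutive hourly rows.

    Assumes row 0 is HE 1 of first_day and that no hours are missing.
    """
    start = hour_ending_to_utc(first_day, 1)
    return [start + timedelta(hours=i) for i in range(n_rows)]


# Example: the first day of the wind record.
first_day_stamps = implied_hour_endings(date(2007, 1, 1), 24)
```

Converting to UTC here would put the wind profiles on the same time basis as the solar profiles, whose timestamps are already in UTC.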

Hourly Solar Profiles

HourlySolar_byProject.zip: contains hourly simulated solar capacity factor data by project between 1998 and 2019, based on data from the NSRDB and NREL's SAM model. Each file corresponds to a project from EIA Form 860: [Plant ID]_[Generator ID].csv. Timestamps are in UTC. For more information about how this data was developed and used in the study, see Appendix A.5.

Weather Data

DailyWeatherData_cleaned.csv: daily weather data from 16 locations in the West between 1948 and 2021. For more information, see Appendix E of the report.

Hydro Data

MonthlyHydro_byPlant.csv: monthly hydro energy by plant from EIA Form 923/906 between 2001 and 2020, listed by EIA Plant ID and EIA Plant Name. For more information about how this data was used in the study, see Appendix A.3.

Hourly Load Profiles

HourlyLoad_FERC714_cleaned.zip: contains hourly load data between 2006 and 2020 from FERC Form 714, which was used to develop the load shapes in the Western RA Case Study. Each file corresponds to a FERC respondent. In each file, the columns are: year, month, day, hour ending (Pacific Standard Time), load (MW). This data has been cleaned for use in this study, including making manual adjustments for missing or bad data. For more information about how this data was used in the study, see Appendix A.1.

Thermal Generators

HourlyThermal_byGenerator.zip: contains hourly estimated thermal temperature derates by generator between 1998 and 2019, based on temperature data from the NSRDB and project-specific piece-wise linear derate functions. Each file corresponds to a project from EIA Form 860: [Plant ID]_[GeneratorID].csv. Timestamps are contained in timestamps.csv and are listed in hour ending, Pacific Standard Time. For more information about how this data was developed and used in the study, see Appendix A.2.

Three Levels

There are 3 different versions of the wind and solar generation profiles available in the archived data

Eventually I think we would like to be able to run this aggregation and data repair process within PUDL so that it could be adapted to different purposes. For the MVP, however, we only need the final output. We can backfill the other steps later, once we understand them better.

One complication is that a small number of wind & solar projects are "hybrid": they include energy storage as well as renewable generation. They have their own separate production curves, but these may not be straightforwardly combinable with the pure renewable generation. We need to ask @anamileva & @elainekhart how to treat this data in relation to the other profiles.

zaneselvans commented 3 months ago

Additional Questions for @elainekhart & @anamileva

How are the BA-level renewable generation curves derived from the project (plant/generator) level curves? Are they just the capacity-weighted sums of the project-level capacity factors for all projects associated with a given BA?
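To make the question concrete, here is a sketch of the simple aggregation I have in mind: a capacity-weighted average of project-level capacity factors. This is only my guess at the method, not a description of what the GridPath RA Toolkit actually does. The function name `capacity_weighted_profile` and its inputs are hypothetical.

```python
def capacity_weighted_profile(
    profiles: dict[str, list[float]],
    capacities_mw: dict[str, float],
) -> list[float]:
    """Aggregate project-level hourly capacity factors into one profile.

    profiles maps a project ID to its hourly capacity factors.
    capacities_mw maps the same project IDs to installed capacity in MW.
    """
    if not profiles:
        raise ValueError("no project profiles given")
    n_hours = len(next(iter(profiles.values())))
    if any(len(cf) != n_hours for cf in profiles.values()):
        raise ValueError("all profiles must cover the same hours")
    total_mw = sum(capacities_mw[p] for p in profiles)
    if total_mw <= 0:
        raise ValueError("total capacity must be positive")
    # Hourly MW output summed over projects, divided by total capacity.
    return [
        sum(capacities_mw[p] * cf[h] for p, cf in profiles.items()) / total_mw
        for h in range(n_hours)
    ]


# Toy example with made-up numbers: two projects in one BA, two hours.
ba_profile = capacity_weighted_profile(
    {"a": [0.0, 0.5], "b": [1.0, 0.5]},
    {"a": 100.0, "b": 300.0},
)
```

If the plant-to-BA association or the installed capacity changes from year to year, the weights would have to vary over time as well, which this sketch does not handle.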

In aggregating the project-level wind and solar data into BA level data, how do you deal with changes in the associations between plants and BAs? These could come from changes in the BA boundaries over time, or maybe for other reasons. Is it the case that the same projects can end up in different BAs depending on what year of data you're looking at?

If it's not a simple transformation from the project-level curves to the BA level curves, then for now we should probably just use the BA level curves. Which of those curves would we need? The solar/wind or solar_syn/wind_syn data? Or both?

Do the BA codes associated with these production curves correspond to the reported BA codes associated with the individual plants/generators which we would find in EIA860, or do they refer to the simplified / aggregated BAs that you created to deduplicate some data and consolidate many tiny BAs into a smaller number of big BAs?

Is there an explicit mapping stored somewhere that defines these aggregations by BA code or EIA IDs?

What are the Hybrid_Wind_* and Hybrid_Solar_* series? If we're providing the BA level production curves, should these also be made available?

zaneselvans commented 3 months ago

Here are a couple of plots of average capacity factor by hour of day that looked a bit odd. For AZPS it seems like there's a storage component. And also a tiny bit of nighttime power consumption?

[Two images: plots of average capacity factor by hour of day]
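For reference, the hour-of-day averaging behind plots like these can be sketched as follows. This is a stdlib-only illustration, not the code that produced the plots. The names `mean_cf_by_hour` and `hours_with_negative_mean` are hypothetical, and the sample values are made up.

```python
from collections import defaultdict
from datetime import datetime


def mean_cf_by_hour(
    timestamps: list[datetime], capacity_factors: list[float]
) -> dict[int, float]:
    """Average capacity factor for each hour of day present in the data."""
    sums: dict[int, float] = defaultdict(float)
    counts: dict[int, int] = defaultdict(int)
    for ts, cf in zip(timestamps, capacity_factors):
        sums[ts.hour] += cf
        counts[ts.hour] += 1
    return {hour: sums[hour] / counts[hour] for hour in sorted(sums)}


def hours_with_negative_mean(means: dict[int, float]) -> list[int]:
    """Hours of day whose mean capacity factor is below zero (net consumption)."""
    return [hour for hour, mean in means.items() if mean < 0]


# Made-up sample: slightly negative values at midnight, generation at noon.
sample_times = [
    datetime(2019, 6, 1, 0),
    datetime(2019, 6, 1, 12),
    datetime(2019, 6, 2, 0),
    datetime(2019, 6, 2, 12),
]
sample_cf = [-0.01, 0.6, -0.03, 0.8]
hourly_means = mean_cf_by_hour(sample_times, sample_cf)
negative_hours = hours_with_negative_mean(hourly_means)
```

Flagging hours with a negative mean is a quick way to spot the nighttime consumption mentioned above without having to read it off a plot.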

zaneselvans commented 4 days ago

The last remaining issue here pertains to licensing, which is a bigger discussion that we should have outside of the context of this work, so this is done.