carbonfirst / CarbonCast

A system to predict hourly carbon intensity in the electrical grids using machine learning. CarbonCast provides average carbon intensity forecasts for up to 96 hours.
Apache License 2.0

Which countries could this be used for? #4

Closed mrchrisadams closed 1 year ago

mrchrisadams commented 1 year ago

Hi folks,

I came across this repo after reading the earlier Ecovisor paper and I wanted to ask: now that Electricity Maps is publishing historical data, what would be needed to run the same kind of analysis for the other parts of the world that have hourly figures?

https://www.electricitymaps.com/data-portal

I've tried to access the PDF linked in the readme file, but the connection seems to be timing out. Would you mind sharing another link where I could read it?

https://groups.cs.umass.edu/ramesh/wp-content/uploads/sites/3/2022/09/buildsys2022-final282.pdf

diptyaroop commented 1 year ago

Hi Chris, that's a great idea. Finding data outside the US/EU is quite challenging, so we can use ElectricityMaps as a data source, build the forecasting models on top of it, and run similar analyses. Of course, the more data available the better, and we may need more than 2 years of data for renewable-heavy regions to build good models, but this will be a good start. Currently, we are working on expanding coverage to more US/EU regions than we originally had and on providing forecasts in real time. You can find the latest repo here: https://github.com/carbonfirst/CarbonCast/tree/v3.0_real_time_service

Regarding the PDF, please refer to this for the latest version of our paper: https://energy.acm.org/eir/multi-day-forecasting-of-electric-grid-carbon-intensity-using-machine-learning/

mrchrisadams commented 1 year ago

Thanks @diptyaroop, I read the paper - it was a good read :D

"CarbonCast uses a hierarchical two-tiered forecasting approach based on machine learning, as shown in Fig. 3. The first tier uses a set of models, one for each generation source, to predict the electricity production from that source for the next 96 hours. The second tier takes these first-tier predictions along with weather forecasts to predict the hourly carbon intensity of electricity in that region for the next 4 days."

So if I understand this correctly, you basically need:

  1. hourly historical production by generation source
  2. predicted weather for the region(s) you're forecasting for
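
To make the data flow concrete, here is a minimal Python sketch of how those two inputs could feed a two-tier pipeline. It is not CarbonCast's implementation: the function names are hypothetical, the first tier is a simple daily-persistence baseline, and the second tier is a plain least-squares linear model, both standing in for the trained ML models the paper describes.

```python
import numpy as np

HORIZON = 96  # hours, matching the forecast window quoted above


def forecast_source(history, horizon=HORIZON):
    """Tier 1 stand-in: repeat the last 24 hours of one source's output.

    A real first-tier model would be trained per generation source; this
    persistence baseline only illustrates the interface.
    """
    history = np.asarray(history, dtype=float)
    if len(history) < 24:
        raise ValueError("need at least 24 hours of history")
    reps = -(-horizon // 24)  # ceiling division
    return np.tile(history[-24:], reps)[:horizon]


def fit_second_tier(features, carbon_intensity):
    """Tier 2 stand-in: least-squares linear map from features to CI."""
    X = np.column_stack([features, np.ones(len(features))])  # intercept column
    coef, *_ = np.linalg.lstsq(X, carbon_intensity, rcond=None)
    return coef


def predict_second_tier(coef, features):
    X = np.column_stack([features, np.ones(len(features))])
    return X @ coef


def two_tier_forecast(production_history, weather_history, ci_history,
                      weather_forecast):
    """Combine per-source forecasts with weather forecasts to predict CI.

    production_history: dict mapping source name -> hourly production (length T)
    weather_history:    array of shape (T, k)
    ci_history:         hourly carbon intensity (length T)
    weather_forecast:   array of shape (HORIZON, k)
    """
    sources = sorted(production_history)
    train_X = np.column_stack(
        [production_history[s] for s in sources] + [weather_history])
    coef = fit_second_tier(train_X, ci_history)
    tier1 = [forecast_source(production_history[s]) for s in sources]
    future_X = np.column_stack(tier1 + [weather_forecast])
    return predict_second_tier(coef, future_X)


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
T = 240
production = {"coal": rng.uniform(100, 200, T), "solar": rng.uniform(0, 50, T)}
weather = rng.uniform(0, 30, (T, 1))
ci = 900 * production["coal"] / (production["coal"] + production["solar"])
forecast = two_tier_forecast(production, weather, ci,
                             rng.uniform(0, 30, (HORIZON, 1)))
print(forecast.shape)  # (96,)
```

The point of the sketch is only the shape of the inputs: one hourly series per generation source, plus weather history and forecasts for the same region.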

I'm aware of the limitations and caveats listed in the paper, such as not trying to model imports / exports.

While there is now good historical open data published by Electricity Maps, it looks like it's consumption-based data, not the production by generation source that this model relies on.
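
For context on why production by source matters: a production-based average carbon intensity is the generation-weighted average of per-source emission factors, whereas a consumption-based figure also accounts for imports and exports. A minimal sketch of the production-based calculation follows. The emission factors are illustrative placeholders, not values from CarbonCast or Electricity Maps, and `production_based_ci` is a hypothetical helper name.

```python
# Illustrative emission factors in gCO2eq/kWh. Placeholder numbers for the
# example only; real factors vary by source, region, and methodology.
EMISSION_FACTORS = {"coal": 1000.0, "gas": 500.0, "solar": 50.0, "hydro": 25.0}


def production_based_ci(generation_mwh, factors=EMISSION_FACTORS):
    """Generation-weighted average carbon intensity for one hour (gCO2eq/kWh)."""
    total = sum(generation_mwh.values())
    if total <= 0:
        raise ValueError("total generation must be positive")
    weighted = sum(mwh * factors[src] for src, mwh in generation_mwh.items())
    return weighted / total


# One hour of generation by source, in MWh (made-up values).
hour = {"coal": 600.0, "gas": 200.0, "solar": 150.0, "hydro": 50.0}
print(production_based_ci(hour))  # 708.75
```

Without the per-source breakdown there is nothing to weight, which is why a consumption-based series alone cannot stand in for this input.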

You can see the data for India for example here: https://www.electricitymaps.com/data-portal/india

I've dropped it into a browsable notebook with Observable. It's really handy for exploratory work, but I no longer think it would work as an input to this model. It might still be useful for comparisons, to see how accurate forecasts are: https://observablehq.com/d/579174200a4ea214

For production figures, I think you'd need to look at the sources in the linked parser below that are queried to fetch these: https://github.com/electricitymaps/electricitymaps-contrib/blob/master/parsers/IN.py

I don't know if historical production figures are published anywhere. I do know there is an emerging standard for listing the kinds of generation though - it was covered in this presentation at the Linux Foundation Energy Summit in Paris, France recently:

https://www.youtube.com/watch?v=sum5C1pQWNo

There is also a spec emerging for reporting entities to publish against, which should make it easier and more predictable for governing or regulatory bodies to collate this data.

You can read more about this below: https://powersystemsdata.carbondataspec.org/

The meeting minutes are also available if you want to follow along: https://github.com/carbon-data-specification/Power-Systems-Data

diptyaroop commented 1 year ago

Hi Chris (@mrchrisadams),

Thanks for the detailed comment and for sharing all these resources. I will definitely check them out.

Regarding using ElectricityMaps data & future work:

  1. I think they have data for historical production by source & production-based CI as well. So, we can use that to build our initial models. However, getting data directly from the ISOs/grid operators is better & we are looking at resources like eMap parsers, EIA grid monitor, ENTSOE transparency platform, OpenNEM etc.
  2. One problem with directly using data from eMaps is that we cannot run forecasts in real-time. Again, as you said, parsers/fetching data directly from grid operators will be better.
  3. If there is an emerging standard for publishing production data at hourly granularity, especially outside the US & EU, that would be great. Currently, many grid operators lag behind in reporting data or do not have hourly data available. eMaps uses some statistical models, but we don't know the details of those. In the carbonDataSpec GitHub you sent, I see mainly US & EU. Maybe they will publish data about other regions soon.
  4. We also plan to provide forecasts of consumption-based CI in the future, so we can use eMaps data as ground truth then. Or better yet, if we can get data directly from grid operators, we can compare our forecasts with the eMaps data.
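
As a rough illustration of the comparison mentioned in point 4, here is a small Python sketch that scores a forecast against a ground-truth series using mean absolute percentage error (MAPE). MAPE is one common choice for this kind of comparison. The use of it here is an assumption, as are the function name and the sample values.

```python
def mape(actual, forecast):
    """Mean absolute percentage error between two equal-length series, in percent."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero for MAPE")
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Made-up hourly carbon intensity values in gCO2eq/kWh.
ground_truth = [400.0, 500.0, 450.0]
predicted = [440.0, 450.0, 450.0]
print(f"MAPE: {mape(ground_truth, predicted):.2f}%")
```

The same function works whether the ground truth comes from eMaps or directly from a grid operator, as long as both series cover the same hours.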