Green-Software-Foundation / sci

A specification that describes how to calculate a carbon intensity for software applications.

Regional Averages vs. Granular Carbon Intensity: What Is The Delta? (`I`) #27

Closed will-iamalpine closed 3 years ago

will-iamalpine commented 3 years ago

For carbon intensity calculations, is a regional average sufficient, or do we need the real-time carbon intensity of that electricity source?

Question: What do we gain/lose by accounting with these two methods?

e.g. for a given datacenter region: a location-based regional average vs. the real-time carbon intensity of that grid.

Potential application: compare carbon accounting based on regional averages against real-time information to estimate the error introduced by the delta between the two methods (regional averages vs. granular data).

Data needed: an hourly demand curve for the source (e.g. Azure, Windows) so that seasonality is appropriately captured.

Proposal/expected outcome of the analysis: a comparison of the averages currently used for Company carbon accounting and real-time data sources (WattTime, ElectricityMap) using historical demand curves (Azure/Windows) that capture seasonality. We hope this will better capture progress and highlight paths toward a consistent methodology.
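A minimal sketch of the proposed comparison (all numbers are hypothetical; real inputs would be an eGRID-style annual factor, a granular signal from a provider such as WattTime or ElectricityMap, and a measured Azure/Windows demand curve):

```python
from statistics import mean

# Hypothetical hourly series for one region: kWh consumed and gCO2eq/kWh.
hourly_load_kwh = [120, 95, 80, 150, 210, 180]      # demand curve
hourly_intensity = [430, 410, 390, 520, 610, 580]   # granular signal

# Method 1: a single regional average applied to total consumption.
avg_intensity = mean(hourly_intensity)              # stand-in for an annual factor
emissions_avg = sum(hourly_load_kwh) * avg_intensity

# Method 2: granular accounting, hour by hour.
emissions_granular = sum(l * i for l, i in zip(hourly_load_kwh, hourly_intensity))

# The delta is the error introduced by averaging away the correlation
# between demand and intensity.
delta = emissions_granular - emissions_avg
print(f"average-based: {emissions_avg / 1000:.1f} kgCO2eq")
print(f"granular:      {emissions_granular / 1000:.1f} kgCO2eq")
print(f"delta:         {delta / 1000:+.1f} kgCO2eq")
```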

Henry-WattTime commented 3 years ago

Could we use the data traditionally used by corporate carbon accounting, i.e. eGRID, to show the transition from annual average to granular marginal?

vaughanknight commented 3 years ago

A question on the definition of a "region" for the regional average: if it is a datacentre region served by multiple datacentres, those datacentres can be 1000 km apart with different energy sources, usually by design, for redundancy purposes.

If we are talking about geographical regions, what is the size?

Apologies if this is already defined in the dictionary; if not, I may raise it there.

atg-abhishek commented 3 years ago

This is a great question @buchananwp. A follow-up: thinking back to the core purpose of the SCI, when weighing the gain/loss I'd suggest crafting an experiment to figure out whether higher granularity actually leads to a bigger change in behavior. Ultimately, if we can change behavior even with non-granular data, the impact would be the same. Though, as I write this, I realize there is a bit of circularity here: to assess that impact we would likely need the more granular data in the first place.

@vaughanknight we don't actually have that in, so please do make that addition. cc @seanmcilroy29

jawache commented 3 years ago

What is the granularity of data a team would use to make their application "carbon aware"? We have to snap to the same.

jawache commented 3 years ago

The words "regional average" threw me a little. I think after reflection the discussion is about the time granularity right? What's the difference between using yearly, monthly, daily, hourly, real-time (as fast as possible) marginal carbon intensity data?

Based on the conversations in https://github.com/Green-Software-Foundation/software_carbon_intensity/issues/22, I now understand that fine-grained (real-time) data is needed for time-shifting, whereas region-shifting can still be done with coarser (yearly/monthly/daily) averages.

Is my understanding correct?

Henry-WattTime commented 3 years ago

@jawache, yes you are correct: fine-grained real-time data is needed for time-shifting, but region-shifting can still be done if you only have annual/monthly/daily data. I was thinking that if someone has the capability to shift locations but no granular data, the standard should still support/allow that.
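A rough sketch of why the two actions need different granularity (regions, hours, and intensity values below are made up):

```python
# Region-shifting: one coarse (e.g. annual) average per region is enough
# to pick the cleaner region. Values are hypothetical gCO2eq/kWh.
annual_avg = {"west-eu": 210, "east-us": 390, "southeast-asia": 520}
best_region = min(annual_avg, key=annual_avg.get)

# Time-shifting: a granular (hourly or real-time) series is needed to
# pick the cleaner hour within a single region.
hourly_forecast = {0: 430, 1: 410, 2: 380, 3: 450}  # hour -> gCO2eq/kWh
best_hour = min(hourly_forecast, key=hourly_forecast.get)

print(f"region-shift to {best_region} ({annual_avg[best_region]} gCO2eq/kWh)")
print(f"time-shift to hour {best_hour} ({hourly_forecast[best_hour]} gCO2eq/kWh)")
```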

Henry-WattTime commented 3 years ago

@vaughanknight I (at WattTime) use 'region' to refer to grid regions. We use grid regions that align with grid control areas (which may or may not follow political, utility, or other boundaries), often called Balancing Authorities or Independent System Operators in the US. Some data sources actually measure emissions at the nodal level. If a data center region straddles multiple grid regions, then it may be difficult to accurately measure the emissions of a computation without knowing the actual datacenter location. Does that make sense?

atg-abhishek commented 3 years ago

To the point about coarse data and time-shifting: batch jobs such as accounting runs, or other housekeeping and admin functions, have scheduling flexibility and could still benefit from granular data, right @jawache?
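For instance, a minimal sketch of that flexibility (the function name and forecast values are hypothetical): a fixed-duration batch job that can run any time before a deadline picks the contiguous window with the lowest total intensity:

```python
def best_start_hour(forecast, duration_h):
    """Return the start hour minimising total intensity over the job window."""
    windows = range(len(forecast) - duration_h + 1)
    return min(windows, key=lambda s: sum(forecast[s:s + duration_h]))

# Hypothetical gCO2eq/kWh forecast for the next 8 hours.
forecast = [520, 480, 390, 350, 370, 460, 540, 600]
start = best_start_hour(forecast, duration_h=3)
print(f"run the 3-hour batch job starting at hour {start}")  # prints hour 2
```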

atg-abhishek commented 3 years ago

> What is the granularity of data a team would use to make their application "carbon aware"? We have to snap to the same.

Actually @jawache, maybe we want to provide teams with an incentive to snap out of their current way of working and the granularity of data they are using. Specifically, a team might opt for coarse data for ease even when more granular data is available; in that case a nudge from the SCI could move them towards using more granular data and, presumably, taking more impactful actions.

jawache commented 3 years ago

Thanks for the clarification. Ok, I understand much better now.

My current thinking, then, is that since you can still be carbon aware with yearly averages, and yearly averages are freely available, they should be the baseline. We can encourage people to use more granular data if they have access to it, but because granular data costs money and isn't available for some regions, insisting on it would put the SCI out of reach for a lot of products and people.

We can carefully word the spec to say it "SHALL use at least yearly averages but SHOULD use more granular data if it is available". SHALL means do it or else; SHOULD means try if you can.
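One possible reading of that wording in code (the provider interface and all names here are hypothetical, purely to illustrate the fallback):

```python
# Freely available yearly averages (gCO2eq/kWh); values are illustrative.
YEARLY_AVERAGES = {"west-eu": 210, "east-us": 390}

def carbon_intensity(region, timestamp, provider=None):
    """Prefer granular data when a provider exposes it (SHOULD),
    falling back to the yearly-average baseline (SHALL)."""
    if provider is not None:
        try:
            # Hypothetical provider call; granular data may be paid
            # and unavailable in some regions.
            return provider.hourly_intensity(region, timestamp)
        except (KeyError, NotImplementedError):
            pass
    return YEARLY_AVERAGES[region]
```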

atg-abhishek commented 3 years ago

Yup, there is a very consistent use of terms like "SHALL", as mandated by the LF guidelines, that we can leverage :)

atg-abhishek commented 3 years ago

@vaughanknight if you'd like to make a contribution to the GSF, I've opened that up for you here: https://github.com/Green-Software-Foundation/Dictionary/issues/3