Carbon Savings Quantification: Counterfactual & Delta

will-iamalpine commented 3 years ago

Flagging this as a high-level area to address: when we implement carbon-savings methodologies, how do we track and report the savings? This should be part of the SCI equation itself, and should lay out clearly how you capture, quantify, and report the SCI savings due to your GSE Action. This metric is used to measure the success of Green Software Actions/practices.

By measuring SCI score deltas for different choices, we can nudge user behaviour in positive directions:

(developers) to track the impact of a given GSE Action
(end user) to help understand the impact of their behavior choices. The SCI spec could define how exactly you quantify and report those savings to the end user
(companies) to start tallying carbon savings for implementation of GSE actions

This extends to carbon awareness: eventually, I think carbon-aware features will be checkboxes selected by customers ("do you want this workload to run carbon aware y/n?"). We're currently debating two paths/terms:

Carbon "Delta" - This is a retrospective delta between two SCI scores (e.g. load shifted by X time, resulting in Y measured savings).
Carbon "Counterfactual" - This hypothetical capability tracks whether the suggested green runtime was accepted by the user, and the carbon reduction in each case. This way we can get a singular statement of, "Over X predictions made, users on average reduced their carbon footprint by Z %"
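
To make the two terms concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `Prediction` record, the field names, and the choice to count a declined suggestion as a 0% reduction are not defined anywhere in the SCI spec or in this thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Prediction:
    """One scheduling suggestion (hypothetical record, not part of the SCI spec)."""
    baseline_score: float   # score for the run as originally planned
    suggested_score: float  # score for the suggested greener run
    accepted: bool          # did the user take the suggestion?


def carbon_delta(before: float, after: float) -> float:
    """Retrospective delta between two scores; positive means a saving."""
    return before - after


def average_reduction_pct(predictions: List[Prediction]) -> float:
    """Average % reduction across all predictions made.

    Assumption: a declined suggestion counts as a 0% reduction, so the
    result reads as "over X predictions made, users reduced by Z % on average".
    """
    if not predictions:
        return 0.0
    total_fraction = 0.0
    for p in predictions:
        if p.accepted and p.baseline_score > 0:
            total_fraction += carbon_delta(p.baseline_score, p.suggested_score) / p.baseline_score
    return 100.0 * total_fraction / len(predictions)
```

With one accepted suggestion that cuts a score from 100 to 80 and one declined suggestion, this reports an average reduction of about 10% over 2 predictions.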

### Example of application:

ML Job/workload shifting: we can compare this carbon against a counterfactual by applying the methodology (energy consumption, location-based marginal carbon intensity) against the prior action, e.g. "by shifting your workload to a greener region, you saved X% on your carbon emissions, for a total of Y emissions reductions compared to your original action"
Windows update at green times of day
Eco-mode: Ask customers 'do you want to run this in eco-mode'? When this box is checked,
Carbon-Aware Libraries - TensorFlow leverages an optional library that uses a new green runtime. Usage of this library is tallied across all actions it could be applied against
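
The workload-shifting example above can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the energy consumption is assumed to be the same in both regions, and the intensity figures would come from whatever marginal carbon intensity source you use.

```python
from typing import Tuple


def emissions_g(energy_kwh: float, intensity_g_per_kwh: float) -> float:
    """Emissions in gCO2e: energy consumed times marginal carbon intensity."""
    return energy_kwh * intensity_g_per_kwh


def shift_savings(energy_kwh: float,
                  original_intensity: float,
                  shifted_intensity: float) -> Tuple[float, float]:
    """Return (grams saved, percent saved) for a shifted workload.

    Assumption: the job draws the same energy wherever it runs, so only
    the marginal carbon intensity differs between the two cases.
    """
    original = emissions_g(energy_kwh, original_intensity)
    shifted = emissions_g(energy_kwh, shifted_intensity)
    saved = original - shifted
    pct = 100.0 * saved / original if original else 0.0
    return saved, pct


# Illustrative numbers only: a 10 kWh job moved from 500 to 300 gCO2e/kWh.
saved_g, saved_pct = shift_savings(10.0, 500.0, 300.0)
print(f"You saved {saved_pct:.0f}% on your carbon emissions, "
      f"for a total of {saved_g:.0f} g of emissions reductions")
```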

@TaylorPrewitt is driving

TaylorPrewitt commented 3 years ago

Keeping things in proportions, you can get "Over X predictions made, users on average reduced their carbon footprint by Z %" without needing to store any run/workspace metrics supplied from the client. This way impact can be tracked and compliance maintained. By performing counterfactual checks on CI without energy metrics, you forego the hard 'lbs of carbon saved' value, but do bypass problematic compliance and data storage issues.
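
A rough sketch of what a proportion-only check could look like. The assumptions are mine, not from the tool: the job draws roughly constant power for its whole runtime (so energy cancels out of the ratio), and `intensity_by_hour` is a hypothetical hourly grid carbon intensity series for the job's location.

```python
from typing import Sequence


def mean_intensity(intensity_by_hour: Sequence[float], start: int, duration: int) -> float:
    """Average grid carbon intensity over the hours a run would occupy."""
    if start < 0 or duration <= 0:
        raise ValueError("start must be >= 0 and duration must be > 0")
    window = intensity_by_hour[start:start + duration]
    if len(window) != duration:
        raise ValueError("run window falls outside the intensity series")
    return sum(window) / duration


def proportional_reduction_pct(intensity_by_hour: Sequence[float],
                               original_start: int,
                               scheduled_start: int,
                               duration: int) -> float:
    """Percent reduction from moving a run, using no client energy data.

    Assumption: constant power draw, so emissions scale with the mean
    intensity of the window and the energy term cancels in the ratio.
    """
    original = mean_intensity(intensity_by_hour, original_start, duration)
    scheduled = mean_intensity(intensity_by_hour, scheduled_start, duration)
    if original == 0:
        raise ValueError("original window has zero intensity; ratio undefined")
    return 100.0 * (original - scheduled) / original
```

Only the runtime and start times are needed here, which is what avoids storing run or workspace metrics; the trade-off is that no absolute mass of carbon saved comes out of it.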

atg-abhishek commented 3 years ago

@TaylorPrewitt could you expand a bit more on the specific compliance and data storage issues aspect?

atg-abhishek commented 3 years ago

Thinking about C and CI can also help us think more about Jevons Paradox

TaylorPrewitt commented 3 years ago

> @TaylorPrewitt could you expand a bit more on the specific compliance and data storage issues aspect?

To get a counterfactual mass value for carbon emitted, the energy profile from the client's run as a result of a smart scheduler would need to be acquired. From my perspective, this can happen in a couple of different ways: 1) The client returns to the tool after the run has been scheduled and executed to submit an energy profile. 2) The client workspace auth creds are stored such that the tool can automatically grab an energy profile once the scheduled run is complete.

Number 1 is very unlikely to happen and is not a dependable way to get a real value and measurement of impact. At best, if it becomes the norm to go through this process, it violates the 'be frictionless' necessity of GSF Standards. Number 2 brings many security issues, especially when dealing with workspaces which have added dimensions of governance. Getting C for all scheduled jobs would be problematic due to getting/keeping energy data from clients' workspaces. But by knowing the runtime and location (provided by the scheduling tool), a counterfactual value for reductions/impact can be found which bypasses the need for any additional input. This would enable "Over X predictions made, users on average reduced their carbon footprint by Z %". However, bouncing off of today's discussion about baselines (especially important for OSS), if a baseline for R is established (recall CI = C/R), an estimated total C could be tracked by tracking the CI reductions.
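
One way to read the baseline point, written out as a sketch. The symbols $R_{\text{base}}$, $CI_{\text{before}}$ and $CI_{\text{after}}$ are introduced here for illustration and assume R is held fixed at the agreed baseline:

```latex
\[
\begin{aligned}
CI &= \frac{C}{R} \\
C &= CI \cdot R \\
\Delta C &\approx \left(CI_{\text{before}} - CI_{\text{after}}\right) \cdot R_{\text{base}}
\end{aligned}
\]
```

So if only the CI reductions are tracked, multiplying them by the baseline R gives an estimate of the total C avoided, without collecting energy data from each run.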

TaylorPrewitt commented 3 years ago

This does start branching into the domain of Issue #40

jawache commented 3 years ago

This is an interesting use case. I suspect it's only really possible right now with certain specific types of cloud workloads (ML), and it is only really useful when talking about carbon awareness.

It also only makes sense in the context of an "action" and not in the context of a "product". So you could calculate the counterfactual for taking the action of delaying a machine learning run 1 hr into the future. You could not calculate the counterfactual for "Windows", it just doesn't compute in the context of a product.

I don't see this as a mandatory aspect of the spec feeding into the calculation of CI. I see this more fitting in as a pattern of green software development which we should evangelise separately?

will-iamalpine commented 3 years ago

My intent with this PR is to ensure GSE savings work is recognized. I see this as the difference between two CI figures (starting/baseline). Rather than a pattern, I see this as an essential (and optional) path to tally savings.

atg-abhishek commented 3 years ago

@buchananwp does #46 now sufficiently address this?

atg-abhishek commented 3 years ago

@buchananwp can we close this issue out or is there something from this that still needs to be discussed? Thanks!

will-iamalpine commented 3 years ago

Closing, we addressed it in the 'carbon delta' section of the spec

Green-Software-Foundation / sci

Carbon Savings Quantification: Counterfactual & Delta #35