Was there any discussion on this in the previous weekly meeting @buchananwp ?
@atg-abhishek thanks for checking. We presented this today in the call. Next steps:
This is great @buchananwp - let me know which of the steps I can help with :)
Move to the project template and validate with the WG
Prototype here: http://azure-uw-cli-2021.azurewebsites.net/docs_page
Momentum
We have several internal MSFT pilots to use this tool, and intend this as an expression of the SCI spec we're developing.
Overview
To enable Microsoft and GSF stakeholders to make smart decisions about their environmental impact and carbon footprint, we have created the Carbon Aware API to minimize the carbon emissions of computational workflows. A few key features of the API are:
Details and Methodology
Marginal Carbon Emissions: A grid-responsive metric with finer granularity than average emissions, capturing the seasonal and diurnal trends that matter for demand shifting (source).
Retrospective Analysis: Time-series evaluation to assess the carbon emissions of a given energy profile. Also provides counterfactual analysis to expose the potential emissions if the run had been shifted.
Geographic: Finds the region with the current lowest average carbon intensity for an immediate run of a specified duration. Can filter available regions by available SKU and by migration laws for workspaces with protected data.
Regional Carbon Intensity: Provides the carbon intensity for each data center supported by a WattTime-tracked balancing authority. The possible scopes are historic intensities (time series for the prior 24 hours, week, and month), real-time marginal intensity, and forecast (mean intensity for an upcoming user-defined window).
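To make the endpoints above concrete, here's a rough sketch of what a client call could look like. The route, parameters, and response fields are placeholders made up for illustration, not the prototype's actual interface.

```python
# Hypothetical example: querying a regional carbon-intensity forecast.
# Route name, parameters, and response shape are illustrative only and
# may not match the prototype's real API.
import requests

BASE_URL = "http://azure-uw-cli-2021.azurewebsites.net"  # prototype host linked above

resp = requests.get(
    f"{BASE_URL}/carbon-intensity/forecast",          # illustrative route
    params={"region": "westus", "window_hours": 6},   # illustrative parameters
    timeout=10,
)
resp.raise_for_status()
print(resp.json())  # e.g. {"region": "westus", "mean_intensity_gco2_per_kwh": ...}
```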
Funding/Support Needed: $100k
Scheduling and Logging: Link to the Global Job Dispatcher for a carbon-aware cron scheduler for ML workspaces (a rough scheduling sketch follows this list). Need to create a logging system to track recommendation uptake and performance.
Refactoring: Currently built within a Flask framework. Need to refactor endpoints for improved readability, latency reduction, and robustness. Need to add testing and CI/CD for improved engineering standards. In progress, with completion in September.
OSS and Build to Scale: The refactored script needs to be converted from Python to a lower-level language in order to scale. To release the Carbon Aware API as a viable open-source scheduling tool with support for multiple input sources and infrastructure agnosticism, additional resources are needed to restructure it in a language compatible with the Azure stack (e.g., C# or C++).
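For the scheduling item above, here is a minimal sketch of the kind of decision logic the cron scheduler would need, assuming a hypothetical forecast feed from the API. Function names and data shapes are illustrative only.

```python
# Sketch of a carbon-aware scheduling decision, assuming the caller has already
# fetched a forecast as (start_time, intensity) pairs from the API.
from datetime import datetime, timedelta

def pick_lowest_carbon_start(forecast, duration_hours, deadline):
    """Return the forecast window start with the lowest intensity that still
    lets the job finish before the deadline, or None if nothing is feasible."""
    feasible = [
        (start, intensity)
        for start, intensity in forecast
        if start + timedelta(hours=duration_hours) <= deadline
    ]
    if not feasible:
        return None  # caller falls back to running immediately
    return min(feasible, key=lambda pair: pair[1])[0]

# Example usage with dummy forecast data (gCO2eq/kWh):
now = datetime.utcnow()
forecast = [(now + timedelta(hours=h), 400 - 20 * h) for h in range(8)]
start = pick_lowest_carbon_start(forecast, duration_hours=2,
                                 deadline=now + timedelta(hours=10))
print("Recommended start:", start)
```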
OKRs
Objective: As a tool implementing the GSF’s Software Carbon Intensity (SCI) specification, we seek to build community engagement and awareness through an OSS carbon-aware API that can be extended to other cloud providers & data sources.
Key Result: A hosted API capable of handling client requests at scale, enabling impactful carbon reductions in line with the methodology of the SCI.
Goal with getting to OSS
The carbon-aware API becomes the standardized way to enable (change behavior of) developers to time- and region-shift their computing loads to generate carbon emissions savings. This extensible toolkit is multi-cloud compatible, and enables cron-based scheduling of workloads.
This standardization can be through an approach to implement the Software Carbon Intensity (SCI) from the Green Software Foundation (GSF) as a starting point and perhaps being flexible enough to support other standards as well in the future. Though hopefully there won’t be too many!
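For context, a back-of-the-envelope sketch of how the API's intensity data could feed an SCI-style score, assuming the spec's general formulation of operational plus embodied emissions per functional unit. Treat the exact formula as a placeholder until the spec is finalized.

```python
# Assumed SCI-style formulation: SCI = ((E * I) + M) per R, where E is energy
# consumed (kWh), I is the location-based marginal intensity (gCO2eq/kWh) this
# API exposes, M is embodied emissions (gCO2eq), and R is the functional unit.
def sci_score(energy_kwh, marginal_intensity, embodied_g, functional_units):
    return ((energy_kwh * marginal_intensity) + embodied_g) / functional_units

# e.g. per-1000-requests score for a job drawing 1.2 kWh at 350 gCO2eq/kWh:
print(sci_score(energy_kwh=1.2, marginal_intensity=350.0,
                embodied_g=50.0, functional_units=1000))
```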
Some requirements that we need to meet to get to a solid OSS release
Based on the codebase that we have now and the discussions with Taylor, some of the key things that I think we’ll need to address are as follows:
Engineering Standards
Testing (unit, system, and integration)
Flexibility of code to be adapted by the OSS community
Modularity of code to tackle:
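As a starting point for the testing item above, here is a minimal pytest sketch. The app factory and endpoint are self-contained stand-ins so the example runs on its own; they do not reflect the real module layout.

```python
# Illustrative unit test for a Flask endpoint; create_app() and the route
# below are hypothetical stand-ins for the project's actual application.
import pytest
from flask import Flask, jsonify

def create_app():
    # Minimal stand-in app; the real project would import its existing Flask app.
    app = Flask(__name__)

    @app.route("/carbon-intensity/<region>")
    def intensity(region):
        return jsonify({"region": region, "intensity_gco2_per_kwh": 123.4})

    return app

@pytest.fixture
def client():
    app = create_app()
    app.testing = True
    return app.test_client()

def test_intensity_endpoint_returns_region(client):
    resp = client.get("/carbon-intensity/westus")
    assert resp.status_code == 200
    assert resp.get_json()["region"] == "westus"
```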
Tracking generated impact
Carbon-counterfactual: I discussed this with Taylor in terms of building in instrumentation and gathering telemetry on what action the user took, so that we know whether they found the suggested action useful and acted on it.
This will require additions in the UI as well where we have:
A running counter showing something like: “27 other users shifted their computing loads in the past hour and saved 512 kg CO2eq, that’s equal to 3 fewer cars going from X to Y”
This is going to be crucial if we want to trigger behavior change. It is on the list for getting to OSS because hard numbers demonstrate the tool's usefulness more strongly and let us publish aggregate figures, which can drive more interest in the project and its uptake by developer communities across regions and platform choices.
This will also give anybody trying to make a business case for funding more concrete data on how many people are using the tool and whether it actually triggers change.
If we go down this path, we might need a centrally hosted analysis component that gathers this telemetry from the various users and sends aggregated stats back to those using the tool, to power the notification mentioned above.
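If we do build that central component, something along these lines could work as a first pass. The endpoints, fields, and in-memory store are illustrative placeholders rather than a concrete design; a real deployment would persist events and authenticate clients.

```python
# Sketch of the telemetry idea above: record whether a user acted on a
# recommendation and serve aggregate savings for the UI counter.
from flask import Flask, request, jsonify

app = Flask(__name__)
events = []  # in-memory stand-in for a real telemetry store

@app.route("/telemetry/recommendation", methods=["POST"])
def record_recommendation_outcome():
    payload = request.get_json()
    events.append({
        "accepted": bool(payload.get("accepted")),
        "estimated_savings_kg": float(payload.get("estimated_savings_kg", 0.0)),
    })
    return jsonify({"status": "recorded"}), 201

@app.route("/telemetry/summary")
def summary():
    accepted = [e for e in events if e["accepted"]]
    return jsonify({
        "shifts_recorded": len(accepted),
        "total_savings_kg": sum(e["estimated_savings_kg"] for e in accepted),
    })
```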
Moving over the project roadmaps to “GitHub Projects”
Contribution guidelines
Licenses
I think v2 of the OSS can migrate to a different language for scalability. It would be good to take the Python codebase we have now as far as we can first, glean insights into what the target audience actually wants from it, and then use those insights to build something in C#/C++ as indicated in the other document.