Green-Software-Foundation / hack

Carbon Hack 24 - The annual hackathon from the Green Software Foundation
https://grnsft.org/hack/github

Azure Sustainability API Plugin #55

Open DannyvanderKraan opened 5 months ago

DannyvanderKraan commented 5 months ago

Type of project

Building a plug-in for Impact Framework

Overview

The primary objective of this plugin is to seamlessly integrate data from Microsoft's Sustainability API for Azure into the Impact Framework. Microsoft's API provides detailed metrics on carbon emissions and resource usage specific to Azure cloud services. By incorporating this data into the IF, users can gain a more comprehensive understanding of the environmental impact of their software applications on Azure.

Questions to be answered

Can I write the plugin in C#.Net? Does the build need to be a DLL, executable, or something else? What does the interface between IF and a plugin look like? What kind of data would be most beneficial to IF?

Have you got a project team yet?

No, but we will find people ourselves

Project team

No response

Terms of Participation

srini1978 commented 5 months ago

@DannyvanderKraan A couple of points:

The Sustainability API today returns consolidated data across all scopes: Scope 1, Scope 2 and Scope 3.

Reading through the calculation methodology, and especially this link https://learn.microsoft.com/en-us/industry/sustainability/api-calculation-method#calculation-methodology-for-scope-3, we can see that the Scope 3 emissions the Sustainability API returns are based on a rate-card allocation mechanism and are assigned at a specific service level (Azure VM, Azure SQL, etc.).

Scope 2 emission allocation to a particular service is based on https://learn.microsoft.com/en-us/industry/sustainability/api-calculation-method#customer-attributions-and-calculations-for-carbon-emissions

The article does not say how Scope 1 emissions are allocated at a service level, but in theory we can assume that this will never happen.

Now coming to software impact on emissions

If you take an Azure first-party service like VM or Azure SQL, the Impact Framework is focused on calculating emissions from workloads running on these services, which technically comes down to three things:

  1. Energy consumed by the software on the 1st party service (E). - Scope 2
  2. Electricity drawn from the grid by the 1st party service for running the workload (I). - Scope 2
  3. Pro rata calculation of allocation of hardware to the service (M) - Scope 3
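For reference, the E, I and M terms above are the ones used in the Green Software Foundation's SCI equation, SCI = ((E × I) + M) per R. Note that in the SCI specification, I denotes the carbon intensity of the grid electricity (gCO2eq/kWh) rather than the amount of electricity drawn. A minimal sketch of the arithmetic, using made-up numbers and hypothetical parameter names:

```python
def sci(energy_kwh, intensity_g_per_kwh, embodied_g, functional_units):
    """Return SCI = ((E * I) + M) per R, in gCO2eq per functional unit."""
    if functional_units <= 0:
        raise ValueError("functional_units (R) must be positive")
    operational_g = energy_kwh * intensity_g_per_kwh  # E * I
    return (operational_g + embodied_g) / functional_units


# Illustrative values only, not real measurements:
# 2 kWh at 400 gCO2eq/kWh plus 200 g embodied, spread over 1000 requests.
score = sci(energy_kwh=2.0, intensity_g_per_kwh=400.0,
            embodied_g=200.0, functional_units=1000)
print(score)  # 1.0
```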

Hence, if we want to use the Sustainability API for software emissions, there are a couple of challenges or points to check:

  1. We need to consider only Scope 2 and Scope 3 emission values from the Sustainability API.
  2. Today the Sustainability API provides emissions at an Azure service level, i.e. aggregated for all virtual machines in a subscription. If you have workloads running on a subset of these machines, we will not get emission values for those specific VMs. Also, if a VM is shared between more than one workload, we will not be able to get the emissions for those specific workloads either.
  3. In my mind, for the Sustainability API to work successfully as a model plugin, we need observations at a specific workload level. For example, since the API is based on a rate-card algorithm, can we get the rates associated with the specific workload? That would essentially come from Azure's Cost Management API, which we would need to connect to in order to get data at the granularity we want.
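One way to picture point 3 is to split a service-level emissions figure across workloads in proportion to their cost. The sketch below is only an illustration of that idea: cost-proportional allocation is an assumption made here, not the documented methodology of either API, and the function name, workload names and numbers are hypothetical.

```python
def allocate_by_cost(service_emissions_kg, workload_costs):
    """Split an aggregated emissions value across workloads by cost share.

    service_emissions_kg: emissions reported for the whole service.
    workload_costs: mapping of workload name -> cost for the same period.
    """
    total_cost = sum(workload_costs.values())
    if total_cost <= 0:
        raise ValueError("total cost must be positive")
    return {
        name: service_emissions_kg * cost / total_cost
        for name, cost in workload_costs.items()
    }


# Illustrative values only: 120 kg reported for all VMs, two workloads.
shares = allocate_by_cost(120.0, {"checkout": 300.0, "search": 100.0})
print(shares)  # {'checkout': 90.0, 'search': 30.0}
```

A VM shared by several workloads would still need a further split (for example by CPU time), which cost data alone does not provide.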

There is definitely potential value we can derive by taking the outputs from the Sustainability API and the Cost Management API and running them through the computation pipeline (aggregation) of the Impact Framework. We can discuss this more, but it is certainly an interesting point of view.

FYI @jawache @jmcook1186

DannyvanderKraan commented 5 months ago

@srini1978 To make sure I understand your reply: your main point is about integrating Microsoft's Sustainability API with the Impact Framework in order to calculate software carbon emissions efficiently. The key points I got from your response are:

Scope of Emission Data: The Sustainability API provides consolidated data on all three scopes of emissions (Scope 1, Scope 2, and Scope 3). However, the API currently does not specify how Scope 1 emissions are allocated at the service level, while the Scope 2 and Scope 3 methodologies are provided.

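Focus on Scope 2 and Scope 3 for Software Impact: When evaluating the carbon footprint of software running on Azure services (like VMs or Azure SQL), the Impact Framework primarily considers Scope 2 and Scope 3 emissions. These involve the energy consumed by the software (E), the electricity drawn from the grid for the workload (I), and the pro rata allocation of hardware to the service (M).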

Challenges with Current Sustainability API Data:

The API provides emission data at an aggregate Azure service level, not the individual workload level. If a VM hosts multiple workloads, separating emissions data for each specific workload is challenging.

Potential Solution and Value Proposition: Obtaining emission data at a more granular, workload-specific level is necessary to make the Sustainability API more effective for the Impact Framework. This could involve integrating the Sustainability API with Azure's Cost Management API to get detailed data. By combining data from both APIs and processing it through the Impact Framework, you can potentially derive more accurate and meaningful insights about the carbon efficiency of software.

In summary, your response highlights the need for more detailed, workload-specific emission data from the Sustainability API to accurately assess and improve the carbon efficiency of software running on Azure services, aligning with the goals of the Impact Framework. Did I understand your reply?

srini1978 commented 5 months ago

Yes @DannyvanderKraan, you summarized it well.

DannyvanderKraan commented 5 months ago

@srini1978 A lot can change in a couple of days. Forget the Sustainability API as you know it now and meet the all-new Carbon Optimization Service on Azure (type "carbon optimization" in your search bar on the Azure Portal). They silently released it last week. The docs are not ready yet: https://learn.microsoft.com/en-us/rest/api/carbon/ I'm excited because the new Carbon Optimization Service lets you drill down to individual Resources on Azure instead of general Resource Types!

There's another whitepaper on calculating Scope 1, 2, and 3 emissions: https://go.microsoft.com/fwlink/p/?linkid=2161861. The Scope 1 emissions will be much less interesting than Azure's Scope 2 and 3 emissions. And for the consuming party, Azure's Scope 1, 2, and 3 will be their Scope 3 anyway. So, for now, let's focus on the big wins; we can figure out Scope 1 later.

I'm only looking into the authentication/authorization flow for now, because it currently seems to support only the implicit flow. I will get back to you on that.

DannyvanderKraan commented 4 months ago

@srini1978 OK, I was messing about yesterday. I just called the API via a console app with a client ID and a client secret. So that is all working fine! To be continued... But what do you think about this proposal now?
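Calling an API with a client ID and client secret presumably corresponds to the OAuth 2.0 client credentials flow of the Microsoft identity platform. As a rough sketch (not the console app mentioned above), the token request could be built as follows. The token endpoint and form fields are the standard ones for that flow; using the Azure Resource Manager scope for the carbon API is an assumption, and `build_token_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id, client_id, client_secret):
    """Build (but do not send) a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Assumption: the carbon API accepts Azure Resource Manager tokens.
        "scope": "https://management.azure.com/.default",
    }).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


# Placeholder credentials; real values come from an app registration.
request = build_token_request("my-tenant", "my-client-id", "my-secret")
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` and reading `access_token` from the JSON response would complete the flow; the token then goes into an `Authorization: Bearer ...` header on the API call.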

jawache commented 4 months ago

My memory of Azure grouping is failing me :) but I think it's Subscription > Resource Groups > Resource right?

So, a Resource Group is effectively what I used to define a group of services; I used to use Resource Groups to define an "Application".

What if the observation here was just timestamp/duration, @srini1978 and @DannyvanderKraan? We would use IF just as a way to group and time-bucket the exported data from the API so that other plugins can use it, or so we can analyse the data in interesting ways.

So maybe a manifest like below would just query the API for the resource-group YYYY and grab the emissions for those 4 different time buckets separately.

The output should be carbon, and if that worked then you could use it with other plugins like sci to generate an SCI score, or with many others to do different things with carbon.

pipeline:
  - azure-carbon
  - carbon-offsets # a plugin that takes carbon and figures out how many offsets to buy.
  - sci
config:
  azure-carbon:
    subscription-id: XXXX
    resource-group: YYYY
inputs:
  - timestamp: 2024-01-01T00:00
    duration: 3600
  - timestamp: 2024-01-01T01:00
    duration: 3600
  - timestamp: 2024-01-01T02:00
    duration: 3600
  - timestamp: 2024-01-01T03:00
    duration: 3600
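
To illustrate what a plugin handling the manifest above might do with its inputs, here is a small sketch of the time-bucketing step only. It does not use the real Impact Framework plugin interface; `execute` and `fake_fetch` are hypothetical names, and the fetch function is a stub standing in for the actual API call.

```python
from datetime import datetime, timedelta


def execute(inputs, fetch_emissions):
    """Attach a `carbon` value to each timestamp/duration observation.

    fetch_emissions(start, end) is expected to return the emissions
    for the half-open time window [start, end).
    """
    outputs = []
    for item in inputs:
        start = datetime.fromisoformat(item["timestamp"])
        end = start + timedelta(seconds=item["duration"])
        outputs.append({**item, "carbon": fetch_emissions(start, end)})
    return outputs


def fake_fetch(start, end):
    # Stub: pretends the resource group emits a flat 10 g per hour.
    return 10.0 * (end - start).total_seconds() / 3600


inputs = [
    {"timestamp": "2024-01-01T00:00", "duration": 3600},
    {"timestamp": "2024-01-01T01:00", "duration": 3600},
]
result = execute(inputs, fake_fetch)
print(result[0])
```

Each output row keeps its original timestamp and duration, so downstream plugins in the pipeline could consume the added `carbon` field per bucket.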

DannyvanderKraan commented 4 months ago

Hi @jawache,

Subscription > Resource Groups > Resource is correct. Thanks for suggesting the data observation for the Green Software Foundation plugin. Your manifest example is helpful to me. What constitutes a group of services or an "Application" is entirely up to the user. Resource Groups are only sometimes used like that. Sometimes, they are used to group resources based on location or costs. So, there's no way to tell. However, I am toying with introducing a separate hierarchical level, which Azure users can use to logically group resources together to define "Applications" or "Workloads." For a first pilot, we could use the Resource Group, though. I think that's close enough for now.
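As a sketch of the Resource Group pilot idea: Azure resource IDs follow the pattern `/subscriptions/<id>/resourceGroups/<name>/providers/...`, so per-resource emissions can be rolled up by resource group. The function names and the sample emissions values below are made up for illustration; only the ID pattern is standard.

```python
from collections import defaultdict


def resource_group_of(resource_id):
    """Extract the resource group name from an Azure resource ID."""
    parts = resource_id.strip("/").split("/")
    lowered = [part.lower() for part in parts]
    # Raises ValueError if the ID has no resourceGroups segment.
    index = lowered.index("resourcegroups")
    return parts[index + 1]


def total_by_group(emissions_by_resource):
    """Sum per-resource emissions into per-resource-group totals."""
    totals = defaultdict(float)
    for resource_id, kg in emissions_by_resource.items():
        totals[resource_group_of(resource_id)] += kg
    return dict(totals)


# Made-up sample data, reusing the XXXX/YYYY placeholders from the manifest.
sample = {
    "/subscriptions/XXXX/resourceGroups/YYYY/providers/Microsoft.Compute/virtualMachines/vm1": 1.5,
    "/subscriptions/XXXX/resourceGroups/YYYY/providers/Microsoft.Sql/servers/db1": 2.5,
    "/subscriptions/XXXX/resourceGroups/ZZZZ/providers/Microsoft.Compute/virtualMachines/vm2": 4.0,
}
print(total_by_group(sample))  # {'YYYY': 4.0, 'ZZZZ': 4.0}
```

A separate "Application" or "Workload" level could replace `resource_group_of` with any other grouping key, such as a tag.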

I'm excited about integrating the plugin into a broader sustainability pipeline. Could we chat more about how to make our plugin compatible with others like carbon-offsets and SCI? I think there’s a lot of potential here for making a real impact.

Would you be up for a quick meeting to dive deeper into this? I'd love to get your take on technical details and any advice you have as we move forward.

I am looking forward to your thoughts!

srini1978 commented 4 months ago

Please include me if possible. Happy to help as well.

jawache commented 4 months ago

Hey @DannyvanderKraan, I forgot to mention this earlier, but if you want to chat, perhaps you'd like to join our weekly live stream sessions, and we can discuss it there? The next one is tomorrow (Monday) at 2:30pm GMT. Email me at asim@greensoftware.foundation if you're interested in coming online and chatting for 15-20 minutes.