green-coding-solutions / eco-ci-energy-estimation

Eco CI Energy estimation for Github Actions Runner VMs
MIT License

Adding Carbon Intensity #54

Closed ArneTR closed 1 month ago

ArneTR commented 7 months ago

The Carbon Aware SDK publishes the Azure grid regions.

Can we also find out, from inside GitHub, which region the VM is running in? Is there a GitHub variable for that?

This would allow integrating CO2 values into Eco-CI.

dan-mm commented 7 months ago

Hmm, it's a good idea. There doesn't seem to be a way to get the region from GitHub directly (source). It seems that the GitHub public runners are hosted on Azure, and what they recommend is trying to figure out the location from the IP address.

ChatGPT suggests using an external API service that gives you an estimated location:

❯ echo $(curl -s https://ipapi.co/$(curl -s ifconfig.me)/json/ | jq -r '.region')
Land Berlin

Or I can look through azure documentation and see if I can find a mapping of IP addresses to region.

ArneTR commented 6 months ago

Thanks for this input.

The question actually came up from this discussion: https://github.com/Green-Software-Foundation/real-time-cloud/issues/7#issuecomment-1839415045

Can you post a comment there, cross-referencing our issue here, to see if they maybe have relevant information on this? That seems like a more promising approach than going through the docs, I hope ... :)

mrchrisadams commented 6 months ago

Hey gang, I'm in the working group for Real Time Cloud, and have worked in this area before.

We built an importer here, and from memory Microsoft do publish information listing the IP ranges for each region and service.

It's on our roadmap to implement so we can expose this finer grained information in our own IP-to-Co2 API linked below: https://developers.thegreenwebfoundation.org/api/ip-to-co2/overview/

And you can see the issue here: https://github.com/thegreenwebfoundation/admin-portal/issues/189

I'm happy to raise this and discuss in more detail - this week is bad, but from Dec 14th onwards is better

ArneTR commented 6 months ago

This is lovely. We would like to contribute to your API then to bring the functionality in there.

Two questions on this:

  1. Why do you choose to use Ember over ElectricityMaps? Doesn't ElectricityMaps have a country / region API that is free nowadays and on a daily basis?
  2. Where does the code for the ip-to-co2 API live? In the admin-portal repo? Or is this only for the importer?

ArneTR commented 6 months ago

Just for reference for us internally: Azure Regions to IP https://www.microsoft.com/en-us/download/details.aspx?id=56519

mrchrisadams commented 6 months ago

Why do you choose to use Ember over ElectricityMaps? Doesn't ElectricityMaps have a country / region API that is free nowadays and on a daily basis?

The data from Ember is open and easy to include in our own software, and package for redistribution. It's not as fresh as ElectricityMaps, but annual averages are more helpful in our scenario, and at the time of building I wasn't aware of a free country / region API with ElectricityMaps. This is as good a reminder as any to check if their terms have been updated though - ta.

Where does the code for the ip-to-co2 API live? In the admin-portal repo? Or is this only for the importer?

It's in the admin-portal repo; the specific class is linked below. It uses Django Rest Framework:

https://github.com/thegreenwebfoundation/admin-portal/blob/master/apps/greencheck/api/views.py#L36-L96

We don't do any clever IP to region mapping yet beyond ip to country info, but it's something we hope to pick up in early 2024 to add this extra resolution, and we'd be happy to chat about how to make any assumptions consistent.

I know that @adrianco has been doing some work to compile a spreadsheet normalised across all the providers with regions and carbon intensity figures, but it doesn't include IP range mapping. If the spreadsheet isn't in the public domain, it's only because it's being tidied up. I'll happily drop a message here when I know more.

rossf7 commented 6 months ago

Hey @ArneTR @mrchrisadams, it's great the region IP ranges are published but as an alternative the Azure Instance Metadata API might work for this.

https://learn.microsoft.com/en-us/azure/virtual-machines/instance-metadata-service?tabs=windows#endpoint-categories

It's a REST API exposed to Azure VMs on the IP 169.254.169.254 and the Azure region is in the location field. It also has the instance type in the vmSize field which could be handy too.
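
As a rough illustration of the lookup described above, here is a minimal Python sketch. The function names are made up for this example, and the `api-version` value is simply the one used in the curl command quoted later in this thread. It reads the `location` and `vmSize` fields from the `compute` section of the metadata document; the fetch itself only works from inside an Azure VM.

```python
import json
import urllib.request

# Link-local metadata endpoint; the api-version is the one quoted in this thread.
IMDS_URL = "http://169.254.169.254/metadata/instance?api-version=2021-02-01"


def fetch_instance_metadata(timeout=2.0):
    """Fetch the metadata document. Only reachable from inside an Azure VM."""
    request = urllib.request.Request(IMDS_URL, headers={"Metadata": "true"})
    # Ignore any configured proxy: the endpoint is local to the VM.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=timeout) as response:
        return json.load(response)


def region_and_size(metadata):
    """Return (location, vmSize); a missing or empty field becomes None."""
    compute = metadata.get("compute", {})
    return compute.get("location") or None, compute.get("vmSize") or None
```

Keeping the parsing separate from the network call means `region_and_size` can be tried on a saved metadata dump without access to a runner.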

ArneTR commented 5 months ago

Idea: Bring in some form of CO2 value to Eco-CI

Implementation Routes:

The downside of both routes is losing comparability. But we are OK with this, as we still have the energy as a fixed value.

Given this design consideration, we see no benefit in using Ember Climate if the value is not really "fixed" anyway.

  1. What we still need to solve is whether we can plug the lat/long that we get for the GitHub machine via a free API into CO2 Signal, or whether we should instead resolve the IP address of the GitHub Runner to a "Zone" from the Microsoft Azure list and then use the zone in the CO2 Signal API.

To evaluate this we need to look up some of the IPs in the Microsoft document and compare their location to the Geo IP we get for instance from

Update: The IPs seem to map to the same locations, at least according to Maxmind. For instance, 40.84.177.11 (from curl icanhazip.com inside the GitHub Runner) is "southcentralus" according to Azure (curl http://169.254.169.254/metadata/instance?api-version=2021-02-01 -H "Metadata: true" | jq), and Maxmind says San Antonio, Texas (see: https://www.maxmind.com/en/geoip-demo).
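
For the other route (resolving the runner IP against the Microsoft list), the lookup could be sketched roughly as below. This assumes the downloaded Service Tags JSON has a top-level `values` list whose entries carry `properties.region` and `properties.addressPrefixes`; that layout is recalled from memory and should be checked against the actual file. The function name is hypothetical.

```python
import ipaddress


def region_for_ip(ip, service_tags):
    """Return the region whose published prefixes contain `ip`, else None.

    If several prefixes match, the most specific (longest) prefix wins.
    """
    address = ipaddress.ip_address(ip)
    best_region, best_prefix_len = None, -1
    for entry in service_tags.get("values", []):
        properties = entry.get("properties", {})
        region = properties.get("region")
        if not region:  # assumed: tags without a region are not region-specific
            continue
        for prefix in properties.get("addressPrefixes", []):
            network = ipaddress.ip_network(prefix, strict=False)
            if network.version != address.version:
                continue  # never compare IPv4 against IPv6
            if address in network and network.prefixlen > best_prefix_len:
                best_region, best_prefix_len = region, network.prefixlen
    return best_region
```

Usage would be along the lines of `region_for_ip("40.84.177.11", json.load(open(path_to_download)))`, where `path_to_download` stands for wherever the file was saved. The result can then be compared with what the geo IP provider reports for the same address.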

@dan-mm Please research whether there is another good, more or less free option. In the end the users have to plug in a secret themselves for the APIs to work.

mrchrisadams commented 5 months ago

Hey @ArneTR - can you share a sample of the output you get from a machine when querying the lat/long here?

What we still need to solve is to determine if we can plug the lat/long that we get from the Github machine

I didn't know that was possible, and if you have something like the actual lat-lngs for a datacentre region, I think there are more options available to you.

ArneTR commented 5 months ago

@mrchrisadams For Maxmind you can try online here: https://www.maxmind.com/en/geoip-demo

The lat / long I get for 40.84.177.11 is: 29.4227,-98.4927
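
To show how such a lat/long pair might be handed to CO2 Signal, here is a hedged sketch that only builds the request and reads the answer. The endpoint path, the `auth-token` header and the `data.carbonIntensity` response field are my recollection of the CO2 Signal API, not something confirmed in this thread, so treat all three as assumptions and check the provider's documentation first.

```python
import urllib.parse
import urllib.request

# Assumed endpoint, recalled from memory; verify before relying on it.
CO2SIGNAL_URL = "https://api.co2signal.com/v1/latest"


def build_request(lat, lon, token):
    """Build (but do not send) a request for the latest intensity at lat/lon."""
    query = urllib.parse.urlencode({"lat": lat, "lon": lon})
    return urllib.request.Request(
        f"{CO2SIGNAL_URL}?{query}",
        headers={"auth-token": token},  # assumed header name for the API key
    )


def carbon_intensity(payload):
    """Read gCO2eq/kWh from a decoded JSON response; None if it is absent."""
    return payload.get("data", {}).get("carbonIntensity")
```

Sending the request would be a matter of `urllib.request.urlopen(build_request(29.4227, -98.4927, token))` and decoding the JSON body, with `token` being the secret the user supplies.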

mrchrisadams commented 5 months ago

Ah, so you're going via the IP, and then geocoding from "the outside" - you're not using any internal APIs, like querying for an environment variable or setting local to the VM.

Thanks for clearing that up for me 👍

ArneTR commented 5 months ago

@mrchrisadams Just for completeness: There is an internal API, but it only gets you the zone of the machine. You then have to somehow resolve that to a datacenter and its lat/long, as Microsoft does not do this for you directly.

We then opted to use the lat/long from external providers directly, as they seem to match the laboriously derived datacenter locations quite well.

Here is the dump that you get from the machine internally in case you want to use a different field from the data:

{
  "compute": {
    "azEnvironment": "AzurePublicCloud",
    "customData": "",
    "evictionPolicy": "",
    "isHostCompatibilityLayerVm": "false",
    "licenseType": "",
    "location": "eastus",
    "name": "EUS-GHCUS1UB22EUSC8-0085",
    "offer": "",
    "osProfile": {
      "adminUsername": "",
      "computerName": "",
      "disablePasswordAuthentication": ""
    },
    "osType": "Linux",
    "placementGroupId": "",
    "plan": {
      "name": "",
      "product": "",
      "publisher": ""
    },
    "platformFaultDomain": "0",
    "platformUpdateDomain": "0",
    "priority": "",
    "provider": "Microsoft.Compute",
    "publicKeys": [],
    "publisher": "",
    "resourceGroupName": "ghcus1ub22eusc8",
    "resourceId": "/subscriptions/ddf4fb74-17fd-4808-a484-2394c8e0264e/resourceGroups/ghcus1ub22eusc8/providers/Microsoft.Compute/virtualMachines/EUS-GHCUS1UB22EUSC8-0085",
    "securityProfile": {
      "secureBootEnabled": "false",
      "virtualTpmEnabled": "false"
    },
    "sku": "",
    "storageProfile": {
      "dataDisks": [],
      "imageReference": {
        "id": "/subscriptions/ddf4fb74-17fd-4808-a484-2394c8e0264e/resourceGroups/ghcus1ub22eusc8/providers/Microsoft.Compute/galleries/ghcus1ub22eusc8/images/factory/versions/0.0.54188457",
        "offer": "",
        "publisher": "",
        "sku": "",
        "version": ""
      },
      "osDisk": {
        "caching": "ReadOnly",
        "createOption": "FromImage",
        "diffDiskSettings": {
          "option": "Local"
        },
        "diskSizeGB": "86",
        "encryptionSettings": {
          "enabled": "false"
        },
        "image": {
          "uri": ""
        },
        "managedDisk": {
          "id": "/subscriptions/ddf4fb74-17fd-4808-a484-2394c8e0264e/resourceGroups/ghcus1ub22eusc8/providers/Microsoft.Compute/disks/EUS-GHCUS1UB22EUSC8-0085-8ecd3cc8",
          "storageAccountType": "Standard_LRS"
        },
        "name": "EUS-GHCUS1UB22EUSC8-0085-8ecd3cc8",
        "osType": "Linux",
        "vhd": {
          "uri": ""
        },
        "writeAcceleratorEnabled": "false"
      },
      "resourceDisk": {
        "size": "14336"
      }
    },
    "subscriptionId": "ddf4fb74-17fd-4808-a484-2394c8e0264e",
    "tags": "OperatorOverridableTenantSettings.Tenant.Setting.BypassCmPeSyncForRepaves:True;SkipASMAV:true;SkipASMAzSecPack:true;SkipASMAzSecPackAutoConfig:true;SkipLinuxAzSecPack:true;SkipWindowsAzSecPack:true;platformsettings.host_environment.service.platform_optedin_for_rootcerts:true",
    "tagsList": [
      {
        "name": "OperatorOverridableTenantSettings.Tenant.Setting.BypassCmPeSyncForRepaves",
        "value": "True"
      },
      {
        "name": "SkipASMAV",
        "value": "true"
      },
      {
        "name": "SkipASMAzSecPack",
        "value": "true"
      },
      {
        "name": "SkipASMAzSecPackAutoConfig",
        "value": "true"
      },
      {
        "name": "SkipLinuxAzSecPack",
        "value": "true"
      },
      {
        "name": "SkipWindowsAzSecPack",
        "value": "true"
      },
      {
        "name": "platformsettings.host_environment.service.platform_optedin_for_rootcerts",
        "value": "true"
      }
    ],
    "userData": "",
    "version": "0.0.54188457",
    "vmId": "f8fa9a50-744a-497b-ad9b-485bbd782d1c",
    "vmScaleSetName": "",
    "vmSize": "Standard_DS2_v2",
    "zone": ""
  },
  "network": {
    "interface": [
      {
        "ipv4": {
          "ipAddress": [
            {
              "privateIpAddress": "10.1.0.67",
              "publicIpAddress": ""
            }
          ],
          "subnet": [
            {
              "address": "10.1.0.0",
              "prefix": "16"
            }
          ]
        },
        "ipv6": {
          "ipAddress": []
        },
        "macAddress": "000D3A9DEB82"
      }
    ]
  }
}

ArneTR commented 2 months ago

Hey all,

the feature is now live!

You can see it for instance in this sample PR we have made for the falco folks to showcase the functionality:

https://github.com/green-coding-solutions/falco/pull/2

It gets written to the Pull Request comments and also aggregated in our Eco-CI Dashboard here: https://metrics.green-coding.io/ci.html?repo=green-coding-solutions/falco&branch=2/merge&workflow=87453986

Giving you all a ping and an opportunity to ask questions before we close this issue from our side.

@mrchrisadams @rossf7

mrchrisadams commented 2 months ago

YES BRUV THIS IS VERY COOL!

Here's the text I saw in the PR. It might be worth being explicit about the source of the data, because carbon intensity can be described a number of ways, and this appears to be a location based figure.

City: Boydton, Lat: 36.677696, Lon: -78.37471
Carbon Intensity for this location: 339 gCO₂eq/kWh
CO2eq emitted for this job: 0.016625 gCO₂eq

I don't think this would account for any renewables purchased by firms who run infrastructure, and it might be worth linking to some definitions, but it's definitely a thing you could add later - nice work!
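
For readers who want to relate the two numbers in that PR text, here is a small worked sketch. It assumes the emission figure is simply the job's energy multiplied by the location-based grid intensity; the thread does not spell out the exact formula Eco-CI uses, and the function names are invented for this example.

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6 MJ


def co2eq_grams(energy_joules, intensity_g_per_kwh):
    """Location-based emissions: energy in kWh times grid intensity."""
    return energy_joules / JOULES_PER_KWH * intensity_g_per_kwh


def implied_energy_joules(co2eq_g, intensity_g_per_kwh):
    """Invert the formula to recover the energy behind a reported figure."""
    return co2eq_g / intensity_g_per_kwh * JOULES_PER_KWH
```

Under that assumption, `implied_energy_joules(0.016625, 339)` comes out at roughly 176.5 J for the job quoted above.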

ribalba commented 2 months ago

@mrchrisadams Totally. We use the data from https://www.electricitymaps.com/ but of course this doesn't take into account local differences. How would you express such a thing? I could add a little * as a link to our readme?

mrchrisadams commented 2 months ago

Hi @ribalba !

I had a quick look - if you're using Electricity maps, then this is the link I would use:

https://www.electricitymaps.com/methodology#carbon-intensity-and-emission-factors

There's also some more detailed guidance in the accounting guide for the super energy nerds - it goes into more detail about the different ways people talk about how clean energy is for carbon accounting purposes:

https://www.electricitymaps.com/reports-and-guides/accounting-guide

However, I think the first link above would suffice. That's what I would link to, anyway :D

ribalba commented 2 months ago

Hey @mrchrisadams ,

Sorry for not getting back earlier, I was moving house last week. I added a little * to reference your link:

https://github.com/green-coding-solutions/eco-ci-energy-estimation/pull/64

How does this look to you?

@ArneTR please chime in if you have an opinion.