Green-Software-Foundation / real-time-cloud

How can cloud regions be mapped to electricity grids for carbon metrics? #7

Closed rossf7 closed 5 months ago

rossf7 commented 1 year ago

Cloud providers use region codes to identify the physical location of hardware. Often workloads can find the region they are running in using metadata APIs.

Carbon intensity APIs, including WattTime and Electricity Maps, support using geolocation to identify the electricity grid in use (known as the balancing authority in the US).

https://www.watttime.org/api-documentation/#grid-emissions-information
https://static.electricitymaps.com/api/docs/index.html#geolocation

The Carbon Aware SDK has a list of Azure regions and their geolocations, which is very useful, but it would be great to have the same for AWS and GCP.

https://github.com/Green-Software-Foundation/carbon-aware-sdk/blob/ede7ab7bd71b7837a87305047f6974d4061df1b9/src/data/location-sources/azure-regions.json

e.g. the Azure westus region is located in the California ISO Northern balancing authority
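A first cut at this metadata could just be a hand-maintained lookup table, in the spirit of the carbon-aware-sdk's azure-regions.json. A minimal sketch; the non-Azure entries and all coordinates below are illustrative, not provider-published data:

```javascript
// Illustrative region -> approximate geolocation table. Coordinates
// are rough city-level values for demonstration only.
const regionLocations = {
  // Azure westus is in California (California ISO Northern).
  'azure:westus': { latitude: 37.78, longitude: -122.42 },
  // Hypothetical entries for other providers.
  'aws:us-west-1': { latitude: 37.35, longitude: -121.96 },
  'gcp:us-west1': { latitude: 45.60, longitude: -121.18 },
};

// Look up the approximate location for a provider/region pair.
function locate(provider, region) {
  const loc = regionLocations[`${provider}:${region}`];
  if (!loc) throw new Error(`unknown region: ${provider}:${region}`);
  return loc;
}
```

The resulting lat/lon pair is what the WattTime and Electricity Maps geolocation endpoints take as input.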

rossf7 commented 1 year ago

@adrianco @seanmcilroy29 This probably isn't the first problem we need to solve but having this metadata for AWS and GCP would be very useful.

Do you know if there are any restrictions on publishing it?

I found this issue from Carbon Hack 22 to add the AWS regions. Maybe we can reopen it? https://github.com/Green-Software-Foundation/carbon-aware-sdk/issues/183

rossf7 commented 1 year ago

Google Cloud publishes the closest city to each of its regions. https://cloud.google.com/compute/docs/regions-zones

We could use a coarse geocoder (something like https://geocode.earth/blog/2019/almost-one-line-coarse-geocoding/ ) to get the location of each city. This should be good enough resolution to identify the correct balancing authority for each region.
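Concretely, the coarse-geocoding step could be as small as a city → coordinates table, since GCP's docs give the nearest city per region. A sketch, with approximate coordinates and a hand-picked subset of regions:

```javascript
// Sketch of a coarse geocoder: a tiny city -> [longitude, latitude]
// table. Coordinates are approximate, for illustration only.
const cityCoords = {
  'Council Bluffs, Iowa': [-95.86, 41.26],
  'The Dalles, Oregon': [-121.18, 45.60],
  'Ashburn, Virginia': [-77.49, 39.04],
};

// Nearest city per GCP region, per the regions-zones docs.
const gcpRegionCity = {
  'us-central1': 'Council Bluffs, Iowa',
  'us-west1': 'The Dalles, Oregon',
  'us-east4': 'Ashburn, Virginia',
};

// Returns [longitude, latitude] for a region, or undefined.
function coarseGeocode(region) {
  return cityCoords[gcpRegionCity[region]];
}
```

City-level precision is plenty here, since balancing authorities cover areas far larger than a metro region.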

missinglink commented 1 year ago

Hi Chris, Team

I think for what you're trying to do you can probably get away with something much simpler:

```javascript
const fs = require('fs')
const turf = require('@turf/turf')

// Balancing authority boundaries from the HIFLD "Control Areas" dataset.
const data = JSON.parse(fs.readFileSync('Control__Areas.geojson', 'utf-8'))

// point is a [longitude, latitude] pair; returns the properties of
// every control area polygon that contains it.
function search(point) {
  return data.features
    .filter(f => turf.booleanPointInPolygon(point, f.geometry))
    .map(f => f.properties)
}

console.log(search([-101, 47]))
```

I downloaded the data from https://hifld-geoplatform.opendata.arcgis.com/datasets/control-areas/explore
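For anyone curious what booleanPointInPolygon is doing under the hood, here is a dependency-free ray-casting sketch. It handles a single simple ring only; turf additionally handles holes and multipolygons, so it's still the better choice for real GeoJSON. The bounding box below is illustrative, not a real grid boundary:

```javascript
// Ray casting: count how many polygon edges a horizontal ray from the
// point crosses; an odd count means the point is inside.
function pointInRing([x, y], ring) {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // Edge straddles the point's latitude and the crossing is to the right.
    if ((yi > y) !== (yj > y) &&
        x < ((xj - xi) * (y - yi)) / (yj - yi) + xi) {
      inside = !inside;
    }
  }
  return inside;
}

// A rough rectangle around the example point [-101, 47] used above.
const box = [[-104, 45], [-96, 45], [-96, 49], [-104, 49]];
```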

(Two screenshots attached, 2023-08-31.)

adrianco commented 10 months ago

The GCP dataset on Bigtable does include explicit info on which grid each region is connected to, discussed here: https://github.com/Green-Software-Foundation/real-time-cloud/issues/14

rossf7 commented 9 months ago

Hi @adrianco @mrchrisadams I looked into this some more and wrote some Go code based on our Green Web Foundation grid-intensity-go tool. For each region it gets both the Electricity Maps Zone ID and the WattTime Region IDs.

https://github.com/rossf7/cloud-region-to-grid-carbon-mapping

The results are in this Google Sheet. https://docs.google.com/spreadsheets/d/1own00fuI6tVimq3Y3XnPioqTRX0mixuxEkcKQmEmG8s/edit?usp=sharing

For Azure I found that the region coordinates in the carbon-aware-sdk are sourced from the Azure Region API and are also available via the Azure CLI (az account list-locations).

For Google Cloud and AWS I can't find a source for the geolocations, so I used the OpenStreetMap Nominatim API with the region descriptive names. This doesn't work for some regions, like North Virginia, so in those cases I added a manual override (e.g. Richmond).
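The override-then-geocode pattern can be sketched like this. The override entries and the stub geocoder below are illustrative; the real tool calls the Nominatim API for the fallback:

```javascript
// Manual overrides for region names that don't geocode well.
const overrides = {
  // "North Virginia" isn't a geocodable place, so use a nearby city.
  'us-east-1': 'Richmond, Virginia',
};

// Prefer the override; otherwise use the provider's descriptive name.
function placeName(region, description) {
  return overrides[region] ?? description;
}

// Stand-in for a real geocoder call (e.g. Nominatim's search endpoint).
// Returns [longitude, latitude] for names it knows about.
function geocodeStub(name) {
  const known = { 'Richmond, Virginia': [-77.44, 37.54] };
  return known[name];
}
```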

These coordinates should be granular enough to map to the correct grid without leaking any sensitive info.

Although of course it would be much better if the geolocations were published by the cloud provider!

@missinglink Thanks for the link to the dataset with the US grid boundaries. That is very handy and I've bookmarked for future use.

jawache commented 8 months ago

For Impact Framework we have to build a model plugin which maps cloud region → lat/lon, which can then be used to get carbon intensity data from WattTime, Electricity Maps, or any other source.

I'd suggest lat/lon is the correct level of granularity to ensure the broadest integration: it works with all the services out there (WattTime, Electricity Maps, and others), and there are also services that try to measure carbon intensity intra-grid, i.e. the difference in carbon intensity on one side of a large grid vs. another, taking into account the grid infrastructure used to transmit energy (I forget the name, it begins with R; Microsoft uses them).

I'd suggest that however we figure this out, it's going to have to be a file maintained by the GSF. We've done similar things in Impact Framework, maintaining our own metadata about the cloud instance types required for measurement: just a CSV file that we maintain. As long as that file is actively used by, say, IF, there will be lots of loud complaints when a region is wrong or doesn't exist (which ensures the file remains updated).
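Such a GSF-maintained CSV might look like the following; the column names and rows here are hypothetical, just to illustrate how little is needed for a region → lat/lon mapping:

```javascript
// Hypothetical schema for a region metadata CSV, plus a minimal
// parser producing one object per region row.
const csv = `cloud-provider,region,latitude,longitude
azure,westus,37.78,-122.42
gcp,us-west1,45.60,-121.18`;

function parseRegions(text) {
  const [header, ...rows] = text.trim().split('\n');
  const cols = header.split(',');
  return rows.map(row => {
    const values = row.split(',');
    return Object.fromEntries(cols.map((c, i) => [c, values[i]]));
  });
}
```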

rossf7 commented 8 months ago

@jawache @adrianco An IF model plugin for this makes a lot of sense to me.

As does having a CSV maintained by the GSF as the source of truth. Other providers could even submit their regions if they think it's beneficial and it makes sense for IF to support them.

I would gladly help with collating the data and implementing the plugin if that is useful.

Should we close this in favour of an issue in the IF repo?

seanmcilroy29 commented 5 months ago

Group agreed to close. Geolocation has been added to IF carbon-free energy.