Green-Software-Foundation / standards-wg

GSF Standards Working Group

[Project Proposal] PRFAQ for real time carbon metrics #67

Closed seanmcilroy29 closed 10 months ago

seanmcilroy29 commented 1 year ago

[THIS IS A PRE-DRAFT PR-FAQ; EVERYTHING HERE IS SPECULATIVE, NO CLOUD PROVIDERS HAVE AGREED TO DO ANYTHING YET]

AMAZON, MICROSOFT AND GOOGLE JOINTLY ANNOUNCE SUPPORT FOR GREEN SOFTWARE FOUNDATION STANDARDIZED REAL-TIME ENERGY AND CARBON METRICS

Carbon measurement reports move from monthly totals to minute-by-minute metrics, allowing real-time feedback and optimization of cloud workload carbon footprints by providing information that is otherwise only available to datacenter workloads.

Seattle, Washington – October 17th, 2023 – The three leading cloud providers have agreed to provide real-time energy and carbon metrics according to the Metrics for Energy and Carbon in Real-Time Standard (MEC-RT) defined by the Green Software Foundation.

Cloud providers are the largest purchasers of renewable energy in the world, but so far they have provided their customers with carbon information only on a monthly basis, a few months in arrears. Customers have therefore had to produce their own real-time estimates for cloud workloads, using public information that doesn't include those purchases and so overestimates carbon footprints. As part of the information technology supply chain, cloud providers need to supply real-time carbon metrics that can be aggregated by workload, then allocated and apportioned through the supply chain to satisfy regulations that are in place in Europe and California, and emerging elsewhere. Cloud providers build their own custom silicon and systems designs, and optimize them for low power consumption and to reduce the carbon footprint of their supply chain. Using MEC-RT, the efficiency benefits combined with the renewable energy purchases of cloud providers can be compared directly to datacenter alternatives for specific workloads.

Many software as a service (SaaS) providers run multi-tenant workloads on cloud providers. To supply their own customers with carbon footprint estimates, the instance level data from MEC-RT needs to be allocated and attributed across workloads. The Kepler project hosted by the Cloud Native Computing Foundation allocates the energy usage of a host node to the active pods and containers running in that node, so that energy and carbon data can be reported for workloads running on Kubernetes. In datacenter deployments Kepler can directly measure energy usage and obtain carbon intensity data from the datacenter operator. Cloud providers block direct access to energy usage metrics as part of their multi-tenant security model, but can safely provide energy data to Kepler via MEC-RT at one minute intervals.
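As a rough illustration of the allocation step described above, the sketch below splits one interval of host energy across pods in proportion to CPU time. This is an assumption made for illustration only: Kepler's actual attribution model is more involved, and the function name, pod names, and numbers here are hypothetical.

```python
def allocate_energy(host_joules, cpu_seconds_by_pod):
    """Split a host's energy for one interval across pods in proportion to CPU time.

    Hypothetical helper: a simple proportional model, not Kepler's actual algorithm.
    """
    total_cpu = sum(cpu_seconds_by_pod.values())
    if total_cpu == 0:
        # Nothing ran in this interval, so nothing is attributed to any pod.
        return {pod: 0.0 for pod in cpu_seconds_by_pod}
    return {
        pod: host_joules * cpu / total_cpu
        for pod, cpu in cpu_seconds_by_pod.items()
    }


# One minute of host energy (made-up value) shared by three made-up pods.
shares = allocate_energy(6000.0, {"checkout": 30.0, "search": 10.0, "batch": 20.0})
```

In this sketch idle host energy is spread across the active pods; a real implementation would need an explicit policy for idle and shared overhead.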

The carbon intensity of electricity obtained from the grid depends on location and varies continuously, but estimates are available on an hourly basis. These have been used for so-called "24x7 Location Model" monthly carbon reports by GCP in particular. However these estimates don't take into account private power purchase agreements (PPAs) where cloud providers have their own supply of renewable energy. The alternative is to report data based on the energy that has been purchased according to the so-called "Market model" which includes PPAs, and is the basis of the AWS and Azure monthly reports. MEC-RT includes both of these standard reporting models, and AWS, Azure and GCP all plan to report data using both models.

Energy usage is defined as Scope 2 by the Greenhouse Gas Protocol standard. There is also a small amount of Scope 1 fuel burned in backup generators, in heating buildings, and by staff commuting to work. Scope 3 reports on the supply chain including silicon and computer hardware manufacturing, transport, datacenter construction, and recycling. The proportion of renewable energy is increasing over time, and as a result Scope 3 is tending to dominate carbon footprints. All three scopes are reported by MEC-RT.

The current monthly reports are delayed by several months so that there is time to gather accurate data in all regions around the world for a definitive report. In order to provide data in real time, preliminary estimates of the carbon intensity and supply chain data need to be supported. MEC-RT reports energy as a single value, but uses a confidence interval and a most likely value for the carbon footprint of each scope. As better carbon intensity data becomes available over time, the energy data can be re-processed to produce new carbon data, and the confidence interval narrows. The same metric schema can be used to produce MEC-Monthly roll-up data that isn't useful for optimization, but is well suited for carbon audit reports.
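The re-processing idea can be sketched as follows: energy is a single value, carbon intensity is a (low, most likely, high) triple, and the same energy reading is converted again once better intensity data arrives. The `Estimate` type, the function name, and all numbers are hypothetical and not taken from any MEC-RT specification.

```python
from dataclasses import dataclass

JOULES_PER_KWH = 3.6e6


@dataclass(frozen=True)
class Estimate:
    """A most likely value with lower and upper confidence bounds (hypothetical type)."""
    low: float
    likely: float
    high: float


def carbon_grams(energy_joules, intensity_g_per_kwh):
    """Convert one energy value into a carbon estimate using an intensity estimate."""
    kwh = energy_joules / JOULES_PER_KWH
    return Estimate(
        low=kwh * intensity_g_per_kwh.low,
        likely=kwh * intensity_g_per_kwh.likely,
        high=kwh * intensity_g_per_kwh.high,
    )


energy = 7.2e6  # joules, i.e. 2 kWh (made-up reading)

# Preliminary intensity estimate, available in real time (made-up numbers).
preliminary = carbon_grams(energy, Estimate(300.0, 400.0, 550.0))

# The same energy re-processed later with better data: the interval narrows.
revised = carbon_grams(energy, Estimate(380.0, 400.0, 430.0))
```

Because the energy value is stored unchanged, only the intensity input needs to be replaced when a revised carbon figure is produced.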

Customers and partners will access the MEC-RT metrics as time-series data via the cloud provider's default metric interface: AWS CloudWatch, Azure Monitor, and Google Cloud Monitoring. For Kubernetes, Kepler will export to Prometheus on all cloud platforms.

Amazon VP Sustainability Kara Hurst said [MADE UP QUOTE]"Our customers and partners asked us for detailed information on the carbon footprint of their workloads in a standard format, as they optimize for upcoming regulations and deliver on The Climate Pledge, and we're happy to be working with the GSF and cooperating with our colleagues at other cloud providers to meet this need."

Google VP Sustainability Kate Brandt said [MADE UP QUOTE]"Google pioneered the hourly 24x7 carbon measurement capability to support optimizations in time and space. We're very happy to extend this into a standardized minute-by-minute data feed that is optimized to support Kubernetes-based workloads".

Microsoft VP Sustainability Melanie Nakagawa said [MADE UP QUOTE]"When we helped launch the Green Software Foundation our intent was to collaborate across the industry to come up with standards that our partners and customers can use to reduce their carbon footprint. We're very happy to support this real-time data feed, and to provide the first reference implementation as a proof of concept".

Harness CEO Jyoti Bansal said [APPROVED REAL QUOTE] _"We always wanted to provide our customers the ability to view their carbon footprint in the context of their cloud cost spend and idle/unused resources across all cloud providers. By ingesting the MEC-RT data, we may finally be able to get the information we need in a standard form"_.

Salesforce VP Sustainability Patrick Flynn said [MADE UP QUOTE]"Salesforce is dedicated to using its full power to save the planet, and that means we need to be able to measure and optimize our own workloads, and to be able to tell our customers what the carbon footprint of their use of Salesforce amounts to. In the past we've used crude carbon footprint estimation methods, and we're excited to be able to give much more precise and actionable data to our engineers and customers".

CloudZero CTO and Founder Erik Peterson said [MADE UP QUOTE]"As a leading cloud cost optimization tool that works across all the primary cloud platforms, we're delighted to see this new standard emerge, and look forward to bringing carbon optimization capabilities to our customer base".

To learn more, go to https://greensoftware.foundation/projects and to see the MEC-RT specification see [MADE UP URL] https://github.com/Green-Software-Foundation/mec-rt/.

FREQUENTLY ASKED QUESTIONS

Question: Why are the quotes made up?

Answer: The quotes are initially intended to indicate how we think key supporters will react to this announcement. The people are real, but the words are suggested. As this document is shared and refined, they will be replaced by real quotes. Jyoti Bansal of Harness approved his quote. Erik Peterson has been asked for a real quote. Other people mentioned have not been contacted directly, although versions of this document have been supplied to AWS, Azure and GCP.

Question: Why do cloud providers need to support MEC-RT? Should other cloud providers implement it as well?

Answer: The underlying information is only available internally at cloud providers, and there needs to be a common mechanism to share it, so that customers can measure the carbon footprint of their workloads, and so that cloud workloads aren't at a disadvantage compared to datacenter workloads. We encourage all cloud providers to adopt MEC-RT.

Question: How does MEC-RT relate to other Green Software Foundation standards like Software Carbon Intensity (SCI)?

Answer: MEC-RT is needed to obtain underlying carbon measurements that are then apportioned to transactions and other business metrics so that SCI can be calculated for a cloud based workload.
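For context, the SCI specification defines the score as SCI = ((E × I) + M) per R, where E is energy, I is carbon intensity, M is embodied emissions, and R is the functional unit. A minimal sketch of how MEC-RT style inputs could feed that formula follows; the function name and all input values are made up for illustration.

```python
def sci(energy_kwh, intensity_g_per_kwh, embodied_g, functional_units):
    """Software Carbon Intensity: ((E * I) + M) per R, in gCO2eq per functional unit."""
    if functional_units <= 0:
        raise ValueError("functional_units must be positive")
    return (energy_kwh * intensity_g_per_kwh + embodied_g) / functional_units


# Made-up numbers: 2 kWh at 400 gCO2eq/kWh, 200 g embodied, 10,000 transactions.
score = sci(energy_kwh=2.0, intensity_g_per_kwh=400.0,
            embodied_g=200.0, functional_units=10_000)
```

Here the energy and intensity inputs are the values a real-time feed would supply, and the transaction count comes from the workload's own business metrics.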

Question: What are the security issues around energy measurement?

Answer: There is a class of attacks that use very accurate measurements of CPU energy use to detect the different code paths that decryption algorithms take when they check whether keys are valid, and these can be used to break the algorithm. In addition, in a multi-tenant platform there may be more than one customer workload sharing a physical host, and the energy usage of that host is affected by the total workload in ways that break the strong isolation guarantees made by cloud providers. By providing energy data summaries at one minute intervals the energy data is good enough for carbon estimation, and if necessary can be dithered to mask any signal that could possibly cause security issues.
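One way to picture the summarize-then-dither idea is sketched below: fine-grained samples are summed into one-minute buckets, and each bucket is perturbed by a small random factor. The proposal does not say how dithering would be done, so the uniform noise, the 2% default, and the function name are all assumptions.

```python
import random


def minute_summaries(samples, rng, dither_fraction=0.02):
    """Sum (unix_seconds, joules) samples per minute, then dither each total.

    Hypothetical sketch: uniform multiplicative noise of +/- dither_fraction.
    """
    totals = {}
    for timestamp, joules in samples:
        minute_start = int(timestamp) // 60 * 60
        totals[minute_start] = totals.get(minute_start, 0.0) + joules
    return {
        minute: total * (1.0 + rng.uniform(-dither_fraction, dither_fraction))
        for minute, total in totals.items()
    }


# Two minutes of made-up one-second samples at 5 joules each.
samples = [(t, 5.0) for t in range(120)]
summary = minute_summaries(samples, random.Random(42))
```

Per-minute totals hide the sub-second variations that the attacks described above depend on, while staying close enough to the true totals for carbon estimation.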

Question: What metric format does MEC-RT use?

Answer: MEC-RT uses the same OpenMetrics standard for metrics as Prometheus and other recent tools. Each data point consists of a timestamp, a metric, and name/value pairs that describe it. Metrics consist of metadata such as name, type, units, and a stream of data points. https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md
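To make the format concrete, here is a small sketch that renders one counter in the OpenMetrics text format. The metric family name `mecrt_energy_joules`, the labels, and the values are hypothetical; no MEC-RT metric names have been defined.

```python
def escape_label(value):
    """Escape backslash, double quote and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(family, unit, help_text, points):
    """Render one counter metric family as OpenMetrics text.

    points is a list of (labels_dict, value, unix_timestamp) tuples.
    """
    lines = [
        f"# TYPE {family} counter",
        f"# UNIT {family} {unit}",
        f"# HELP {family} {help_text}",
    ]
    for labels, value, timestamp in points:
        label_text = ",".join(
            f'{key}="{escape_label(val)}"' for key, val in sorted(labels.items())
        )
        # Counter samples carry the _total suffix on the family name.
        lines.append(f"{family}_total{{{label_text}}} {value} {timestamp}")
    lines.append("# EOF")
    return "\n".join(lines) + "\n"


text = render_counter(
    "mecrt_energy_joules",  # hypothetical metric family name
    "joules",
    "Hypothetical cumulative energy used by an instance.",
    [({"region": "us-west-2", "instance": "i-example"}, 5400.0, 1697500800)],
)
```

The metadata lines carry the name, type, and unit, and each sample line carries the name/value label pairs, the value, and the timestamp described in the answer above.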

Question: What carbon footprint information is currently available from cloud providers?

Answer: Monthly totals are provided by AWS, Azure and GCP, with varying levels of detail. Currently, AWS and Azure only provide data using the Market Model, and GCP only provides data using the Location Model. This is suitable for audit reports, but not useful for optimization tools and projects, and doesn't give enough detail to allow allocation and attribution for SaaS providers to pass on carbon footprint data to their customers.

Question: Why are MEC-RT carbon metrics reported as a confidence interval, how does that work, and how should they be produced and consumed?

Answer: The input data comes from many sources of varying quality. In particular, as cloud regions are scattered around the world, there are different standards and interfaces for obtaining carbon intensity from the grid, as well as high variability over time in some areas. Where estimates are being produced, the most likely value is reported, but in addition a confidence interval provides a separate upper value and lower value with 95% confidence that the actual value is in that range. When computing with imprecise data, a common technique is to use Monte Carlo methods, which work with distributions as inputs or outputs that can be specified using these three values. In regions that have very low carbon grids like France (Nuclear) or Sweden (Hydro), the variation is low so the confidence interval will be narrow. In regions that rely on solar and wind backed up by carbon based generation, there will be a much wider confidence interval.
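A minimal Monte Carlo sketch of combining several three-value estimates is shown below. It treats each (low, most likely, high) triple as a triangular distribution, which is a simplifying assumption: a 95% interval does not strictly bound the value, and other distributions could be used. The function name and inputs are hypothetical.

```python
import random


def simulate_total(estimates, trials=10_000, seed=1):
    """Sum several (low, likely, high) estimates by sampling triangular distributions.

    Returns (lower, median, upper), where lower and upper bracket roughly 95%
    of the simulated totals.
    """
    rng = random.Random(seed)
    totals = sorted(
        sum(rng.triangular(low, high, likely) for low, likely, high in estimates)
        for _ in range(trials)
    )
    lower = totals[int(0.025 * trials)]
    upper = totals[int(0.975 * trials) - 1]
    median = totals[trials // 2]
    return lower, median, upper


# Made-up inputs in grams CO2e: an energy-based estimate and a supply chain estimate.
lower, median, upper = simulate_total([(600.0, 800.0, 1100.0), (50.0, 60.0, 90.0)])
```

The combined interval comes out narrower than simply adding the lower bounds and the upper bounds, because the inputs are unlikely to all sit at their extremes at once.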

Question: Why are confidence intervals also used for Scope 3 supply chain carbon metrics?

Answer: For scope 3 supply chain data, there are a lot of unknowns and estimated values, as well as batch to batch variation in builds of otherwise identical hardware. As the data sources improve, confidence intervals will narrow over time.

Question: How can optimization algorithms use confidence intervals?

Answer: For statistically valid comparisons between two values, the values are only significantly different if they have non-overlapping confidence intervals. So an optimization algorithm should treat input metric confidence intervals that overlap as not significantly different, and try to generate results that don't overlap before claiming success. The energy metrics provide a more precise value to optimize for.
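The rule above can be sketched as a simple check on (low, most likely, high) triples. The function names and values are hypothetical, and requiring non-overlapping intervals is a deliberately conservative test.

```python
def intervals_overlap(a, b):
    """True if two (low, likely, high) estimates have overlapping intervals."""
    return a[0] <= b[2] and b[0] <= a[2]


def is_significant_reduction(before, after):
    """Claim success only if the whole 'after' interval sits below 'before'."""
    return after[2] < before[0]


# Made-up carbon estimates in grams CO2e: (low, most likely, high).
baseline = (760.0, 800.0, 860.0)
small_change = (700.0, 770.0, 830.0)   # overlaps the baseline: not significant
large_change = (600.0, 650.0, 700.0)   # entirely below the baseline

results = (
    is_significant_reduction(baseline, small_change),
    is_significant_reduction(baseline, large_change),
)
```

An optimizer using this check would keep iterating on `small_change` style results rather than reporting them as improvements.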

Question: What is the California supply chain rule?

Answer: The rule is in progress as of June, but should be settled one way or the other by October, which is the suggested date of the PRFAQ. https://www.motherjones.com/environment/2023/06/california-bill-climate-corporate-data-accountability-supply-chain-carbon-emissions/

nttDamien commented 1 year ago

Interesting. I noticed: https://github.com/Green-Software-Foundation/mce-rt/ link points to Carbon Aware SDK, and https://github.com/Green-Software-Foundation/mce-rt/ gives me a 404.

mvaltas commented 1 year ago

Would it be possible to cite a reference to the assertion that cloud providers are the largest purchasers of renewable energy?

adrianco commented 1 year ago

The text that Sean included in this proposal above is from a previous version of my proposal that was shared privately over Slack. The latest version was submitted publicly via this pull request to the pr-faq section of the GSF site. https://github.com/Green-Software-Foundation/pr-faqs/pull/10/commits/887177bb388bde1d7b0eacd9735c35f1f90f6648

There are minor updates and clarifications.

adrianco commented 1 year ago

> Would it be possible to cite a reference to the assertion that cloud providers are the largest purchasers of renewable energy?

CEBA tracks this - for 2022 this story shows additional capacity, they probably have a link somewhere to the total capacity, however Amazon has been saying for a while that they are the largest corporate buyer of renewable capacity in the US, in Europe, and in the world. https://cebuyers.org/blog/ceba-energy-customers-announce-record-high-of-nearly-17-gw-of-clean-energy-in-2022-despite-policy-and-market-challenges/

adrianco commented 1 year ago

> Interesting. I noticed: https://github.com/Green-Software-Foundation/mce-rt/ link points to Carbon Aware SDK, and https://github.com/Green-Software-Foundation/mce-rt/ gives me a 404.

Yes, that's because it doesn't exist yet. There's also a typo in the URL in this older version of the proposal. It should be MEC-RT, not MCE-RT.

adrianco commented 1 year ago

The contacts that I have shared this with before releasing publicly via GSF are:

- AWS: Senior Principal Engineer Ryan Eccles, Product Manager Jon Randelman, and Ryan is going to discuss this with engineering VP Laura Grit. [Kara Hurst is the overall VP of Sustainability and has not been contacted directly].
- Azure: Jason Oppler, GM Cloud Advocacy Patrick Chanezon, Thomas Lewis. [Melanie Nakagawa is the overall VP of Sustainability and has not been contacted directly].
- GCP: Jen Bennett - Office of the CTO. [Kate Brandt is the overall VP of Sustainability and has not been contacted directly].
- Harness.io: Harish Doddala; CEO Jyoti Bansal approved the updated quote.
- CloudZero: Erik Peterson, CTO and Founder.
- Salesforce: I have some contacts at Salesforce, will reach out soon [Patrick Flynn VP Sustainability has not been contacted].

adrianco commented 1 year ago

Thoughts on the name. I wanted Real Time in the name, as there are lots of carbon APIs and standards out there, but not many that are focused on providing real-time info minute by minute. I came up with MEC-RT in part so that the same schema could be aggregated into a monthly report as MEC-Monthly - which would be a superset of the current info cloud providers supply. Metrics Energy Carbon - Energy first, then Carbon is derived from the Energy. Anyway if people have other ideas for names, we should discuss them and pick one to run with.

adrianco commented 1 year ago

Relationships to other work:

- MEC-RT would supply actual data needed to measure SCI on cloud providers, that isn't available by any other mechanism.
- MEC-RT would supply actual data that is being modeled and estimated by the open source Cloud Carbon Footprint Tool https://www.cloudcarbonfootprint.org
- MEC-RT would supply actual data that is being modeled and estimated by Boavizta https://doc.api.boavizta.org
- MEC-RT would supply actual data that is being modeled and estimated by Trycarbonara https://trycarbonara.github.io/docs/dist/html/index.html

ciril-emaps commented 1 year ago

Hi @adrianco, would love to be involved in this, there might be opportunities to reach out to our network of stakeholders that are very relevant for this.

rootfs commented 1 year ago

@seanmcilroy29 @adrianco this is an exciting effort, would love to see cloud providers on board. Would you be available to join the Kepler community meeting and share your insight? Please find the meeting schedule here.

adrianco commented 1 year ago

> @seanmcilroy29 @adrianco this is an exciting effort, would love to see cloud providers on board. Would you be available to join the Kepler community meeting and share your insight? Please find the meeting schedule here.

Thanks, yes, I'm on pacific time so the July 4th Kepler meeting would be at 5am for me, but July 18th at 5pm looks good.

rootfs commented 1 year ago

> > @seanmcilroy29 @adrianco this is an exciting effort, would love to see cloud providers on board. Would you be available to join the Kepler community meeting and share your insight? Please find the meeting schedule here.
>
> Thanks, yes, I'm on pacific time so the July 4th Kepler meeting would be at 5am for me, but July 18th at 5pm looks good.

Sounds great! I'll add this topic to the July 18th agenda.

adrianco commented 1 year ago

I just spoke with Carbonara, and encouraged them to join GSF. They seem to be well aligned with this effort.

rootfs commented 1 year ago

> I just spoke with Carbonara, and encouraged them to join GSF. They seem to be well aligned with this effort.

That's cool! We did some PoC using Carbonara data with Kepler to plot the workload level CO2 intensity.

adrianco commented 1 year ago

Real quote: “Sustainability has long been a concern for cloud engineering teams. But for as long as it’s been on engineers’ minds, the missing link in making sustainability a non-functional requirement has been the data. Every engineering decision is a buying decision — and consequently, an emissions decision — but without real-time data on cloud infrastructure’s cost and carbon consequences, engineers haven’t been able to prioritize efficiency as they build. MEC-RT is a crucial step in establishing a universal definition of cloud sustainability; now it’s up to organizations to quantify and optimize their cloud efficiency in the name of sustainability — an existentially urgent concern for all of us.” — Erik Peterson CTO and Founder, CloudZero

PindyBhullar commented 1 year ago

Following the meeting on 6th July 2023, UBS will be happy to collaborate on this project proposal.

adrianco commented 1 year ago

Kepler metrics are counters of the joules of energy for different components of the system. https://sustainable-computing.io/design/metrics/
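Because these are cumulative counters, consumers typically work with the difference between two scrapes. A minimal sketch of turning two counter readings into average power follows; the function name and readings are hypothetical, and the reset handling mirrors the usual convention for monotonic counters rather than anything specific to Kepler.

```python
def average_watts(previous, current):
    """Average power between two (unix_seconds, cumulative_joules) readings."""
    elapsed = current[0] - previous[0]
    if elapsed <= 0:
        raise ValueError("readings must be in increasing time order")
    delta_joules = current[1] - previous[1]
    if delta_joules < 0:
        # Counter reset (e.g. exporter restart): count only energy since the reset.
        delta_joules = current[1]
    return delta_joules / elapsed


# Made-up readings one minute apart: 3000 J over 60 s.
watts = average_watts((0, 1000.0), (60, 4000.0))
```

The same energy delta, multiplied by a carbon intensity for that interval, gives the carbon figure for the interval.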

adrianco commented 1 year ago

AWS CloudWatch Metrics are defined here https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/viewing_metrics_with_cloudwatch.html - the goal of MEC-RT is to have additional metrics for AWS, Azure and GCP for energy and carbon alongside these existing metrics.

adrianco commented 1 year ago

Beyond the three cloud providers mentioned, who else should we be talking to, and who are the contacts? e.g.

- AWS - not GSF member - Sr Principal Engineer Ryan Eccles, DE Laura Grit
- Azure - GSF Patrick Chanezon, Thomas Lewis - Jason Oppler/Product
- GCP - GSF Savannah Goodman - Jen Bennett/Office of the CTO
- Oracle Cloud
- IBM/RedHat
- CloudFlare
- DigitalOcean
- OVH
- ...

ciril-emaps commented 1 year ago

Hi Adrian, I thought it would be useful to reach out to Max Schulze, who is the founder of the SDIA.

adrianco commented 1 year ago

> Hi Adrian, I thought it would be useful to reach out to Max Schulze, who is the founder of the SDIA.

Thanks!

I looked at the SDIA site, and it seems to have lots of orgs signed up but the Digital Measurement section looked pretty bare. We need to start a list of supporters, including Max/SDIA, once we get going on the project.

tmcclell commented 12 months ago

Proposal for launch incubation project, assignment of PM (Sean), and review budget proposal in the future. Recommendation of submitting project with new format retroactively. Approved by OC.

adrianco commented 12 months ago

The OpenMetrics definition is nicely formalized. I think we should produce an OpenMetrics schema as the portable baseline, and then adapt it to services like CloudWatch etc. https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md

Henry-WattTime commented 12 months ago

Next steps/workplan from WG meeting:

- Set up repo for project
- Get people who want to consume the API, understand their needs
- Prior art, what other metrics have been exposed in the past
- What's the natural way for this playing out
- 'open metrics' probably the best format for this, best export standard, very portable, lots of people understand and can use it
- Define an open metric standard for
- units for each metric
- build versions of that with cloud vendors (Microsoft, Amazon, Google, etc.)
- CNCF carbon metric tool meeting next week, Kepler team?
- contacts for other cloud vendors
- Track interest to convey need and approach
- Talk to GSF teams to ensure on board

adrianco commented 12 months ago

Discussion group started https://github.com/orgs/Green-Software-Foundation/discussions/34

adrianco commented 12 months ago

There will be a meeting with the Kepler community on July 18th 2023 https://github.com/sustainable-computing-io/community/blob/main/community-event.md - this isn't at a good time for Europe, so hopefully it will be recorded.

rootfs commented 12 months ago

@adrianco yes, the meetings are recorded.

ciril-emaps commented 12 months ago

@rootfs would you be able to post the link of the recording in this issue, once it is uploaded?

rootfs commented 12 months ago

@ciril-emaps sure thing. Alternatively, please also find all the meeting notes and previous recordings here

adrianco commented 11 months ago

The official GSF URL for the project is here https://github.com/Green-Software-Foundation/real-time-cloud - we can decide later on what the right name for the standard will be, but Real Time Cloud is the most succinct way to scope this work.

rootfs commented 11 months ago

Thanks @adrianco for sharing this proposal with the Kepler community. The meeting recording is still being processed but can be found here when ready. Meeting minutes can be found here

bertysentry commented 11 months ago

@adrianco Metrics should be defined as semantic conventions in OpenTelemetry (not OpenMetrics, its sort-of predecessor in the effort to create an open standard for observability).

OpenTelemetry is supported by all major cloud providers (AWS, Azure, and GCP notably).

More:

Sealjay commented 11 months ago

How does this relate to the new ESG lake release from Microsoft, which also releases a common data model? https://learn.microsoft.com/en-us/industry/sustainability/project-esg-lake-overview#data-model

seanmcilroy29 commented 10 months ago

Project approved - Issue can be closed