Green-Software-Foundation / if

Impact Framework
https://if.greensoftware.foundation/
MIT License

Carbon QL Design spec #38

Closed srini1978 closed 1 year ago

srini1978 commented 1 year ago

CQL Design

There are two components to CQL:

  1. The application framework, which calls the carbon calculation extendable framework, glues inputs to outputs, etc. This is the ontology project.
  2. The carbon calculation extendable framework (CCEF), or carbon framework, which takes some inputs and outputs estimates of emissions data.

The Application framework is explained in the SCI ontology project requirements specification. Its output could be a JSON document that is passed as input to the carbon model.

What we see below are the specifications of the carbon framework.

High level architecture

Conceptually the high level architecture of how the application model and carbon model are connected is shown below.

The software boundary as defined in the SCI spec can be modelled using the Application model or SCI ontology project. It consists of infrastructure such as virtual machines, databases, message queues, API gateways, network load balancers, etc.

For the specific infrastructure being modelled, corresponding carbon models are chosen. The Ontology project can then pass the relevant operational telemetry to the carbon model as JSON or a collection of parameters. The model calculates the emissions based on the SCI spec and returns the values.

image

Carbon Models - Design principles

The intent of a carbon model in this architecture is to provide a platform for onboarding models that output carbon emission values, and to standardize the interface for calling systems and users. Users can then easily compare emission values across models using the platform.

The underlying theme is that we are not too concerned with the specific nuances of the actual services (serverless, managed services, etc.). Defining the infrastructure is the responsibility of the Application model. Instead, given an infrastructure blueprint for the software, and given that the blueprint can be calculated using a specific model's metadata, we will focus on using the energy coefficients provided by the models to make quick calculations.

The core workings and calculations of the code are less important. What matters most is that each model exposes and adheres to the same interface as all the other CQL carbon models, like the one we discussed here:

https://github.com/Green-Software-Foundation/carbon-ql/issues/31

Interface Design

There are different carbon models available: Etsy, CCF, Boavizta. Each of these models comes with its own guidance on how to use the tool to get emissions values.

We will leverage existing work in this space. The intent is not to rewrite these models in our framework but to create an interface wrapper that calls the existing code and normalises all the interfaces to the same one, so we can swap and compare models. Some of the sample methods that will be standardised in the interface across all models are:

  - Create - creates an instance of the backend model.
  - Dispose - disposes of the instance.
  - History - returns the historical emissions values for the given time period.
  - Snapshot - returns the emissions values at a single point in time.
  - Stream - returns real-time emissions.
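
A minimal sketch of what such a shared interface could look like in Python. The class names, method signatures, and the dictionary-shaped return values are assumptions made for illustration; the issue only names the five methods. `ConstantModel` is a toy backend that exists purely to exercise the interface.

```python
from abc import ABC, abstractmethod
from datetime import datetime
from typing import Any, Dict, Iterator, List


class CarbonModel(ABC):
    """Hypothetical common interface; only the method names come from the spec."""

    @abstractmethod
    def create(self, config: Dict[str, Any]) -> None: ...

    @abstractmethod
    def dispose(self) -> None: ...

    @abstractmethod
    def history(self, start: datetime, end: datetime) -> List[Dict[str, Any]]: ...

    @abstractmethod
    def snapshot(self, at: datetime) -> Dict[str, Any]: ...

    @abstractmethod
    def stream(self) -> Iterator[Dict[str, Any]]: ...


class ConstantModel(CarbonModel):
    """Toy backend that always reports the same value (illustration only)."""

    def create(self, config):
        self.value = float(config["grams_co2e"])

    def dispose(self):
        self.value = None

    def history(self, start, end):
        # A real model would return one entry per measurement interval.
        return [self.snapshot(start), self.snapshot(end)]

    def snapshot(self, at):
        return {"time": at.isoformat(), "grams_co2e": self.value}

    def stream(self):
        # Endless generator; callers pull values with next() or a for loop.
        while True:
            yield self.snapshot(datetime.now())
```

Because every backend implements the same five methods, calling code can swap `ConstantModel` for any other model without changes.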

Communication

The other aspect is how those functions communicate back to the caller when things go wrong. There should be standard error codes that are returned to help debug the issue, so that when a developer (say, a pseudo-Etsy coder) builds the interface, they adhere not only to the function interface but also to a common approach for returning errors.
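
One way this could look, sketched under the assumption that errors are raised as exceptions carrying a shared code. The code names and numeric values below are hypothetical; the issue does not define an actual error-code list.

```python
from enum import IntEnum


class CqlErrorCode(IntEnum):
    # Hypothetical codes for illustration; not defined by the spec.
    OK = 0
    INVALID_INPUT = 1
    MODEL_NOT_FOUND = 2
    BACKEND_UNAVAILABLE = 3


class CqlError(Exception):
    """Common error type every model wrapper would raise."""

    def __init__(self, code: CqlErrorCode, message: str):
        super().__init__(f"[{code.name}] {message}")
        self.code = code
        self.message = message


def require_keys(params: dict, required: list) -> None:
    """Raise a standard INVALID_INPUT error if any required parameter is missing."""
    missing = [key for key in required if key not in params]
    if missing:
        raise CqlError(
            CqlErrorCode.INVALID_INPUT,
            f"missing parameters: {', '.join(missing)}",
        )
```

A caller can then branch on `error.code` regardless of which model produced the error.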

List of Carbon Models

Etsy

The Etsy model was built by producing standard conversion factors from CPU hours or terabyte-hours (depending on whether the resource is compute or storage) to watt-hours. These conversion factors are called Cloud Jewels. They were derived by taking utilization percentages (CPU and storage) together with the corresponding hardware specifications (from SPEC power reports) and applying a formula to convert these into watts.

Average watts = Min Watts + Avg vCPU utilization *(Max watts – Min watts)

Compute Watt-hours = Average watts * vCPU hours.
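
The two formulas above can be written out as a short Python sketch. The function names are mine, utilization is assumed to be a fraction between 0 and 1, and the numbers in the example are placeholders rather than real Cloud Jewels coefficients.

```python
def average_watts(min_watts: float, max_watts: float, avg_vcpu_utilization: float) -> float:
    """Average watts = Min watts + Avg vCPU utilization * (Max watts - Min watts)."""
    if not 0.0 <= avg_vcpu_utilization <= 1.0:
        raise ValueError("avg_vcpu_utilization must be a fraction between 0 and 1")
    return min_watts + avg_vcpu_utilization * (max_watts - min_watts)


def compute_watt_hours(
    min_watts: float,
    max_watts: float,
    avg_vcpu_utilization: float,
    vcpu_hours: float,
) -> float:
    """Compute watt-hours = Average watts * vCPU hours."""
    return average_watts(min_watts, max_watts, avg_vcpu_utilization) * vcpu_hours


# Placeholder inputs: 1 W idle, 5 W at full load, 50% utilization, 10 vCPU hours.
example_wh = compute_watt_hours(1.0, 5.0, 0.5, 10.0)  # 3.0 W average * 10 h = 30.0 Wh
```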

Also, the Etsy model is primarily applicable to GCP cloud migration scenarios.

Pros

Cons:

Interface design for Etsy: Etsy already has some code here: https://github.com/etsy/cloud-jewels. I was imagining that we (or Etsy!) create a wrapper for their existing code that exposes an interface adhering to the CQL standard.

image

Cloud Carbon Footprint

CCF is an extension of Etsy that additionally includes networking and RAM. However, the model is more complicated. It provides formulae for calculating energy usage across the different compute services in AWS, Azure and GCP, and the formulae differ between serverless services, managed services and VMs.

The model is based on billing usage: it connects to the respective billing APIs and retrieves the vCPU hours from there. These are then converted using the same formula as Etsy.

One difference between Etsy and CCF with respect to the above formula is that CCF calls out exclusion scenarios where the formula cannot be used: GPUs, AWS Lambda services, AWS Aurora serverless services, and GKE.

There are more nuances to the model. One good thing, however, is that they provide energy coefficients that can be multiplied with the Telemetry object of our model. This may not be accurate, but it may provide us with a starting point. The coefficients are published here: https://www.cloudcarbonfootprint.org/docs/methodology#appendix-i-energy-coefficients. We can substitute the utilization value with actuals to increase accuracy.

We would use the same concept for CCF, there needs to be a CQL CCF interface. I don’t think they have an SDK to interact with CCF, it’s an application rather than a library. So in this case we either write our own CQL module from scratch, or encourage ThoughtWorks to do so.

image

Carbon Aware SDK (WattTime / Electricity Maps)

The Carbon Aware SDK provides an API and a command line interface that help consumers get carbon intensity data for the electricity grid where the software is running. It internally connects to either WattTime or Electricity Maps as a data provider to supply carbon intensity values.

For the Carbon Aware SDK, we can build a CQL interface. The CQL interface can interact with the Carbon Aware SDK Web API, passing in the region where the software is running and the time of day, and get the emissions data back. Other methods, such as forecasts, average carbon intensity, and batch carbon intensity values, can be leveraged as needed.

In addition, we will also build a CQL interface that can interact directly with the backend data providers, WattTime and Electricity Maps, without needing to call the Carbon Aware SDK.

image

As mentioned below in the interface specification, all input parameters will be passed as dictionary objects. Hence, to call the Carbon Aware SDK, we will pass the following parameters, or a combination of them, as input params: Locations, Time Boundary.
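
As a rough illustration of the dictionary-based inputs, the sketch below turns a params dictionary into a Web API request URL. The dictionary keys (`locations`, `time_boundary`) are hypothetical names mirroring the two parameters listed above. The endpoint path and query parameter names (`/emissions/bylocations`, `location`, `time`, `toTime`) and the base URL are assumptions about the Carbon Aware SDK Web API and should be checked against its documentation.

```python
from urllib.parse import urlencode


def build_emissions_url(base_url: str, params: dict) -> str:
    """Translate CQL-style dictionary params into a Web API query URL.

    Endpoint path and query names are assumptions, not confirmed by this spec.
    """
    # One "location" query entry per requested location.
    query = [("location", location) for location in params["locations"]]
    # The time boundary is assumed to be a (start, end) pair of ISO 8601 strings.
    start, end = params["time_boundary"]
    query += [("time", start), ("toTime", end)]
    return f"{base_url.rstrip('/')}/emissions/bylocations?{urlencode(query)}"


url = build_emissions_url(
    "http://localhost:5073",  # assumed local deployment of the Web API
    {
        "locations": ["eastus", "westus"],
        "time_boundary": ("2023-01-01T00:00:00Z", "2023-01-02T00:00:00Z"),
    },
)
```

The wrapper would then issue the HTTP request and translate the response into the common CQL output format.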

Types of Carbon Models

Thinking through some of the types of carbon models, right now I can think of four categories.

API Carbon Models

E.g. a Boavizta Carbon Model will probably just call the Boavizta API. It will translate CQL inputs into the format the Boavizta API needs and translate the response into a CQL format.

This might be how Intel creates its Carbon Model. We wouldn’t hardcode our power curve/coefficients in code but hide them behind an API so that (a) we can update the numbers easily and (b) only customers who’ve signed an NDA have access to them.

Algorithmic Carbon Models

This is the type you wrote above: hardcoded coefficients with some simple algorithmic logic.

Lookup Carbon Models

This is more like perhaps a CCF model. It would first load up some files (coefficients) across the network (not hardcode them in) and then translate inputs/outputs with the CQL interface.
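
A small sketch of the lookup idea, assuming the coefficients arrive as a JSON document. To stay self-contained, the "file" here is an inline string standing in for something fetched over the network, and the provider name and values are placeholders, not real CCF coefficients.

```python
import json

# Stand-in for a coefficients file loaded over the network (placeholder values).
COEFFICIENTS_JSON = """
{
  "example-cloud": {"min_watts": 1.0, "max_watts": 4.0}
}
"""


def load_coefficients(raw: str) -> dict:
    """Parse the coefficient table; a real model would fetch `raw` at startup."""
    return json.loads(raw)


def estimate_watt_hours(
    coefficients: dict, provider: str, avg_utilization: float, vcpu_hours: float
) -> float:
    """Look up the provider's coefficients and apply the same formula as Etsy."""
    entry = coefficients[provider]
    avg_watts = entry["min_watts"] + avg_utilization * (
        entry["max_watts"] - entry["min_watts"]
    )
    return avg_watts * vcpu_hours


table = load_coefficients(COEFFICIENTS_JSON)
estimate = estimate_watt_hours(table, "example-cloud", 0.5, 4.0)  # 2.5 W * 4 h = 10.0 Wh
```

Keeping the numbers in data rather than code means they can be updated without releasing a new version of the model.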

Wrapper Carbon Models

E.g. If there is an existing library somewhere like the Etsy Cloud Jewels then we might just create a wrapper Carbon Model that just calls that library and translates the inputs/outputs to the CQL interface.

Programming Language

Another point I’ve been thinking about is the language support for CQL. Personally I’d really love if anyone in any language can just type a command like this:

pip install cql-etsy
npm install cql-etsy
nuget install cql-etsy

And they have access to the CQL model. If we limited the whole of CQL to one language we’d drastically reduce our user base. From experience with the CA SDK and its use of Swagger to solve this same problem: Swagger solves multi-language support but makes setup so much harder that people don’t bother. In the hackathon the majority of users didn’t bother and just called the API directly.

For the launch/hackathon I think we should target Python + JavaScript, which cover the broadest market (see this). I.e. every CQL model interface needs to be available in Python + JavaScript.

It’s possible to do that with the core model being written in a native systems language like C/C++/Rust and then a wrapper module in Java, Python, or JavaScript that calls that core native module. E.g. with Java you use JNI, with JavaScript you use the Native Modules feature, and Python has similar functionality.

Approach - Version 1 of carbon QL

Since the Carbon Aware SDK model is mostly well defined, we will build the first carbon QL interface to integrate with it. The interface will be built in Python and be installable from the command line.

The package should accept parameters through the command line, and should:

  1. Accept the models to instantiate as parameters, e.g. Etsy, CCF, Carbon Aware SDK, WattTime, Electricity Maps
  2. Accept dictionary parameters
  3. Provide a help message that details the usage of the carbonQL interface, including the syntax
  4. Provide appropriate error codes that help debug the error.
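
The four requirements above could be sketched with Python's standard `argparse` module as follows. The program name `cql`, the flag names, the model identifiers, and the exit codes are all hypothetical choices for illustration. The dictionary parameters are assumed to be passed as a JSON object string, and the model call itself is stubbed out by echoing the parsed request.

```python
import argparse
import json
import sys

# Hypothetical identifiers for the models named in the list above.
SUPPORTED_MODELS = ["etsy", "ccf", "carbon-aware-sdk", "watttime", "electricity-maps"]

# Hypothetical exit codes; the spec does not define actual values.
EXIT_OK = 0
EXIT_BAD_PARAMS = 2


def build_parser() -> argparse.ArgumentParser:
    # argparse generates the --help message from these descriptions.
    parser = argparse.ArgumentParser(
        prog="cql", description="Call a carbon model through the carbonQL interface."
    )
    parser.add_argument("--model", required=True, choices=SUPPORTED_MODELS,
                        help="name of the carbon model to instantiate")
    parser.add_argument("--params", default="{}",
                        help='model parameters as a JSON object, e.g. \'{"vcpu_hours": 10}\'')
    return parser


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    try:
        params = json.loads(args.params)
    except json.JSONDecodeError as exc:
        print(f"error: --params is not valid JSON: {exc}", file=sys.stderr)
        return EXIT_BAD_PARAMS
    if not isinstance(params, dict):
        print("error: --params must be a JSON object", file=sys.stderr)
        return EXIT_BAD_PARAMS
    # A real implementation would instantiate the model here; this sketch echoes the request.
    print(json.dumps({"model": args.model, "params": params}))
    return EXIT_OK
```

Wired up as a console entry point, this would be invoked as, for example, `cql --model etsy --params '{"vcpu_hours": 10}'`, and the return value of `main` would become the process exit code.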

Detailed Design

We have tried to depict how the carbonQL will be called by the SCI ontology project below. The facade layer can be called by the SCI ontology project by passing the model name and then the model specific methods and/or parameters (either as JSON or dictionary parameters).
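
In code, the facade idea might look roughly like the sketch below: a registry keyed by model name that dispatches a method call with dictionary parameters. `CarbonQL`, `register`, `call`, and `EchoModel` are hypothetical names, and the real facade may differ.

```python
from typing import Any, Callable, Dict


class CarbonQL:
    """Hypothetical facade: looks up a model by name and forwards the call."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], Any]] = {}

    def register(self, name: str, factory: Callable[[], Any]) -> None:
        self._factories[name] = factory

    def call(self, model_name: str, method: str, params: Dict[str, Any]) -> Any:
        if model_name not in self._factories:
            raise KeyError(f"unknown model: {model_name}")
        model = self._factories[model_name]()
        # Only public methods may be invoked through the facade.
        handler = None if method.startswith("_") else getattr(model, method, None)
        if not callable(handler):
            raise AttributeError(f"model {model_name!r} has no method {method!r}")
        return handler(params)


class EchoModel:
    """Toy model that returns its inputs, standing in for a real backend."""

    def snapshot(self, params: Dict[str, Any]) -> Dict[str, Any]:
        return {"model": "echo", "inputs": params}


cql = CarbonQL()
cql.register("echo", EchoModel)
result = cql.call("echo", "snapshot", {"locations": ["eastus"]})
```

The calling project only needs the model name, the method name, and a parameter dictionary, which matches the description above.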

image
srini1978 commented 1 year ago

@jawache @navveenb @Oleg-Zhymolokhov

jawache commented 1 year ago

FYI @srini1978 keeping this issue open; there are still things I think we need to migrate over to our markdown spec.

catblade commented 1 year ago

Are you looking at all at adding networking to this project? That is still an outstanding item, that should probably be looked at at some level.

jawache commented 1 year ago

@catblade yes, networking will be supported (through model plugins). In fact everything is a plugin: this framework just calls plugins that adhere to the same spec, so we can treat them all the same and connect them all together.

jawache commented 1 year ago

Closing this as all of this has either been incorporated or superseded by the latest spec.