Green-Software-Foundation / if

Impact Framework
https://if.greensoftware.foundation/
MIT License

Redesign sci-embodied builtin #917

Closed jmcook1186 closed 2 weeks ago

jmcook1186 commented 1 month ago

What

Rebuild our sci-embodied plugin with new logic.

Why

To generalize our embodied carbon calculator and make it flexible to new data. Using our default values mirrors the CCF methodology described here: https://www.cloudcarbonfootprint.org/docs/embodied-emissions

Context

The general methodology is borrowed from Cloud Carbon Footprint. The logic is as follows:

1) assume a baseline embodied carbon for a hypothetical "standard" minimal rack server
2) get real server specs using a csv lookup (i.e. instance-metadata)
3) for every additional unit of each of five component types (DRAM, CPU, HDD, GPU, SSD) add some fixed amount of carbon to the total emissions
4) scale that total embodied carbon by the portion of the device lifespan that can be attributed to your application (100% for a dedicated server)

The actual plugin will be very simple - it will have default values for the amount of carbon to add for each component type - these can be overridden using optional config. The required input parameters will be the units of each server component that are in the real server being measured.

The baseline server:

1 CPU
16GB memory
0 HDD
0 GPU
0 SSD

has an embodied carbon value of 1000 kgCO2eq (i.e. 1000000gCO2eq)

If we run instance-metadata on a real server and find that it has

3 CPU
32GB memory
1 HDD
1 GPU
1 SSD

then we will do the following:

For the CPU:

actual - baseline = difference in units
difference in units * emissions per additional unit

difference = 3 - 1 = 2
difference * 100000 = 200000

so this server has an additional 200000 g on top of the baseline due to the CPU only.

For the Memory:

actual - baseline = difference in units
difference in units * emissions per additional unit

In this case, the given constant is 150000gCO2eq per 128GB RAM, but we are only adding 16 on top of the baseline, so we need to do some scaling too.

Actual - baseline = 32 - 16 = 16GB of additional memory
Additional emissions = (150000/128) * 16 = 18750 gCO2eq

For the HDD:

Add 1 x the HDD constant

= 1* 50000 g CO2eq

For the SSD:

Add 1 x the SSD constant

= 1* 100000 g CO2eq

For the GPU:

Add 1 x the GPU constant

= 1* 150000 g CO2eq

Total embodied:

Then add the sum of all these additional CO2 emissions to the baseline, so the final calculation is

1000000 + 200000 + 18750 + 50000 + 100000 + 150000 = 1518750 g CO2eq
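Summing the baseline and the five per-component additions listed above can be sketched in a few lines of Python. This is only an illustration using the constants quoted in this walk-through; the names `BASELINE_SPEC`, `PER_UNIT_G` and `total_embodied_g` are hypothetical and not part of the plugin.

```python
# Illustrative sketch only: names are hypothetical, constants are the ones
# quoted in the walk-through above (all values in gCO2eq).
BASELINE_G = 1_000_000  # embodied carbon of the baseline server

# Units of each component in the baseline server.
BASELINE_SPEC = {"cpu": 1, "memory_gb": 16, "hdd": 0, "ssd": 0, "gpu": 0}

# Emissions added per additional unit of each component.
PER_UNIT_G = {
    "cpu": 100_000,
    "memory_gb": 150_000 / 128,  # 150000 g per 128 GB, scaled to 1 GB
    "hdd": 50_000,
    "ssd": 100_000,
    "gpu": 150_000,
}


def total_embodied_g(actual: dict) -> float:
    """Baseline plus the per-unit emissions for every unit above the baseline."""
    extra = sum(
        (actual[name] - BASELINE_SPEC[name]) * PER_UNIT_G[name]
        for name in BASELINE_SPEC
    )
    return BASELINE_G + extra


example = {"cpu": 3, "memory_gb": 32, "hdd": 1, "ssd": 1, "gpu": 1}
print(total_embodied_g(example))  # 1518750.0
```

Passing the baseline spec itself returns exactly the baseline value, since every difference is zero.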

Worked example

I want to know the embodied carbon for an Azure instance. I first configure a csv-lookup plugin to retrieve:

vCPUs
memory
SSDs
HDDs
GPUs

from a sheet like this one: https://docs.google.com/spreadsheets/d/1k-6JtneEu4E9pXQ9QMCXAfyntNJl8MnV2YzO4aKHh-0/edit?gid=0#gid=0

Let's say my search parameter was A4m v2, and I got the following data:

vCPUs: 4
Memory: 32
SSDs: 0
HDDS: 0
GPUs: 0

I then pass those values to the new plugin to return an embodied carbon value for the server.

It is

1000000 + 300000 + 18750 = 1318750 gCO2eq

Then I have the observations for my running application. These should be used to scale the total embodied carbon for the server to that portion I am responsible for according to the ratio of time my app is running to the total lifespan of the server.

The allocated vCPUs is the value we just grabbed from the csv, the total vCPUs is the maximum available for that family of instance, the lifespan is assumed to be 4 years, and the time allocated is the duration of each timestamp. This gives an embodied carbon value per timestep that can be aggregated across time and components in your manifest.

Let's say the max vCPUs for the A2 instance family is 8, the lifespan (4y in s) is 126144000, the duration of a timestamp is 3600.

we take our 1318750 gCO2eq and then:

embodied * (vcpus/total-vcpus) * (time / lifespan)

=

1318750 *  (4/8) * (3600/126144000)

=

1318750 * 0.5 * 2.8538812785388127e-05

= 18.8 g CO2eq in 3600 s timestep
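The scaling step above can be checked with a short Python sketch. It assumes only the formula and numbers given in this worked example; the function name `allocate_embodied_g` and the constant `FOUR_YEARS_S` are hypothetical.

```python
# Illustrative sketch only: the function and constant names are hypothetical.
FOUR_YEARS_S = 4 * 365 * 24 * 60 * 60  # 126144000 seconds


def allocate_embodied_g(embodied_g, vcpus, total_vcpus, time_s, lifespan_s):
    """Scale a server's total embodied carbon to one timestep of one workload."""
    return embodied_g * (vcpus / total_vcpus) * (time_s / lifespan_s)


share = allocate_embodied_g(
    1_318_750, vcpus=4, total_vcpus=8, time_s=3600, lifespan_s=FOUR_YEARS_S
)
print(round(share, 1))  # 18.8
```

A dedicated server used for its whole lifespan (`vcpus == total_vcpus` and `time_s == lifespan_s`) gets the full embodied value back, which matches the "100% for a dedicated server" case.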

Plugin boundary

Let's do all the calculations in a single plugin. The plugin does NOT have to do any data gathering - we can assume the necessary values are going to be passed into the plugin as arguments (i.e. the csv lookup is out of scope) but the plugin should take the instance metadata and return embodied carbon scaled to individual timesteps.

Inputs

The plugin will accept a relatively large set of input values, but many will be optional. In fact, many will be only occasionally used, because they are there to give the option to override default values, which few people will do, at least in the medium term.

The supported values are:

**these should come from input data**:

vCPUs: optional, default is 1
memory: optional, default is 16 GB 
ssd: optional, default is 0
hdd: optional, default is 0
gpu: optional, default is 0
total-vcpus: optional, default is 8

**these should be global config**:

baseline-vcpus: 1
baseline-memory: 16
lifespan: optional, default is 126144000 seconds
time: optional, default is to use `duration`
baseline-emissions: optional, default is 1000000 gCO2eq
vcpu-emissions-constant: optional, default is 100000 gCO2eq per vcpu
memory-emissions-constant: optional, default is 1172 gCO2eq/GB ( converted from 150000gCO2eq per 128 GB)
ssd-emissions-constant: optional, default is 50000 gCO2 eq per ssd
hdd-emissions-constant: optional, default is 100000 gCO2eq per hdd
gpu-emissions-constant: optional, default is 150000 gCO2eq per gpu

Calculation

Then the following calculation should take place for each timestamp in the input array:

embodied-carbon-per-timestep = 

(
baseline-emissions + 
((vcpus - baseline-vcpus) * vcpu-emissions-constant) +
((memory - baseline-memory) * memory-emissions-constant) +
ssd * ssd-emissions-constant +
hdd * hdd-emissions-constant +
gpu * gpu-emissions-constant
)
* (vcpus/total-vcpus) * (time / lifespan)
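As a rough illustration of the calculation above, here is a minimal Python sketch that applies it to each entry of an input array, falling back to the listed defaults when a value is missing. It is not the plugin's actual code: the names `INPUT_DEFAULTS`, `CONFIG_DEFAULTS` and `embodied_per_timestep` are hypothetical, and the default values are copied from the Inputs list in this issue.

```python
# Illustrative sketch only, not the plugin's implementation. Names are
# hypothetical; defaults are the ones listed in this issue.
INPUT_DEFAULTS = {
    "vCPUs": 1, "memory": 16, "ssd": 0, "hdd": 0, "gpu": 0, "total-vcpus": 8,
}
CONFIG_DEFAULTS = {
    "baseline-vcpus": 1,
    "baseline-memory": 16,
    "lifespan": 126144000,            # seconds (4 years)
    "baseline-emissions": 1000000,    # gCO2eq
    "vcpu-emissions-constant": 100000,
    "memory-emissions-constant": 1172,
    "ssd-emissions-constant": 50000,
    "hdd-emissions-constant": 100000,
    "gpu-emissions-constant": 150000,
}


def embodied_per_timestep(observation, config=None):
    """Embodied carbon (gCO2eq) attributed to one observation."""
    cfg = {**CONFIG_DEFAULTS, **(config or {})}  # overrides win over defaults
    obs = {**INPUT_DEFAULTS, **observation}
    # `time` falls back to the observation's own `duration`.
    time = cfg.get("time", obs["duration"])
    server_total = (
        cfg["baseline-emissions"]
        + (obs["vCPUs"] - cfg["baseline-vcpus"]) * cfg["vcpu-emissions-constant"]
        + (obs["memory"] - cfg["baseline-memory"]) * cfg["memory-emissions-constant"]
        + obs["ssd"] * cfg["ssd-emissions-constant"]
        + obs["hdd"] * cfg["hdd-emissions-constant"]
        + obs["gpu"] * cfg["gpu-emissions-constant"]
    )
    return server_total * (obs["vCPUs"] / obs["total-vcpus"]) * (time / cfg["lifespan"])


inputs = [
    {"timestamp": "2023-08-06T00:00", "duration": 3600},
    {"timestamp": "2023-08-06T10:00", "duration": 3600},
]
outputs = [{**row, "embodied-carbon": embodied_per_timestep(row)} for row in inputs]
for row in outputs:
    print(row["timestamp"], round(row["embodied-carbon"], 6))  # 3.567352 for both rows
```

With no overrides, every component difference is zero, so the result reduces to `baseline-emissions * (1/8) * (3600/126144000)`.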

Prerequisites/resources

n/a

SoW (scope of work)

Acceptance criteria

Scenario 1: No input config

Given (Setup): The plugin exists and works as described above

When (Action): I execute the following manifest (note no input values - falling back to defaults for everything!)

name: embodied-carbon demo
description:
tags:
initialize:
  plugins:
    embodied-carbon:
      method: SciEmbodied
      path: builtin
      global-config:
        output-parameter: "embodied-carbon"
tree:
  children:
    child:
      pipeline:
        compute:
          - embodied-carbon
      inputs:
        - timestamp: 2023-08-06T00:00
          duration: 3600
        - timestamp: 2023-08-06T10:00
          duration: 3600

Then (Assertion): I get the following output:

name: embodied-carbon demo
description:
tags:
initialize:
  #
  plugins:
    embodied-carbon:
      method: SciEmbodied
      path: builtin
      global-config:
        output-parameter: "embodied-carbon"
tree:
  children:
    child:
      pipeline:
        compute:
          - embodied-carbon
      inputs:
        - timestamp: 2023-08-06T00:00
          duration: 3600
        - timestamp: 2023-08-06T10:00
          duration: 3600
      outputs:
        - timestamp: 2023-08-06T00:00
          duration: 3600
          embodied-carbon:  3.567351598173516
        - timestamp: 2023-08-06T10:00
          duration: 3600
          embodied-carbon: 3.567351598173516

Scenario 2: overriding all config

Given (Setup): The plugin exists and works as described above

When (Action): I execute the following manifest

name: embodied-carbon demo
description:
tags:
initialize:
  plugins:
    embodied-carbon:
      method: SciEmbodied
      path: builtin
      global-config:
        baseline-vcpus: 1
        baseline-memory: 16
        lifespan: 157680000 
        baseline-emissions: 2000000
        vcpu-emissions-constant: 150000
        memory-emissions-constant: 1172
        ssd-emissions-constant: 150000
        hdd-emissions-constant: 150000 
        gpu-emissions-constant: 150000
        output-parameter: "embodied-carbon"
tree:
  children:
    child:
      pipeline:
        compute:
          - embodied-carbon
      defaults:
        vCPUs: 4
        memory: 32
        ssd: 1
        hdd: 1
        gpu: 1
        total-vcpus: 16
      inputs:
        - timestamp: 2023-08-06T00:00
          duration: 3600
        - timestamp: 2023-08-06T10:00
          duration: 3600

Then (Assertion): I get the following output:

name: embodied-carbon demo
description:
tags:
initialize:
  plugins:
    embodied-carbon:
      method: SciEmbodied
      path: builtin
      global-config:
        baseline-vcpus: 1
        baseline-memory: 16
        lifespan: 157680000 
        baseline-emissions: 2000000
        vcpu-emissions-constant: 150000
        memory-emissions-constant: 1172
        ssd-emissions-constant: 150000
        hdd-emissions-constant: 150000 
        gpu-emissions-constant: 150000
        output-parameter: "embodied-carbon"
tree:
  children:
    child:
      pipeline:
        compute:
          - embodied-carbon
      defaults:
        vCPUs: 4
        memory: 32
        ssd: 1
        hdd: 1
        gpu: 1
        total-vcpus: 16
      inputs:
        - timestamp: 2023-08-06T00:00
          duration: 3600
        - timestamp: 2023-08-06T10:00
          duration: 3600
      outputs:
        - timestamp: 2023-08-06T00:00
          duration: 3600
          embodied-carbon: 16.65
        - timestamp: 2023-08-06T10:00
          duration: 3600
          embodied-carbon: 16.65 
zanete commented 3 weeks ago

Expecting a PR after #977 and the creation of a new IF core release.

zanete commented 2 weeks ago

it's good to review cc @jmcook1186 @manushak