mlco2 / codecarbon

Track emissions from Compute and recommend ways to reduce their impact on the environment.
https://mlco2.github.io/codecarbon
MIT License

Add hardware embodied impacts #471

Open samuelrince opened 10 months ago

samuelrince commented 10 months ago

TL;DR: CodeCarbon currently focuses on usage impacts, i.e. those tied to the energy consumption of the host. To assess impacts on a broader scope and across multiple life cycle stages, as defined in Life Cycle Assessment studies, we need to add the embodied impacts of the underlying hardware.

Description

When assessing the impacts of hardware (and thus of the software running on it), it is common practice to rely on Life Cycle Assessment (LCA) methodologies. LCA is a multi-step, multi-criteria methodology for computing environmental impacts.

Multi-criteria means that impacts are measured not only in terms of carbon emissions or energy consumption, but also abiotic resource depletion, water usage, biodiversity loss and so on.

Multi-step means taking the whole life cycle into account: resource extraction, manufacturing, transportation, usage and finally end-of-life. This is what this issue is about. We call embodied impacts the impacts that occur before the usage phase. For instance, manufacturing a GPU emits greenhouse gases, so brand new hardware already carries emissions when you receive it, and these must be taken into account.
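As a rough illustration of how embodied impacts could show up next to usage impacts for a single run, the sketch below attributes a share of a device's manufacturing emissions to a job in proportion to how long the job uses the device. The linear time-based allocation, the function name `embodied_share_kg` and all numbers are assumptions made for illustration; they do not come from CodeCarbon or BoaviztAPI.

```python
def embodied_share_kg(embodied_total_kg: float,
                      lifetime_hours: float,
                      job_hours: float) -> float:
    """Attribute part of a device's embodied emissions to one job.

    Assumes a simple linear allocation over the expected lifetime.
    """
    if lifetime_hours <= 0:
        raise ValueError("lifetime_hours must be positive")
    if job_hours < 0:
        raise ValueError("job_hours must be non-negative")
    return embodied_total_kg * (job_hours / lifetime_hours)


# Placeholder numbers, not measured values.
usage_kg = 1.5                     # emissions from energy use during the job
share_kg = embodied_share_kg(
    embodied_total_kg=200.0,       # hypothetical manufacturing footprint
    lifetime_hours=4 * 365 * 24,   # hypothetical 4-year lifetime
    job_hours=48.0,
)
total_kg = usage_kg + share_kg
print(f"usage={usage_kg:.3f} kg, embodied share={share_kg:.3f} kg, total={total_kg:.3f} kg")
```

Other allocation rules are possible; the point is only that a run's reported footprint would become the sum of a usage term and an embodied term.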

Why does it matter?

Embodied impacts account for hardware impacts and thus can help better answer the following types of questions:

Introducing multi-step environmental impact assessment will make CodeCarbon even more precise regarding the impacts of software and AI.

Implementation

I propose to use the open source methodology and API that we develop at Boavizta (the project is named BoaviztAPI). The API provides proxy functions to assess embodied impacts from hardware specifications. For instance, given the CPU name, the amount of RAM, the GPU model, etc., we can compute embodied impacts.

BoaviztAPI resources:

The hosted API endpoint https://api.boavizta.org/ is currently intended for testing only; it is not meant for production or heavy usage.

Implementation using static data

One proposal is to implement embodied impacts based on BoaviztAPI using static CSV or JSON files containing typical hardware configurations and their associated impacts. This solution cannot automatically benefit from future updates of BoaviztAPI, but it can be considered for a proof of concept (POC) of the feature, or as a backup solution.
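A minimal sketch of what the static-data approach could look like, assuming a JSON file keyed by hardware model name. The file layout, the field name `gwp_embodied_kgco2eq`, the model names and the values are all hypothetical placeholders, not real BoaviztAPI output.

```python
import json

# Hypothetical static export; keys and values are placeholders,
# not real BoaviztAPI output.
STATIC_IMPACTS_JSON = """
{
  "example-cpu-a": {"gwp_embodied_kgco2eq": 10.0},
  "example-gpu-b": {"gwp_embodied_kgco2eq": 50.0}
}
"""


def load_static_impacts(raw: str) -> dict:
    """Parse the static table, normalising model names to lower case."""
    table = json.loads(raw)
    return {name.lower(): entry for name, entry in table.items()}


def lookup_embodied_kg(table: dict, model_name: str):
    """Return the embodied impact for a model, or None if it is not listed."""
    entry = table.get(model_name.strip().lower())
    return None if entry is None else entry["gwp_embodied_kgco2eq"]


table = load_static_impacts(STATIC_IMPACTS_JSON)
print(lookup_embodied_kg(table, "Example-GPU-B"))   # listed model
print(lookup_embodied_kg(table, "unknown-device"))  # not listed, returns None
```

Returning `None` for unknown hardware keeps the gap visible instead of silently reporting zero embodied impact.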

Implementation using a dedicated API endpoint

The official endpoint is not designed to handle heavy traffic or production use cases (no SLA or uptime guarantee). We can mitigate that by either A) hosting an API endpoint dedicated to CodeCarbon users or B) asking end users to host it on their own infrastructure to use the feature.

While A) can be discussed, I find B) less relevant, as the feature will likely be used by very few people if done that way.

In addition, A) (and probably B, depending on the use case) does not allow running CodeCarbon without an Internet connection, which can be an issue in some cases.
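One way the offline concern could be softened is to treat the endpoint as a best-effort source and fall back to bundled static data when the network is unavailable. The sketch below assumes a caller-supplied `remote_lookup` function rather than any real BoaviztAPI route; the function names and the placeholder value are hypothetical.

```python
from typing import Callable, Optional


def embodied_with_fallback(
    model_name: str,
    remote_lookup: Callable[[str], float],
    static_table: dict,
) -> Optional[float]:
    """Try a remote lookup first, then fall back to bundled static data."""
    try:
        return remote_lookup(model_name)
    except OSError:
        # OSError covers connection failures and timeouts
        # (TimeoutError and urllib.error.URLError are subclasses).
        return static_table.get(model_name)


def offline_lookup(model_name: str) -> float:
    """Stand-in for a network call made without connectivity."""
    raise OSError("network unreachable")


static_table = {"example-gpu-b": 50.0}  # placeholder value
print(embodied_with_fallback("example-gpu-b", offline_lookup, static_table))
```

Passing the lookup in as a function keeps the sketch independent of the HTTP client and of where the endpoint is hosted.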

Implementation using the Python SDK

Finally, we can consider using the official BoaviztAPI Python SDK in offline mode, which wraps the API routes in Python functions that work just like the official online version. This is possible because BoaviztAPI has been designed from the start as a stateless API.

However, this solution needs testing on our side (Boavizta) to make sure it is reliable and can be used in all environments where CodeCarbon is already in use. In the meantime, the feature can be shipped as an extra requirement so that it does not block the installation of CodeCarbon on incompatible systems or environments. Once we are confident enough, we can push the feature to all users.
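Treating the SDK as an extra requirement usually means guarding the feature behind an availability check, so that CodeCarbon still works when the package is absent. A minimal sketch of that guard follows; the module name `hypothetical_embodied_sdk` is a placeholder and not the real import name of the BoaviztAPI SDK.

```python
import importlib.util

# Hypothetical module name; replace with the real import name of the SDK.
OPTIONAL_SDK_MODULE = "hypothetical_embodied_sdk"


def sdk_available(module_name: str = OPTIONAL_SDK_MODULE) -> bool:
    """Return True if the optional top-level module can be found."""
    return importlib.util.find_spec(module_name) is not None


def embodied_feature_status(module_name: str = OPTIONAL_SDK_MODULE) -> str:
    """Report whether embodied impacts can be computed in this environment."""
    if sdk_available(module_name):
        return "enabled"
    return "disabled: optional dependency not installed"


print(embodied_feature_status())        # placeholder module is not installed
print(embodied_feature_status("json"))  # stdlib module, always present
```

With a check like this, usage-only tracking keeps working everywhere and embodied impacts are simply skipped where the extra is missing.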

Caveats

For this implementation to make sense in the context of CodeCarbon, we still need to finish up the modeling and implementation of GPUs inside BoaviztAPI, which is still a work in progress. (see related issue #65)

This issue is primarily meant to open the discussion first, and then probably to support the first implementation of the feature. Happy to read your comments and suggestions! 🤗

benoit-cty commented 10 months ago

Thanks! A first implementation could be to add this to the metadata area of the dashboard, since it already has the CPU and GPU model: [image]