Green-Software-Foundation / hack

Carbon Hack 24 - The annual hackathon from the Green Software Foundation
https://grnsft.org/hack/github

Create a model plugin to calculate the carbon emissions of Neural Networks training and inference #118

Open teto1992 opened 8 months ago

teto1992 commented 8 months ago

Prize category

Best Plugin

Overview

The idea is to create a model that can be used to calculate the carbon emissions produced during the training and inference phases of neural networks, also accounting for embodied carbon emissions.

Questions to be answered

No response

Have you got a project team yet?

Yes, and we are still open to extra members

Project team

@SpanishInquisition49 @WoralQuaz

Terms of Participation

rachaelcodes commented 8 months ago

Hi, this looks like an interesting project. What sort of skills are you looking for from potential collaborators?

samuBe commented 8 months ago

Would be interested in this. About me:

teto1992 commented 7 months ago

Project Submission Summary

This project aimed to implement a set of plugins to calculate the carbon emissions produced during neural network (NN) training and inference phases, accounting for embodied carbon and for the carbon intensity of the different energy mixes powering the Cloud facilities involved. Overall, we implemented 4 plugins inspired by the state of the art, to estimate:

Problem

The training of neural networks (NNs) is distributed across thousands of servers with GPUs/TPUs, requires processing large amounts of data, and can take more than a month to complete. NNs are then deployed to Cloud datacenters, where they continuously perform inference tasks, i.e. reply to user queries. Datacenters involved in the NN training and querying phases consume a considerable amount of electricity and can cause high greenhouse gas emissions. Those emissions depend heavily on the carbon intensities of the energy mixes that power those distributed facilities. Last, hardware for training and querying (which is regularly updated by Cloud providers) comes with embodied emissions that further increase those figures.

The problem we consider is how to holistically estimate the environmental footprint of NNs, also considering embodied carbon emissions and the possibility of different datacenters being powered by different energy mixes. In particular, our plugins answer the following questions:

Application

nn-emb

This plugin estimates embodied carbon emissions for either the training or the querying phases as follows.

$C_{emb} = \sum_{h \in H_t} \frac{\alpha_h}{\tau_h} \, E_h \, \mathrm{Count}(h)$

where:
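As a minimal sketch, the sum above can be evaluated literally in a few lines of Python. This is not the plugin's actual code: the class `HardwareEntry` and its field names are hypothetical and simply mirror the symbols in the formula, whose meaning and units are left to the caller.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class HardwareEntry:
    """One hardware type h in H_t. Names are illustrative assumptions."""
    alpha_h: float  # alpha_h, as it appears in the formula
    tau_h: float    # tau_h, must be non-zero
    e_h: float      # E_h, as it appears in the formula
    count: int      # Count(h), number of units of this hardware type


def embodied_carbon(hardware: Iterable[HardwareEntry]) -> float:
    """Evaluate the sum over h of (alpha_h / tau_h) * E_h * Count(h)."""
    total = 0.0
    for h in hardware:
        if h.tau_h == 0:
            raise ValueError("tau_h must be non-zero")
        total += (h.alpha_h / h.tau_h) * h.e_h * h.count
    return total
```

For instance, two hardware types with $(\alpha_h, \tau_h, E_h, Count(h))$ equal to $(1, 4, 1000, 8)$ and $(2, 4, 100, 2)$ contribute $2000$ and $100$ respectively, for a total of $2100$.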

nn-et

This plugin estimates the energy consumption of the training phase for one datacentre as follows.

$E_t = |H_t| \cdot P_t \cdot \Delta t \cdot PUE$

Where:
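The product above can be sketched as a single Python function. This is an illustration, not the plugin's implementation: the function name, the parameter names, and the choice of units (power in kW per hardware unit, time in hours, energy in kWh) are assumptions made for the example.

```python
def training_energy_kwh(num_units: int, power_kw: float,
                        hours: float, pue: float) -> float:
    """Evaluate |H_t| * P_t * Delta t * PUE for one datacentre.

    Assumed units: power_kw is the average power per hardware unit in kW,
    hours is the training duration in hours, so the result is in kWh.
    """
    if num_units < 0 or power_kw < 0 or hours < 0:
        raise ValueError("inputs must be non-negative")
    if pue < 1.0:
        # PUE is total facility energy over IT energy, so it cannot be below 1.
        raise ValueError("PUE must be at least 1.0")
    return num_units * power_kw * hours * pue
```

As a worked example under these assumptions, 8 units drawing 0.3 kW each for 24 hours in a datacentre with a PUE of 1.5 give $8 \cdot 0.3 \cdot 24 \cdot 1.5 = 86.4$ kWh.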

nn-eq

This plugin estimates the energy consumption of the querying phase for one datacentre analogously to nn-et, with $\Delta T$ denoting the deployment time and without considering GPUs/TPUs.

nn-c

Let DC be the set of datacentres used with their associated average carbon intensities and energy transport factors. This plugin relies on nn-et and nn-eq to estimate the carbon emissions of the training and querying phases as follows:

$C = \sum_{d \in DC} E(d) \cdot \frac{\alpha_d}{\tau_d}$

Note that $E(d)$ can be either related to querying or to training. The project then exploits if/plugins/sum to combine nn-emb and nn-c to estimate the overall NN carbon emissions.
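The per-datacentre sum and the final combination can be sketched in Python as follows. This is a hedged illustration rather than the plugins' code: the `Datacentre` class, its field names, and the two function names are hypothetical. Reading $\alpha_d$ as the average carbon intensity and $\tau_d$ as the energy transport factor is an assumption based on the description of DC above, and `total_carbon` only mimics what a sum step does by adding the two estimates.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Datacentre:
    """One datacentre d in DC. Names are illustrative assumptions."""
    energy: float   # E(d), training or querying energy
    alpha_d: float  # alpha_d, assumed to be the average carbon intensity
    tau_d: float    # tau_d, assumed to be the transport factor; non-zero


def operational_carbon(datacentres: Iterable[Datacentre]) -> float:
    """Evaluate the sum over d of E(d) * alpha_d / tau_d."""
    total = 0.0
    for d in datacentres:
        if d.tau_d == 0:
            raise ValueError("tau_d must be non-zero")
        total += d.energy * d.alpha_d / d.tau_d
    return total


def total_carbon(embodied: float, operational: float) -> float:
    """Add the embodied and operational estimates, as a sum step would."""
    return embodied + operational
```

For example, two datacentres with $(E(d), \alpha_d, \tau_d)$ equal to $(100, 0.4, 0.8)$ and $(200, 0.1, 1.0)$ contribute $50$ and $20$, so the operational estimate is $70$; adding an embodied estimate of $30$ gives $100$ overall.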

Prize category

Our project may positively impact the sustainability movement by increasing awareness among all stakeholders (i.e. DevOps and final users) of the energy consumption and greenhouse gas emissions related to NN training and usage. DevOps maintaining NN-based systems (e.g. LLMs) can rely on our output to optimise their systems by performing what-if analyses and so make informed decisions on, e.g., which cloud service to use based on its hardware and/or energy quality, how to tune their model and hyper-parameters, and which algorithms to use. Adopting this model allows, in turn, final users to make informed decisions on which NN-based system to use for their work, based on a clear sustainability assessment.

For the above to happen, our model calls on hardware manufacturers, Cloud providers, energy providers, and NN-based system providers to release data about the footprint of their assets. Our model works across different Cloud providers, powered by different energy mixes and featuring heterogeneous hardware capabilities, which enables it to cover a wide range of software ecosystems and environments. The set of plugins has been implemented according to the UNIX philosophy and the IF micro-model architecture: it composes a pipeline in which each plugin estimates one quantity in the NN lifecycle, also relying on the default sum plugin.

Video

Video presentation

Artefacts

Code repository

Usage

Link to usage instructions if applicable.

Process

We have developed the process in an Agile manner by first analysing the existing literature in the field of estimating the environmental footprint of modern NN-based systems. Then, we came up with a holistic model to assess energy consumption and carbon footprint during both the training phase and the querying phase. In parallel, we got acquainted with the IF framework and how to extend it by reading its documentation. Last, we had a couple of sprints to implement the first version of our plugins.

Inspiration

Our solution was inspired by different estimate models available in the literature. For the calculations of energy and carbon emissions of training and querying:

For the computation of embodied carbon emissions and the general framework:

Challenges

From a development perspective, it was initially challenging to fully understand the framework documentation, especially how the validation system works for a composite set of plugins. Once we got acquainted with the philosophy underlying IF, it was straightforward to implement our plugins and to compose them into a pipeline. It was also challenging (and interesting) to navigate the existing literature in the considered field of application.

Accomplishments

We are particularly proud of having implemented a framework, based on existing models taken from the scientific literature, for holistically estimating the overall carbon footprint and energy consumption of NN-based systems. Indeed, thanks to its grounding in the state of the art, our set of plugins considers the whole lifecycle of NN-based systems, from training through deployment on Cloud servers and usage, including embodied carbon. Last, but not least, the possibility of considering a variety of energy mixes makes our model flexible enough to cover multi-cloud deployments across different areas worldwide.

Learnings

First, we have learned how IF works, and how we can use it to make estimates and measurements related to modern software deployment. More precisely, we have learned what a plugin is, how to implement one, and how to compose plugins into a pipeline. We also learned how to estimate the energy consumption of a neural network, and how to turn such estimation formulas into a usable IF plugin.

What's next?

Our solution enables estimating the energy consumption and carbon footprint of NN-based systems, also considering embodied carbon. As neural networks are becoming an important part of modern software systems (e.g. with LLMs), it is crucial for IF to accommodate plugins capable of performing measurements and estimates in this field. This could raise the interest of new industrial and academic partners in the framework, and create new opportunities to enhance it with further plugins and/or to apply the framework itself to raise awareness among software DevOps and users.