Closed weiji14 closed 5 months ago
Noting here that I was digging around in our AWS Console and "There is a delay of three months between the end of a month and when emissions data is available". More info
We did some of that using the AWS carbon estimator tool, and documented it in the training card. This is quite superficial, and if we can, let's dig deeper. But for now I'll close the issue; let's open other issues with more specific tasks if appropriate.
https://clay-foundation.github.io/model/release-notes/specification.html#training-card
Training foundation models can consume a lot of energy and emit significant amounts of carbon, and we should be transparent about this, since the Clay Foundation Model has an environmental focus too.
Originally posted by @brunosan in https://github.com/Clay-foundation/model/issues/64#issuecomment-1837442185
Implementation
Tracking tools
There are tools for tracking energy usage and carbon emissions, such as:
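Whatever tracker we pick, the core quantity it reports is the same: energy consumed multiplied by the grid's carbon intensity. A minimal sketch of that calculation (the 400 W power draw and 24 h runtime below are hypothetical; 378 gCO₂eq/kWh is the us-east-1 figure mentioned further down):

```python
def estimate_emissions_kg(avg_power_watts: float, hours: float,
                          grid_intensity_g_per_kwh: float) -> float:
    """Estimate emissions (kgCO2eq) as energy (kWh) x grid carbon intensity."""
    energy_kwh = avg_power_watts / 1000 * hours
    return energy_kwh * grid_intensity_g_per_kwh / 1000  # grams -> kilograms

# Hypothetical example: one 400 W GPU running for 24 h on a
# 378 gCO2eq/kWh grid -> ~3.63 kgCO2eq.
print(round(estimate_emissions_kg(400, 24, 378), 2))
```

Real trackers additionally sample actual hardware power draw over time rather than assuming a constant average, which is why their numbers are more trustworthy than a back-of-the-envelope figure like this.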
Selecting environmentally friendly cloud regions
Currently (as of Dec 2023), we are running our compute on AWS's us-east-1 (North Virginia) region, which has a rather poor carbon intensity of 378 gCO₂eq/kWh (averaged over the past year, see https://app.electricitymaps.com/zone/US-MIDA-PJM). We could consider other cloud regions that have a lower carbon intensity. While some of these cloud providers use carbon offsets, we can also make an active decision to run compute on regions with low carbon intensity.
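Choosing a region then reduces to comparing yearly-averaged intensity figures. A sketch of that comparison, where only the us-east-1 value (378 gCO₂eq/kWh) comes from the text above; the other region names and numbers are placeholders to be filled in from a source like electricitymaps.com:

```python
# Yearly-averaged grid carbon intensity in gCO2eq/kWh.
# Only the us-east-1 figure is from the text; the rest are placeholders.
region_intensity = {
    "us-east-1": 378,   # North Virginia (figure quoted above)
    "region-a": 50,     # placeholder low-carbon candidate
    "region-b": 200,    # placeholder mid-carbon candidate
}

# Pick the candidate region with the lowest average carbon intensity.
greenest = min(region_intensity, key=region_intensity.get)
print(greenest)
```

In practice the choice also has to weigh data-transfer costs and latency to where the training data lives, so intensity alone would not be the only input.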
Downstream use-cases
Training the initial Foundation Model is only the first part. We can also work to ensure that finetuning the model (for downstream tasks) can be made more energy efficient.
There are several ways to do this, but I'll just point out Parameter-Efficient Fine-Tuning (PEFT) methods, and also the work done by MIT HAN Lab on Efficient AI computing.
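To see why PEFT helps, consider a LoRA-style adapter: instead of updating a full d × k weight matrix, it trains two low-rank factors of rank r, cutting the trainable parameter count (and hence optimizer memory and energy per step). A quick sketch with illustrative layer sizes:

```python
# Illustrative sizes: a single d x k linear layer with a rank-r LoRA adapter.
d, k, r = 4096, 4096, 8  # hypothetical hidden dims and LoRA rank

full_finetune_params = d * k     # full fine-tuning updates the whole matrix
lora_params = r * (d + k)        # LoRA trains factors A (d x r) and B (r x k)

print(full_finetune_params, lora_params)
print(f"trainable params reduced by {full_finetune_params / lora_params:.0f}x")
```

The reduction compounds across every adapted layer, which is what makes fine-tuning runs far cheaper (in compute and emissions) than retraining the foundation model itself.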
Further reading