FNNDSC / miniChRIS-docker

Easiest way to run ChRIS locally.
MIT License

Environmental impact of ChRIS and ChRIS plugins #13

Open jennydaman opened 1 year ago

jennydaman commented 1 year ago

This issue is not really related to miniChRIS, but is put here for consolidation with other issues with the "observability" label.

It would be cool if we could somehow measure the environmental impact of ChRIS and ChRIS plugins. I got this idea while talking to members of OHBM SEA-SIG at OHBM BrainHack 2022.

Big data and cloud computing have a negative environmental impact. The easiest aspect of this to quantify would be the estimated "carbon emissions" caused by the energy used by computers.

https://codecarbon.io/ is a tool for estimating the carbon emissions caused by Python programs. It might be useful.

Where in the architecture can we implement this? Ideally, we would do it at the level of the container runtime (runc, crun, apptainer) and then have this data be sent to CUBE. ChRIS_ui could display the information and guilt-trip the user, like how Google Flights does:

Screenshot of Google Flights

A much easier but unscalable proof-of-concept would be to create a ChRIS plugin that uses codecarbon.

FaithKovi commented 1 year ago

Hello @jennydaman, I am an Outreachy applicant. I want to work on this issue.

jennydaman commented 1 year ago

@FaithKovi I'm excited to hear your ideas, please be in touch!

FaithKovi commented 1 year ago

Sure @jennydaman

FaithKovi commented 1 year ago

Tracking Carbon Emissions Using CodeCarbon

CodeCarbon is a tool that measures your carbon footprint. It comes as a lightweight pip package that can integrate with ChRIS because of its Python codebase.

Factors that affect the amount of carbon emitted:

  1. Grid energy mix: The combination of energy sources that generates electricity for the grid the hardware is connected to. This causes average emissions to vary by region. To limit environmental impact, choose cloud server regions with low emissions.
  2. Compute time: Computationally intensive programs run longer and therefore use more energy.
  3. Choice of hardware: Newer generations of computing hardware designed for parallel computation, such as GPUs and tensor processing units (TPUs), improve efficiency.

How does CodeCarbon work?

CodeCarbon records the amount of power used by the underlying infrastructure, whether from cloud providers or on-premise data centers. It then uses the carbon intensity of the grid energy mix that the hardware is connected to in order to estimate the amount of CO2 emissions produced. The tool has a dashboard that compares current emissions with what they would be if the cloud infrastructure were hosted in different regions.
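As a rough sketch of the estimate described above (energy consumed multiplied by the carbon intensity of the grid), the arithmetic looks something like the following. The region names and intensity values are made-up placeholders for illustration, not real measurements, and `estimate_emissions_kg` and `compare_regions` are hypothetical helpers rather than part of codecarbon.

```python
def estimate_emissions_kg(energy_kwh: float, intensity_kg_per_kwh: float) -> float:
    """Estimate CO2-equivalent emissions as energy used times grid carbon intensity."""
    if energy_kwh < 0 or intensity_kg_per_kwh < 0:
        raise ValueError("energy and carbon intensity must be non-negative")
    return energy_kwh * intensity_kg_per_kwh


# Hypothetical carbon intensities in kg CO2eq per kWh (placeholders, not real data).
HYPOTHETICAL_REGIONS = {
    "low-carbon-region": 0.05,
    "high-carbon-region": 0.70,
}


def compare_regions(energy_kwh: float) -> dict:
    """Return the estimated emissions of the same workload in each region."""
    return {
        name: estimate_emissions_kg(energy_kwh, intensity)
        for name, intensity in HYPOTHETICAL_REGIONS.items()
    }


# The same 2 kWh workload, estimated for each hypothetical region.
for region, kg in compare_regions(energy_kwh=2.0).items():
    print(f"{region}: {kg:.2f} kg CO2eq")
```

The point of the comparison is that the same workload can have very different estimated emissions depending only on where it runs.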

How to reduce ChRIS's Carbon Footprint

  1. Choose efficient model training practices: developers can fine-tune pre-trained models rather than training from scratch.
  2. Choose a cloud provider region that uses low-carbon electricity.
  3. Use random search rather than grid search for better efficiency and shorter net training time of deep learning models.

This will reduce the impact of ChRIS on the climate and introduce transparency.

There are 3 different ways to use codecarbon:

  1. As an object
  2. As a decorator
  3. As a context manager

All 3 can be combined to give a broad overview.
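The three usage styles might look roughly like this. It is a minimal sketch, assuming a recent codecarbon release that provides `EmissionsTracker` and `track_emissions`; `workload` and `demo` are hypothetical names standing in for a real plugin's compute step.

```python
def workload(n: int = 1_000_000) -> int:
    """Hypothetical stand-in for a compute-heavy plugin step."""
    return sum(i * i for i in range(n))


def demo() -> None:
    """Run the same workload under each of the three codecarbon usage styles."""
    # Imported here so this module can be loaded even without codecarbon installed.
    from codecarbon import EmissionsTracker, track_emissions

    # 1. As an object: start and stop the tracker explicitly.
    tracker = EmissionsTracker()
    tracker.start()
    try:
        workload()
    finally:
        emissions_kg = tracker.stop()  # estimated emissions in kg CO2eq
    print(f"explicit object: {emissions_kg} kg CO2eq")

    # 2. As a decorator: tracking wraps every call to the function.
    @track_emissions
    def tracked_workload() -> int:
        return workload()

    tracked_workload()

    # 3. As a context manager: tracking covers the body of the with block.
    with EmissionsTracker():
        workload()
```

Calling `demo()` after `pip install codecarbon` runs the workload three times, once per style.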

Installation of codecarbon - https://github.com/mlco2/codecarbon

OPERATING SYSTEM: I used WSL2 (Ubuntu)

ERROR ENCOUNTERED: When installing, the scripts might be placed in a location that is not in PATH. Look out for this warning and add that location to PATH.

CodeCarbon supports both offline and online modes.

To answer your question on where in the architecture this can be implemented: based on my research on how codecarbon is used, it is typically placed where the GPU-intensive training code sits. Another option is the Comet integration - https://github.com/mlco2/codecarbon#comet-integration

Data sources - https://github.com/mlco2/codecarbon#data-sources-%EF%B8%8F

TRY OUT: I tried creating a codecarbon plugin using the ChRIS plugin template (https://github.com/FNNDSC/python-chrisapp-template). This is my repo, which I am still working on - https://github.com/FaithKovi/chris-codecarbon-plugin

I got stuck on how to check whether the plugin actually works or whether there are steps I missed. I would really appreciate your help here.

Questions

  1. Where can I find sample ChRIS training code to try out the codecarbon pip package?
  2. The output after using codecarbon is stored in a .csv file. Can I get resources on how data can be sent to CUBE?
  3. I am having issues with the GitHub workflows. The build and test jobs keep failing.

References

https://medium.com/bcggamma/ai-computing-emits-co%E2%82%82-we-started-measuring-how-much-807dec8c35e3
https://codecarbon.io/
https://www.madetech.com/blog/carbon-footprint-python-applications/
https://mlco2.github.io/codecarbon

@jennydaman