Open kordou opened 1 month ago
Sorry about that. Could you try with tracking_mode='process'?
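For reference, a minimal sketch of what that could look like. The helper name `track_process_emissions` and the workload callable are placeholders invented for illustration; only `EmissionsTracker`, its `tracking_mode` argument, and `start()`/`stop()` come from CodeCarbon.

```python
def track_process_emissions(workload):
    """Run `workload()` under a CodeCarbon tracker in process mode.

    Hypothetical helper: returns whatever tracker.stop() reports
    (estimated emissions in kg CO2eq, or None if tracking failed).
    """
    # Imported lazily so this file can be loaded without CodeCarbon installed.
    from codecarbon import EmissionsTracker

    # 'process' asks CodeCarbon to account for this process only,
    # instead of the whole (shared) machine.
    tracker = EmissionsTracker(tracking_mode="process")
    tracker.start()
    try:
        workload()
    finally:
        emissions = tracker.stop()
    return emissions
```

Usage would be something like `track_process_emissions(lambda: train_model())`, where `train_model` stands in for your own code.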
We support Slurm for shared machines, but if you know how to do the same for RunPod, we will be happy to add it: https://github.com/mlco2/codecarbon/blob/489ba66b9358971d53b7085e864d3f5cd1251193/codecarbon/external/hardware.py#L273
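One possible starting point, offered as a sketch only: if RunPod pods are ordinary containers whose memory cap is enforced through Linux cgroups (an assumption, not verified here), the allowance could be read from the cgroup filesystem. The function name `read_cgroup_memory_limit_gb` is hypothetical and not part of CodeCarbon.

```python
from pathlib import Path
from typing import Optional


def read_cgroup_memory_limit_gb(cgroup_root: str = "/sys/fs/cgroup") -> Optional[float]:
    """Return the container memory limit in GiB, or None if no limit is found.

    Assumes the limit is exposed through cgroup v2 (memory.max)
    or cgroup v1 (memory/memory.limit_in_bytes).
    """
    root = Path(cgroup_root)
    candidates = [
        root / "memory.max",                        # cgroup v2
        root / "memory" / "memory.limit_in_bytes",  # cgroup v1
    ]
    for path in candidates:
        try:
            raw = path.read_text().strip()
        except OSError:
            continue  # file missing or unreadable: try the next layout
        if raw == "max":
            return None  # cgroup v2 spelling of "unlimited"
        try:
            limit_bytes = int(raw)
        except ValueError:
            continue
        if limit_bytes >= 2**60:
            return None  # cgroup v1 uses a huge sentinel for "unlimited"
        return limit_bytes / 1024**3
    return None
```

When this returns None, falling back to the machine-wide total would keep today's behavior.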
Description
I am running a set of scripts on a RunPod server that provides, for example, an Nvidia A100 80GB with a RAM allowance of at most 100 GB. In the results, ram_total_size is reported as 1 TB, and the energy and emissions figures show the RAM having a much greater impact than the GPU. That does not look right, since other researchers have found that the GPU is the main consumer. For example, I see GPU power: 102 W and RAM power: 378 W.
I ran the same code on a Google Colab server, this time with an A100 40GB, and got GPU power: 50 W and RAM power: 31 W.
So my question is: can CodeCarbon not be used on shared machines? Or is there something else we need to do to get correct emissions figures?
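A quick sanity check on those numbers, as a rough sketch. It assumes RAM power is estimated as a fixed 3 W per 8 GB (0.375 W/GB); I believe that is the constant CodeCarbon used, but treat it as an assumption. Under that model, the reported wattage can be converted back into the amount of RAM that was counted.

```python
# Assumption: RAM power is modelled as a constant 3 W per 8 GB of memory.
WATTS_PER_GB = 3 / 8


def implied_ram_gb(ram_power_w: float) -> float:
    """RAM size (GB) that would produce the given estimated RAM power."""
    return ram_power_w / WATTS_PER_GB


def estimated_ram_power_w(ram_gb: float) -> float:
    """Estimated RAM power (W) for a given RAM size under the same model."""
    return ram_gb * WATTS_PER_GB


print(implied_ram_gb(378))         # RunPod reading: about 1 TB counted
print(implied_ram_gb(31))          # Colab reading: roughly 83 GB counted
print(estimated_ram_power_w(100))  # what a 100 GB allowance would give
```

If the assumption holds, 378 W corresponds to the host's full 1 TB rather than the 100 GB allowance, which by itself would account for roughly a tenfold overestimate of RAM power.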
It took me two months of work to notice this, unfortunately...
Thank you