At the moment, information on the job is passed to cats via the jobinfo CLI argument:
partition=CPU_partition,memory=8,ncpus=8,ngpus=0
Information relating to hardware is assumed to be specified in the config file, e.g.
PUE: 1.20 # > 1
partitions:
  CPU_partition:
    type: CPU # CPU or GPU
    model: "Xeon Gold 6142"
    TDP: 9.4 # in W, per core
After looking closely at carbonFootprint.py, I think the information required to estimate the carbon footprint boils down to the number of devices and their power consumption.
This PR simplifies the configuration file and its processing, with the intent of simplifying the carbonFootprint.py module downstream. The suggested configuration structure is
location: "EH8"
api: "carbonintensity.org.uk"
PUE: 1.20 # > 1
profiles:
  CPU_partition: # Arbitrary name for first profile. First profile is also the default profile
    cpu:
      model: "Xeon Gold 6142"
      power: 9.4 # in W, per core
      nunits: 2
  GPU_queue:
    gpu:
      model: "NVIDIA A100-SXM-80GB GPUs"
      power: 300
      nunits: 2
    cpu:
      model: "AMD EPYC 7763"
      power: 4.4
      nunits: 1
You can then specify the profile to use for the footprint estimation at the command line. The footprint estimation is activated
using the --footprint flag.
$ cats -d 180 --footprint -p GPU_queue --memory 8
$ cats -d 180 --footprint --memory 8 # Use default profile, i.e. first profile in config
$ cats -d 180 --footprint --profile CPU_partition --cpu 8 --memory 8 # Override config to specify 8 CPUs instead of 2
The memory footprint must be specified at the command line.
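The profile selection rules described above (first profile as default, command-line values overriding the unit counts from the config) could look roughly like the sketch below. This is an illustration only, not the code in this PR: `CONFIG`, `select_profile` and the `overrides` argument are hypothetical names, and the config is written as a plain dict rather than loaded from the YAML file.

```python
# Hypothetical in-memory form of the config file shown above.
CONFIG = {
    "PUE": 1.20,
    "profiles": {
        "CPU_partition": {
            "cpu": {"model": "Xeon Gold 6142", "power": 9.4, "nunits": 2},
        },
        "GPU_queue": {
            "gpu": {"model": "NVIDIA A100-SXM-80GB GPUs", "power": 300, "nunits": 2},
            "cpu": {"model": "AMD EPYC 7763", "power": 4.4, "nunits": 1},
        },
    },
}


def select_profile(config, name=None, overrides=None):
    """Return a copy of the chosen profile, with unit counts overridden.

    name      -- profile name, or None to use the first profile in the config
    overrides -- mapping such as {"cpu": 8}, mirroring a flag like --cpu 8
    """
    profiles = config["profiles"]
    if name is None:
        # dicts preserve insertion order, so this is the first profile listed
        name = next(iter(profiles))
    if name not in profiles:
        raise KeyError(f"unknown profile: {name}")

    # Copy each device entry so the loaded config is left untouched
    profile = {device: dict(spec) for device, spec in profiles[name].items()}
    for device, nunits in (overrides or {}).items():
        if device not in profile:
            raise KeyError(f"profile {name} has no device of type {device}")
        profile[device]["nunits"] = nunits
    return profile


default_profile = select_profile(CONFIG)
overridden = select_profile(CONFIG, "CPU_partition", {"cpu": 8})
```

Here `default_profile` corresponds to running without a profile flag, and `overridden` to the last command above, where 8 CPUs replace the 2 given in the config.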
The job info is processed by configure.get_job_info, which returns a list of (nunits, power) tuples with one element per power-consuming device. For example, with 2 CPUs, 4 GPUs and 8 GB of memory, and assuming 0.4 W/GB for memory, the jobinfo is:
[(2, 9.4), (4, 300), (8, 0.4)]
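To show why a flat list of (nunits, power) tuples is enough for the downstream estimate, here is a minimal sketch of how such a list might be built and consumed. The function names are hypothetical and this is not the carbonFootprint.py implementation. It assumes that PUE is applied as a multiplier on the summed device power and that the job duration is given in minutes.

```python
MEMORY_POWER_PER_GB = 0.4  # W/GB, the figure assumed in the example above


def build_jobinfo(profile, memory_gb):
    """Flatten a profile into (nunits, power) tuples, one per device type."""
    jobinfo = [(spec["nunits"], spec["power"]) for spec in profile.values()]
    # Memory is treated as one more device: units are GB, power is W per GB
    jobinfo.append((memory_gb, MEMORY_POWER_PER_GB))
    return jobinfo


def total_power(jobinfo, pue=1.0):
    """Total power draw in W, scaled by PUE (assumed to be a plain multiplier)."""
    return pue * sum(nunits * power for nunits, power in jobinfo)


def energy_kwh(jobinfo, duration_minutes, pue=1.0):
    """Energy in kWh for a job running for duration_minutes (assumed unit)."""
    return total_power(jobinfo, pue) * (duration_minutes / 60) / 1000


# The example from the text: 2 CPUs, 4 GPUs and 8 GB of memory
profile = {
    "cpu": {"power": 9.4, "nunits": 2},
    "gpu": {"power": 300, "nunits": 4},
}
jobinfo = build_jobinfo(profile, memory_gb=8)
print(jobinfo)
print(total_power(jobinfo), "W before PUE")
print(energy_kwh(jobinfo, duration_minutes=180, pue=1.20), "kWh")
```

For the example list this comes to roughly 1222 W before PUE. Multiplying the energy by a carbon intensity in gCO2/kWh would then give the footprint, which is the part left to carbonFootprint.py.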
Currently, the jobinfo list returned by configure.get_runtime_config is not used; args.jobinfo still is. Switching over is left to a subsequent PR (see #80) to limit the number of changes in this one.