In response to #12, this PR uses `nvmlDeviceGetTotalEnergyConsumption` for GPU architectures Volta and later.

Main Changes and Benefits

- We created a new class `ZeusProfilingWindow`. It profiles energy and time over arbitrary windows inside training scripts. Multiple profiling windows are kept in a stack, which allows simultaneous profiling at different levels during training.
- We integrated `ZeusProfilingService` into `ZeusDataLoader`. This abstracts away the logic of computing time and energy consumption.
- `ZeusProfilingService` decides the energy query method on a per-GPU basis: `nvmlDeviceGetTotalEnergyConsumption` is used for GPUs with the Volta architecture or later, and power polling with the Zeus Monitor is used for GPUs with older architectures.
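
To make the two ideas above concrete, here is a minimal sketch of per-GPU method selection and of nested profiling windows kept on a stack. It is an illustration under stated assumptions, not the code in this PR: `choose_energy_method`, `WindowProfiler`, `FakeCounter`, and the returned method names are hypothetical, and the architecture constants are assumed to mirror NVML's `NVML_DEVICE_ARCH_*` enum, where larger values denote newer architectures. The energy source is injected as a callable so the sketch runs without a GPU.

```python
import time
from typing import Callable, Dict, List, Tuple

# Assumption: these values mirror NVML's architecture enum
# (NVML_DEVICE_ARCH_PASCAL, NVML_DEVICE_ARCH_VOLTA).
ARCH_PASCAL = 4
ARCH_VOLTA = 5


def choose_energy_method(arch: int) -> str:
    """Pick the energy query method for one GPU from its architecture."""
    # Volta and later expose a cumulative energy counter; older GPUs
    # fall back to polling power and integrating it over time.
    return "total_energy_counter" if arch >= ARCH_VOLTA else "power_polling"


class WindowProfiler:
    """Hypothetical profiler that measures nested windows with a stack."""

    def __init__(self, read_energy_joules: Callable[[], float]) -> None:
        # read_energy_joules must return a monotonically increasing counter.
        self._read_energy = read_energy_joules
        # Each entry is (name, start time, start energy) of an open window.
        self._open: List[Tuple[str, float, float]] = []

    def begin(self, name: str) -> None:
        """Open a new window on top of any windows that are already open."""
        self._open.append((name, time.monotonic(), self._read_energy()))

    def end(self) -> Dict[str, object]:
        """Close the most recently opened window and return its measurement."""
        name, start_time, start_energy = self._open.pop()
        return {
            "name": name,
            "time_s": time.monotonic() - start_time,
            "energy_j": self._read_energy() - start_energy,
        }


class FakeCounter:
    """Stand-in energy counter so the example needs no GPU."""

    def __init__(self) -> None:
        self.joules = 0.0

    def __call__(self) -> float:
        return self.joules


counter = FakeCounter()
profiler = WindowProfiler(counter)

profiler.begin("epoch")   # outer window
counter.joules += 5.0
profiler.begin("step")    # inner window, nested inside "epoch"
counter.joules += 2.0
step = profiler.end()     # closes "step"
counter.joules += 1.0
epoch = profiler.end()    # closes "epoch"
```

In this sketch the inner window reports only the energy consumed while it was open, while the outer window also covers everything measured before and after the inner one. On real hardware, the injected callable could wrap the NVML counter on newer GPUs or a power-polling integrator on older ones, chosen per GPU with `choose_energy_method`.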