Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from system components such as the Linux kernel, CPU, disks, Intel PT, and GPUs. Dynolog also integrates with PyTorch and can trigger traces for distributed training applications.
Summary:
Added an API to read the PIDs of GPU processes, because DCGM cannot read PIDs properly. The PIDs will be used to identify the running process and then look up that process's workflow environment metadata.
The API returns a list of PIDs indexed by GPU id. An entry of -1 means no process is running on that GPU.
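A minimal sketch of the return format described above, not the code added in this diff. The names `kNoProcess` and `pidsByGpu` are hypothetical and exist only for illustration; the sketch assumes the caller already has a mapping from GPU id to PID.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical sentinel: no process is running on that GPU.
constexpr int32_t kNoProcess = -1;

// Hypothetical helper: builds a list of PIDs indexed by GPU id.
// GPUs that do not appear in `running` keep the -1 sentinel.
std::vector<int32_t> pidsByGpu(
    const std::map<int, int32_t>& running,
    int gpuCount) {
  std::vector<int32_t> pids(gpuCount > 0 ? gpuCount : 0, kNoProcess);
  for (const auto& entry : running) {
    const int gpu = entry.first;
    // Ignore GPU ids outside the known device range.
    if (gpu >= 0 && gpu < gpuCount) {
      pids[gpu] = entry.second;
    }
  }
  return pids;
}
```

A caller would skip entries equal to -1 and use the remaining PIDs to look up workflow environment metadata for each process.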
Reviewed By: jj10306
Differential Revision: D41561765
LaMa Project: L1137347