Open luccabb opened 11 months ago
@luccabb what is the output of nvidia-smi? What GPU generation are you using?
@dbeer
> What GPU generation are you using?
NVIDIA A100-SXM4-40GB
> what is the output of nvidia-smi?
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.60.13 Driver Version: 525.60.13 CUDA Version: 12.0
...
Per https://github.com/NVIDIA/DCGM/issues/149#issuecomment-1922398817, it's only available on Hopper+ GPUs.
surfacing this on the dcgm docs would be helpful
cc: @dbeer @nikkon-dev
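
For anyone hitting the same error, here is a rough sketch of checking whether a GPU is Hopper or newer before querying these fields. It is not part of DCGM. It assumes that "Hopper+" corresponds to CUDA compute capability 9.0 or higher (an A100 reports 8.0), and that the installed `nvidia-smi` supports the `compute_cap` query field. The helper names `is_hopper_or_newer` and `query_gpus` are made up for illustration.

```python
import subprocess

# Assumption: Hopper GPUs report compute capability 9.x, so "Hopper+"
# is treated here as major version >= 9. An A100 reports 8.0.
HOPPER_MAJOR = 9


def is_hopper_or_newer(compute_cap: str) -> bool:
    """Return True if a compute capability string such as '9.0' is >= 9.x."""
    major = int(compute_cap.strip().split(".")[0])
    return major >= HOPPER_MAJOR


def query_gpus():
    """Return a list of (name, compute_cap) tuples reported by nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    gpus = []
    for line in out.splitlines():
        if not line.strip():
            continue
        # Split on the last comma so a comma in the GPU name cannot break parsing.
        name, cap = (part.strip() for part in line.rsplit(",", 1))
        gpus.append((name, cap))
    return gpus


if __name__ == "__main__":
    for name, cap in query_gpus():
        verdict = "Hopper or newer" if is_hopper_or_newer(cap) else "pre-Hopper"
        print(f"{name}: compute capability {cap} ({verdict})")
```

If every GPU comes back as pre-Hopper, the error on the NVLink profiling fields would be consistent with the linked comment rather than a missing setup step.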
the nvidia-dcgm doc says that metrics like
DCGM_FI_PROF_NVLINK_L{id}_TX_BYTES
should be available on DCGM 3.1.
I'm getting the following error when trying to query them (from dcgmi 3.1.3):
is it expected? am I missing intermediate steps to enable the metrics?