NVIDIA / DCGM

NVIDIA Data Center GPU Manager (DCGM) is a project for gathering telemetry and measuring the health of NVIDIA GPUs.
Apache License 2.0

Support for A30/A40 and L30 GPUs? #97

Open SamKG opened 1 year ago

SamKG commented 1 year ago

Hello,

We are planning to order A30/A40/L30 GPUs for a research cluster. However, we are unsure whether DCGM fully supports these GPUs, in particular for profiling tensor core utilization and SM occupancy. Is this functionality supported?

Thanks!

bstollenvidia commented 1 year ago

DCGM supports profiling metrics on all Volta and newer compute GPUs, including Ampere and Ada ones.
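
For anyone who wants to check this on their own hardware, here is a minimal sketch (not part of the original reply) that samples the two metrics asked about through the `dcgmi dmon` command. It assumes DCGM is installed, the host engine is running, and that the field IDs 1003 (`DCGM_FI_PROF_SM_OCCUPANCY`) and 1004 (`DCGM_FI_PROF_PIPE_TENSOR_ACTIVE`) match the `dcgm_fields.h` shipped with your DCGM version. The helper names `build_dmon_command` and `sample_profiling_metrics` are hypothetical.

```python
import shutil
import subprocess

# Profiling field IDs as listed in DCGM's dcgm_fields.h.
# Assumption: these IDs are the same in the DCGM version you have installed.
PROF_FIELDS = {
    "DCGM_FI_PROF_SM_OCCUPANCY": 1003,
    "DCGM_FI_PROF_PIPE_TENSOR_ACTIVE": 1004,
}


def build_dmon_command(field_ids, samples=5, delay_ms=1000):
    """Build the argv for `dcgmi dmon` (hypothetical helper).

    -e: comma-separated field IDs, -c: number of samples, -d: delay in ms.
    """
    ids = ",".join(str(f) for f in field_ids)
    return ["dcgmi", "dmon", "-e", ids, "-c", str(samples), "-d", str(delay_ms)]


def sample_profiling_metrics(samples=5, delay_ms=1000):
    """Run dcgmi and return its raw text output, or None if dcgmi is missing."""
    if shutil.which("dcgmi") is None:
        return None
    cmd = build_dmon_command(PROF_FIELDS.values(), samples, delay_ms)
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    # A non-zero exit code or an error message here usually means the GPU or
    # driver does not expose these profiling fields.
    return result.stdout if result.returncode == 0 else result.stderr


if __name__ == "__main__":
    output = sample_profiling_metrics()
    print(output if output is not None else "dcgmi not found on PATH")
```

Running `dcgmi profile --list` first should show which profiling metrics a given GPU exposes, which is a quick way to confirm support before wiring the fields into a monitoring stack.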