Syllo / nvtop

GPU & Accelerator process monitoring for AMD, Apple, Huawei, Intel, NVIDIA and Qualcomm

[Enhancement] Monitoring % utilization of tensor cores #163

Open johnnynunez opened 2 years ago

johnnynunez commented 2 years ago

NVIDIA told me to use their profilers (Nsight Compute or nvprof) to monitor the tensor cores. Could you add this to nvtop, so I can see whether my RTX 3090 is really using its tensor cores?

https://developer.nvidia.com/blog/using-nsight-compute-nvprof-mixed-precision-deep-learning-models/

Syllo commented 2 years ago

Hello @johnnynunez,

I looked at the documentation and it seems to me that the tensor cores are treated as part of the compute resources. I cannot find any function to retrieve their utilization specifically. From the looks of it, the only way to see if these resources are utilized is to look for some function/instruction inside a trace. However, nvtop is limited by the information exposed by the GPU driver.

Maybe you have more information than me on this subject?
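
The trace-based check described above can be sketched roughly as follows. This is a minimal sketch, not something nvtop or the driver provides: it assumes you already have kernel names and durations from a profiler trace, and it assumes that tensor core kernels can be recognized by name fragments such as `h884`, `s1688`, `s16816`, `i8816`, `tensorop`, or `hmma`/`imma`. The function names and the sample kernel names are illustrative only. The result is the share of kernel time spent in kernels that look like tensor core kernels, which is not the same as hardware utilization of the tensor cores.

```python
import re

# Assumption: name fragments that commonly show up in NVIDIA library kernels
# (cuBLAS, cuDNN, CUTLASS) that use tensor cores. This is a naming heuristic;
# custom kernels can use tensor cores without matching it.
_TENSOR_CORE_PATTERN = re.compile(
    r"(?:^|_)[hsi](?:884|1688|8816|16816)|tensorop|[hi]mma",
    re.IGNORECASE,
)


def looks_like_tensor_core_kernel(kernel_name: str) -> bool:
    """Return True if the kernel name matches the tensor core naming heuristic."""
    return _TENSOR_CORE_PATTERN.search(kernel_name) is not None


def tensor_core_time_share(kernels) -> float:
    """Fraction of total kernel time spent in kernels flagged by the heuristic.

    `kernels` is an iterable of (kernel_name, duration) pairs taken from a
    profiler trace; the duration unit does not matter as long as it is uniform.
    """
    total = 0.0
    flagged = 0.0
    for name, duration in kernels:
        total += duration
        if looks_like_tensor_core_kernel(name):
            flagged += duration
    return flagged / total if total else 0.0


# Illustrative trace: one FP16 GEMM kernel and one plain FP32 GEMM kernel.
example_trace = [
    ("volta_h884gemm_128x128_ldg8_nn", 300.0),
    ("volta_sgemm_128x64_nn", 100.0),
]
print(tensor_core_time_share(example_trace))
```

A per-process figure like this still needs a profiler attached to the process, which is why it does not map directly onto what nvtop can read from the driver.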

johnnynunez commented 1 year ago

PyTorch can report tensor core utilization as a percentage. Is it possible to use that here?
