Closed by heshaaam 1 year ago
We can look into it.
For now, they can use nvidia-smi in the terminal or ! nvidia-smi from a Jupyter notebook cell.
It shows the total GPU RAM utilisation and the running processes, as in the following example:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.60.13    Driver Version: 525.60.13    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:D8:00.0 Off |                    0 |
| N/A   31C    P8    10W /  70W |      2MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
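If only the per-process part of that output is needed, nvidia-smi can also print it as CSV through --query-compute-apps. Below is a minimal Python sketch of reading it, assuming the fields pid, process_name and used_memory with --format=csv,noheader,nounits. The helper names parse_compute_apps and query_compute_apps are made up for illustration and are not part of any library.

```python
import csv
import io
import subprocess

# Assumed query: one CSV row per compute process, memory in MiB without units.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,process_name,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(text):
    """Turn 'pid, process_name, used_memory' CSV rows into a list of dicts."""
    rows = []
    for fields in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        pid, name, mem = (f.strip() for f in fields)
        rows.append({
            "pid": int(pid),
            "name": name,
            # Memory may be reported as "[N/A]" on some setups; keep None then.
            "used_mib": int(mem) if mem.isdigit() else None,
        })
    return rows


def query_compute_apps():
    """Run nvidia-smi on the current node and parse its per-process rows."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_compute_apps(out)
```

Calling query_compute_apps() on the compute node returns one entry per process using the GPU, and an empty list when nvidia-smi reports no running processes. Matching the PIDs to a particular user or job is left out here.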
For shorter output, you can also use nvidia-smi --query-gpu=utilization.gpu,utilization.memory --format=csv
Or use nvidia-smi dmon, which monitors default metrics for up to 4 supported devices under natural enumeration (starting with GPU index 0) at a frequency of 1 sec. It runs until terminated with ^C.
Read more about it in the "Device Monitoring" section: https://www.systutorials.com/docs/linux/man/1-nvidia-smi/
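For anyone who wants to log these numbers from a script rather than read them in the terminal, here is a rough Python sketch that runs the --query-gpu command and parses the result. It assumes the noheader,nounits format options so that each GPU gives one line such as "0, 0". The function names parse_utilization and query_utilization are hypothetical.

```python
import subprocess

# Same fields as the query-gpu command, without the header row and "%" units.
CMD = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,utilization.memory",
    "--format=csv,noheader,nounits",
]


def parse_utilization(text):
    """Parse lines like '37, 12' into (gpu_percent, memory_percent) tuples."""
    result = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        try:
            result.append((int(parts[0]), int(parts[1])))
        except ValueError:
            continue  # non-numeric values such as "[N/A]" are skipped
    return result


def query_utilization():
    """Run nvidia-smi once and return one tuple per GPU, in index order."""
    out = subprocess.run(CMD, capture_output=True, text=True, check=True).stdout
    return parse_utilization(out)
```

Note that utilization.memory reports the share of time the GPU memory was being read or written, not how much memory is allocated. If the amount in use is what matters, the memory.used and memory.total fields can be queried the same way, with the parser adjusted to match.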
We also created a dashboard that can be used alongside the above to view utilisation of the cluster resources: https://hayrat.uob.edu.bh/stats/general
Salam.
Is it possible to install gpustat https://pypi.org/project/gpustat/ so that users can see the GPU and GPU memory utilization of their running jobs? The command would be run on the compute node itself, after they ssh into it.
Or maybe there's an easier way to do this?