guillaumeramey opened this issue 3 years ago
Could you let us know what output you get if you run this from the command line on the machine you're using? This will help narrow down the source of the error.
$ nvidia-smi pmon -c 10
I am using Google Colab, so it's not always the same GPU.
I ran subprocess.getoutput('nvidia-smi pmon -c 10')
but it reported no processes at all:
# gpu        pid  type    sm   mem   enc   dec   command
# Idx          #   C/G     %     %     %     %   name
    0          -     -     -     -     -     -   -
    0          -     -     -     -     -     -   -
    0          -     -     -     -     -     -   -
    0          -     -     -     -     -     -   -
    0          -     -     -     -     -     -   -
    0          -     -     -     -     -     -   -
    0          -     -     -     -     -     -   -
    0          -     -     -     -     -     -   -
    0          -     -     -     -     -     -   -
    0          -     -     -     -     -     -   -
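For reference, a minimal sketch of how that check can be done programmatically, using only subprocess from the standard library (this is just an illustration, not the tracker's own code):

import subprocess

# Run the same sampling the maintainer suggested: 10 one-second samples.
output = subprocess.getoutput("nvidia-smi pmon -c 10")

# Data rows follow the header lines that start with '#'.
rows = [line.split() for line in output.splitlines() if line and not line.startswith("#")]

# pmon prints '-' in every column when no process is attributed to the GPU,
# which is what the output above shows.
has_processes = any(row[1] != "-" for row in rows if len(row) > 1)
print("per-process GPU stats available:", has_processes)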
With subprocess.getoutput('nvidia-smi')
I obtained this:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   45C    P8    29W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                   |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Hi, unfortunately Colab isn't fully supported right now because it doesn't always expose the hardware endpoints required to calculate energy use. We are working on solutions and will follow up if we have something that works.
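In case it helps with debugging, here is a rough sketch of the kind of check involved, querying NVML directly through the pynvml bindings (assuming pynvml is installed; this illustrates the idea rather than reproducing the library's actual code):

import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

try:
    # Power draw in milliwatts; some virtualized/hosted GPUs do not expose this counter.
    power_mw = pynvml.nvmlDeviceGetPowerUsage(handle)
    print(f"GPU power draw: {power_mw / 1000:.1f} W")
except pynvml.NVMLError as err:
    print(f"Power reading not available on this GPU: {err}")

try:
    # Per-process accounting, roughly what `nvidia-smi pmon` reports.
    procs = pynvml.nvmlDeviceGetComputeRunningProcesses(handle)
    print(f"{len(procs)} compute process(es) visible to NVML")
except pynvml.NVMLError as err:
    print(f"Process accounting not available: {err}")

pynvml.nvmlShutdown()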
Hi, we're getting this error in the log file:
Is it an issue with our NVIDIA GPU? We are using a Tesla T4.