CFSworks / nvml_fix

A workaround for an annoying bug in nVidia's NVML library. Allows nvidia-smi to work once more!

Doesn't work with GT710 + 440.36 + CUDA 10.2 #29

Closed gonzalezcalleja closed 4 years ago

gonzalezcalleja commented 4 years ago

Hi,

I am having trouble enabling process usage information in nvidia-smi with a GT710, driver 440.36 and CUDA 10.2.

Here is the output:

[image: screenshot of the nvidia-smi output]

And here are the attached debug logs:

nvidia-smi.txt nvidia-smi-strace.txt

I have also set the PowerMizer setting in nvidia-settings to "Prefer Maximum Performance".

Thanks for your support!

tofurky commented 4 years ago

hi, thanks for providing the detailed information right off the bat :) it appears that everything is working as it should be from nvml_fix's end. you can confirm by reverting to stock configuration, and re-running nvidia-smi -a. notice the following from the output you provided:

Attached GPUs                       : 1
GPU 00000000:07:00.0
    Product Name                    : GeForce GT 710
    Product Brand                   : Quadro
    Display Mode                    : Enabled
    Display Active                  : Enabled
    Persistence Mode                : Disabled

it is being detected as Quadro, which means that nvml_fix is tricking the libnvidia-ml library as it is supposed to. unfortunately, i think that in these cases the chipset itself, or the video bios, is for some reason not capable of reporting the power utilization/processes :( if you revert to the original config and run nvidia-smi -a again, it will likely show:

    Product Brand                   : GeForce
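
The check described above (compare "Product Name" with "Product Brand" in the nvidia-smi -a output) can be sketched in a few lines of Python. This is only an illustration, not part of nvml_fix: the helper names parse_fields, brand_looks_spoofed and read_smi are made up for this example, and the parser assumes the "key : value" layout seen in the output quoted in this thread.

```python
import subprocess

# Excerpt of the `nvidia-smi -a` output quoted earlier in this thread.
SAMPLE = """\
Attached GPUs                       : 1
GPU 00000000:07:00.0
    Product Name                    : GeForce GT 710
    Product Brand                   : Quadro
    Display Mode                    : Enabled
"""


def parse_fields(text):
    """Collect 'key : value' lines into a dict (hypothetical helper).

    Only the first occurrence of each key is kept, so on a multi-GPU
    machine this reports the first GPU listed.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:  # skip headings such as 'GPU 00000000:07:00.0'
            fields.setdefault(key.strip(), value.strip())
    return fields


def brand_looks_spoofed(text):
    """True if a GeForce-named card is reported with the Quadro brand."""
    fields = parse_fields(text)
    name = fields.get("Product Name", "")
    brand = fields.get("Product Brand", "")
    return "GeForce" in name and brand == "Quadro"


def read_smi():
    """Run `nvidia-smi -a` and return its text output (needs the driver)."""
    result = subprocess.run(
        ["nvidia-smi", "-a"], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Uses the quoted sample; swap in read_smi() on a machine with the driver.
    print("brand override active:", brand_looks_spoofed(SAMPLE))
```

With the stock library the same card would be expected to report "GeForce" as its brand, in which case brand_looks_spoofed returns False.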

gonzalezcalleja commented 4 years ago

Hi,

Thanks for your quick answer!

With the stock NVIDIA library I have noticed that I can't see the CPU usage, but with your patch I can see it, so this should be enough.

[image: Screenshot 2019-12-27 at 15 26 32]

I think I will buy a bigger GPU because I'm starting to play with TensorFlow... Is there any list of GPUs compatible with nvml_fix? Thanks!

tofurky commented 4 years ago

sorry, there isn't a list of compatible GPUs. i only have a single nvidia card (650 ti boost) which is of course working :) i suspect that cards that use the same underlying gpu chips that the quadro line uses would probably work. if you go to https://en.wikipedia.org/wiki/Nvidia_Quadro#Quadro, there is a column named "Near GeForce Model". no guarantees of course but i would say that is an educated guess.

tofurky commented 4 years ago

closing as this appears to be a hardware/firmware limitation rather than something within the scope of things nvml_fix could address.