paradoxxxzero / gnome-shell-system-monitor-applet

Display system information in the GNOME Shell status bar, such as memory usage, CPU usage, network rates…
GNU General Public License v3.0

Add Nvidia GPU monitor display #356

Open indigohedgehog opened 7 years ago

indigohedgehog commented 7 years ago

Could you add an Nvidia GPU monitor using nvidia-smi? nvidia-smi provides GPU utilization, power, and memory via queries for each GPU.

Like this: nvidia-smi -i 0 -q -d MEMORY,UTILIZATION,POWER,CLOCK,COMPUTE

DavidLKing commented 7 years ago

I second this.

franglais125 commented 7 years ago

I don't own an Nvidia card, so I can't test this. What is the output of that command? Knowing that, I can maybe devise a way of adding this.

DavidLKing commented 7 years ago

So, for nvidia-smi in general, my output looks like this:

Tue Aug 15 16:05:42 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 960     Off  | 0000:01:00.0      On |                  N/A |
| 39%   51C    P2    36W / 120W |   1891MiB /  1995MiB |     43%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1254    G   /usr/lib/xorg/Xorg                             167MiB |
|    0      2160    G   /usr/bin/gnome-shell                           115MiB |
|    0      3000    G   ...el-token=A6C2B75F4CE1D547DA671EFF268D0610    88MiB |
|    0      9425    G   ...avid/.local/share/Steam/ubuntu12_32/steam    34MiB |
|    0      9482    G   ...el-token=74F5F5AF9CB6C99D2C2D33A0F932ACD0    33MiB |
|    0     20398    C   python2                                       1448MiB |
+-----------------------------------------------------------------------------+

@indigohedgehog's command has the following output:

==============NVSMI LOG==============

Timestamp                           : Tue Aug 15 16:00:35 2017
Driver Version                      : 375.66

Attached GPUs                       : 1
GPU 0000:01:00.0
    FB Memory Usage
        Total                       : 1995 MiB
        Used                        : 1934 MiB
        Free                        : 61 MiB
    BAR1 Memory Usage
        Total                       : 256 MiB
        Used                        : 5 MiB
        Free                        : 251 MiB
    Compute Mode                    : Default
    Utilization
        Gpu                         : 47 %
        Memory                      : 6 %
        Encoder                     : 0 %
        Decoder                     : 0 %
    GPU Utilization Samples
        Duration                    : 18446744073709.22 sec
        Number of Samples           : 99
        Max                         : 48 %
        Min                         : 33 %
        Avg                         : 0 %
    Memory Utilization Samples
        Duration                    : 18446744073709.22 sec
        Number of Samples           : 99
        Max                         : 7 %
        Min                         : 1 %
        Avg                         : 0 %
    ENC Utilization Samples
        Duration                    : 18446744073709.22 sec
        Number of Samples           : 99
        Max                         : 0 %
        Min                         : 0 %
        Avg                         : 0 %
    DEC Utilization Samples
        Duration                    : 18446744073709.22 sec
        Number of Samples           : 99
        Max                         : 0 %
        Min                         : 0 %
        Avg                         : 0 %
    Power Readings
        Power Management            : Supported
        Power Draw                  : 37.73 W
        Power Limit                 : 120.00 W
        Default Power Limit         : 120.00 W
        Enforced Power Limit        : 120.00 W
        Min Power Limit             : 60.00 W
        Max Power Limit             : 130.00 W
    Power Samples
        Duration                    : 11.80 sec
        Number of Samples           : 119
        Max                         : 38.83 W
        Min                         : 36.16 W
        Avg                         : 37.01 W
    Clocks
        Graphics                    : 1404 MHz
        SM                          : 1404 MHz
        Memory                      : 3004 MHz
        Video                       : 1151 MHz
    Applications Clocks
        Graphics                    : 1266 MHz
        Memory                      : 3505 MHz
    Default Applications Clocks
        Graphics                    : 1266 MHz
        Memory                      : 3505 MHz
    Max Clocks
        Graphics                    : 1506 MHz
        SM                          : 1506 MHz
        Memory                      : 3505 MHz
        Video                       : 1235 MHz
    SM Clock Samples
        Duration                    : 61691.60 sec
        Number of Samples           : 100
        Max                         : 1405 MHz
        Min                         : 135 MHz
        Avg                         : 1266 MHz
    Memory Clock Samples
        Duration                    : 61691.60 sec
        Number of Samples           : 100
        Max                         : 3505 MHz
        Min                         : 405 MHz
        Avg                         : 2978 MHz
    Clock Policy
        Auto Boost                  : N/A
        Auto Boost Default          : N/A

1000 thanks for looking into this!

franglais125 commented 7 years ago

Ohh, I see. Without seeing the output, I had no idea what the request really was. I thought this was merely about the temperature.

So the idea would be to, for example, track the % GPU utilization with a graph, just like we do with CPU?

DavidLKing commented 7 years ago

Exactly, and/or VRAM usage

franglais125 commented 7 years ago

So, taking @indigohedgehog's idea: can we run the following command to get the GPU utilization alone, without all the other output, so we can use it as a raw number?

nvidia-smi -i 0 -q -d MEMORY | grep Gpu | awk '{print $3}'

franglais125 commented 7 years ago

Assuming that works, I'd then be worried about the CPU cost of getting this information. Can you try the following and see what CPU usage you get? (With other tasks suspended, obviously.)

for i in {1..120}; do nvidia-smi -i 0 -q -d MEMORY | grep Gpu | awk '{print $3}'; sleep 0.5; done

This will print the information every 500ms, 120 times (so a full minute).
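
For reference, one rough way to quantify that cost, assuming bash, is to wrap the loop in the time builtin and compare the user+sys time against the real time (redirecting to /dev/null just keeps the terminal quiet):

# same pipeline as above, run 120 times and timed as a whole
time (for i in {1..120}; do nvidia-smi -i 0 -q -d MEMORY | grep Gpu | awk '{print $3}' > /dev/null; sleep 0.5; done)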

DavidLKing commented 7 years ago

Hm... I'm not getting a third argument from nvidia-smi -i 0 -q -d MEMORY | grep Gpu | awk '{print $3}'; that is, I don't get any output from running that command. If you're looking for the used GPU memory, I can get it with nvidia-smi -i 0 -q -d MEMORY | grep -A3 -i gpu | grep -i used | awk '{print $3}'. That returns 1867 from the line Used : 1867 MiB.

I'm currently running an experiment, but as soon as it's finished, I'll get the cpu usage for you.

franglais125 commented 7 years ago

Ah, I see; perhaps we need the argument UTILIZATION instead of MEMORY? So far I was only looking at getting the utilization.

For memory, we would need the used memory and the total value as well.

We actually only need the total value once, as it is not supposed to change. The reason I try to keep the commands' output as small as possible is to keep the CPU usage low. But since you have the tools, you might be able to come up with a better example.
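
As an aside: assuming a driver recent enough to support it, nvidia-smi's CSV query mode can return just the raw numbers in a single call, with no grep/awk at all. The exact field names are worth double-checking against nvidia-smi --help-query-gpu:

nvidia-smi -i 0 --query-gpu=utilization.gpu,memory.used,memory.total --format=csv,noheader,nounits
# would print something like: 43, 1891, 1995   (values taken from the output posted above)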

DavidLKing commented 7 years ago

For MEMORY, you can get the total memory on the first line and the used memory on the second with nvidia-smi -i 0 -q -d MEMORY | grep -A4 -i gpu | egrep -i "used|total" | awk '{print $3}'

UTILIZATION's output looks as follows:

david@Arjuna:~/Desktop$ nvidia-smi -i 0 -q -d UTILIZATION

==============NVSMI LOG==============

Timestamp                           : Tue Aug 15 16:56:23 2017
Driver Version                      : 375.66

Attached GPUs                       : 1
GPU 0000:01:00.0
    Utilization
        Gpu                         : 38 %
        Memory                      : 3 %
        Encoder                     : 0 %
        Decoder                     : 0 %
    GPU Utilization Samples
        Duration                    : 18446744073709.22 sec
        Number of Samples           : 99
        Max                         : 45 %
        Min                         : 31 %
        Avg                         : 0 %
    Memory Utilization Samples
        Duration                    : 18446744073709.22 sec
        Number of Samples           : 99
        Max                         : 5 %
        Min                         : 1 %
        Avg                         : 0 %
    ENC Utilization Samples
        Duration                    : 18446744073709.22 sec
        Number of Samples           : 99
        Max                         : 0 %
        Min                         : 0 %
        Avg                         : 0 %
    DEC Utilization Samples
        Duration                    : 18446744073709.22 sec
        Number of Samples           : 99
        Max                         : 0 %
        Min                         : 0 %
        Avg                         : 0 %

From there, nvidia-smi -i 0 -q -d UTILIZATION | grep Gpu | awk '{print $3}' indeed returns the percent utilized.

franglais125 commented 7 years ago

So for the memory part, using your command, would this be a correct example of input/output?

$ nvidia-smi -i 0 -q -d MEMORY | grep -A4 -i gpu | egrep -i "used|total" | awk '{print $3}'
1934
1995
DavidLKing commented 7 years ago

Close, just invert your lines:

$ nvidia-smi -i 0 -q -d MEMORY | grep -A4 -i gpu | egrep -i "used|total" | awk '{print $3}'
1995
1877
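
A small bash sketch, assuming the order shown above holds (total first, used second), for turning those two lines into a single used/total reading:

# read the two lines into variables; process substitution requires bash
{ read -r TOTAL; read -r USED; } < <(nvidia-smi -i 0 -q -d MEMORY | grep -A4 -i gpu | egrep -i "used|total" | awk '{print $3}')
echo "VRAM: ${USED} / ${TOTAL} MiB"
# prints e.g.: VRAM: 1877 / 1995 MiB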
franglais125 commented 7 years ago

Ah, perfect! By the way, I already have a sort-of-working prototype; I've been using echo 1000 as dummy input. I'll push a branch once it more or less works, and then we can iron out the bugs.

DavidLKing commented 7 years ago

Sounds great. Let me know if/how I can help. 1000 thanks, again. Might I request fuchsia for the color ;)

franglais125 commented 7 years ago

Ok, first part: https://github.com/franglais125/gnome-shell-system-monitor-applet/tree/gpu_usage

Do you mind trying it? So far it should only show the correct amount of VRAM used when you open the menu.

To test it:

git clone https://github.com/franglais125/gnome-shell-system-monitor-applet/
cd gnome-shell-system-monitor-applet
git checkout gpu_usage
make install

and restart the Shell. Let me know how it goes! Remember, this is WIP :)
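
A note for anyone following along: how to restart the Shell depends on the session type.

# On an X11 session, GNOME Shell can be restarted in place:
#   press Alt+F2, type "r", then press Enter
# On a Wayland session that restart is not available; log out and back in instead.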

franglais125 commented 7 years ago

I think the branch is ready for testing. @DavidLKing, how is it looking for you?

[screenshot from 2017-08-15 19-30-55]

[screenshot from 2017-08-15 19-31-11]

DavidLKing commented 7 years ago

Alas: [screenshot from 2017-08-15 19-31-23]

So, I can select it fine, but there's a gap where the GPU info ought to be.

franglais125 commented 7 years ago

It happened to me too when activating the GPU from the settings. Can you leave it activated, and then restart the Shell?

DavidLKing commented 7 years ago

Perfection! [screenshot from 2017-08-15 19-35-22]

You'll also want to note somewhere that this only works with Nvidia cards. AMD/OpenCL folks might get confused.

franglais125 commented 7 years ago

Wow, this is great!

And yes, indeed... there are still some details to work out. I don't know if there is anything equivalent for AMD and Intel GPUs. It would be great to have that too.

Also, I want to fix the bug where the information is empty when the GPU is activated from the settings.

In any case, thanks a lot for your help and patience!


@chrisspen: would you mind taking a look at this if I open a PR? (And at the other pending PRs/fixes.)

At this point the inaction is actually hindering maintenance and development. I have a few PRs waiting, and I will end up having to rebase a bunch of stuff that I haven't touched in months.

Cheers

DavidLKing commented 7 years ago

1000 thanks, @franglais125 ! Looking good so far. Let me know if I can do anything to help in the future.

chrisspen commented 7 years ago

@franglais125 Which PRs are holding you up? I just merged a few I saw that looked ok, but most seem to be failing some checks.

franglais125 commented 7 years ago

@chrisspen Thanks for getting back to me, and for the recent merges!

These ones are missing: https://github.com/paradoxxxzero/gnome-shell-system-monitor-applet/pull/375, https://github.com/paradoxxxzero/gnome-shell-system-monitor-applet/pull/338

I'm not saying they should be merged as-is, but we can of course go over some of the changes (I'll have to take another look, as I don't remember everything).

As for the failing checks, they're due to Travis not being able to set up the test system, not to the commits or style. I'd love to look into that too, but Travis doesn't give any useful information at the moment.

[screenshot from 2017-08-15 21-22-32]

chrisspen commented 7 years ago

Ok, it looks like NPM can't be installed, likely because their master branch has some issues. I'll research why, but I can't proceed with merges until I'm able to fix that.

franglais125 commented 7 years ago

Thanks! Unfortunately I don't have enough experience with Travis to be able to help.

indigohedgehog commented 7 years ago

Running neatly. @franglais125, for Intel GPUs you could use intel_gpu_top.

https://stackoverflow.com/questions/28876242/interpretation-of-intel-gpu-top-output

franglais125 commented 7 years ago

@indigohedgehog Thanks a lot for the tip! I'll try to look into it when I get some time.

franglais125 commented 7 years ago

@indigohedgehog The problem with intel_gpu_top is that it requires sudo, so we can't really run it unless we create a pkexec policy.