aristocratos / bpytop

Linux/OSX/FreeBSD resource monitor
Apache License 2.0

[REQUEST] Add GPU monitoring #26

Open tom-doerr opened 4 years ago

tom-doerr commented 4 years ago

Is your feature request related to a problem? Please describe. I'm always frustrated when I have to open another window to monitor the usage of my GPUs.

Describe the solution you'd like I would love to see the GPU utilization and memory usage in bpytop when bpytop detects dedicated GPUs. Users who made the effort to install a dedicated GPU, or bought a computer with one, likely care a lot about GPU usage.

Describe alternatives you've considered An alternative would be to open nvtop or run watch nvidia-smi.

aristocratos commented 4 years ago

Nvidia cards can probably be worked in since https://github.com/wookayin/gpustat could be used to poll data.

AMD is probably possible too with https://github.com/Ricks-Lab/amdgpu-utils but requires more user setup.

I haven't found any good candidate for integrated Intel graphics, which I would like in order to have proper GPU monitoring support.

However, I don't have either an nvidia or an amd card to test with, so it's a bit tricky doing it blind.

protofunctorial commented 4 years ago

I would love to see this feature added! I thought I'd contribute gpu-ls output for my AMD card:

Detected GPUs: AMD: 1
AMD: amdgpu version: 19.1.0-2
AMD: Wattman features enabled: 0xffffffff
Warning: Error reading parameter: mem_loading, disabling for this GPU: 0
1 total GPUs, 1 rw, 0 r-only, 0 w-only

Card Number: 0
   Vendor: AMD
   Readable: True
   Writable: True
   Compute: False
   GPU UID: 02151af2e48a2184
   Device ID: {'device': '0x687f', 'subsystem_device': '0x2388', 'subsystem_vendor': '0x148c', 'vendor': '0x1002'}
   Decoded Device ID: Vega 10 XL/XT [Radeon RX Vega 56/64]
   Card Model: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] (rev c3)
   Display Card Model: Vega 10 XL/XT [Radeon RX Vega 56/64]
   PCIe ID: 0a:00.0
      Link Speed: 8.0 GT/s PCIe
      Link Width: 16
   ##################################################
   Driver: amdgpu
   vBIOS Version: 111
   Compute Platform: None
   GPU Type: PStates
   HWmon: /sys/class/drm/card0/device/hwmon/hwmon3
   Card Path: /sys/class/drm/card0/device
   System Card Path: /sys/devices/pci0000:00/0000:00:03.1/0000:08:00.0/0000:09:00.0/0000:0a:00.0
   ##################################################
   Current Power (W): 38.000
   Power Cap (W): 185.000
      Power Cap Range (W): [0, 277]
   Fan Enable: 0
   Fan PWM Mode: [2, 'Dynamic']
   Fan Target Speed (rpm): 630
   Current Fan Speed (rpm): 630
   Current Fan PWM (%): 15
      Fan Speed Range (rpm): [900, 3800]
      Fan PWM Range (%): [0, 100]
   ##################################################
   Current GPU Loading (%): 87
   Current Memory Loading (%): None
   Current GTT Memory Usage (%): 8.045
      Current GTT Memory Used (GB): 0.642
      Total GTT Memory (GB): 7.984
   Current VRAM Usage (%): 20.886
      Current VRAM Used (GB): 1.668
      Total VRAM (GB): 7.984
   Current  Temps (C): {'edge': 52.0, 'junction': 57.0, 'mem': 52.0}
   Critical Temps (C): {'edge': 85.0, 'junction': 105.0, 'mem': 95.0}
   Current Voltages (V): {'vddgfx': 856}
      Vddc Range: ['800mV', '1200mV']
   Current Clk Frequencies (MHz): {'mclk': 167.0, 'sclk': 1359.0}
   Current SCLK P-State: [0, '852Mhz']
      SCLK Range: ['852MHz', '2400MHz']
   Current MCLK P-State: [0, '167Mhz']
      MCLK Range: ['167MHz', '1500MHz']
   Power Profile Mode: 0-BOOTUP_DEFAULT
   Power DPM Force Performance Level: auto

If @tom-doerr can contribute outputs for NV cards, I think it is doable. I certainly wouldn't mind testing an experimental branch for rickslab-gpu-utils-based monitoring support, and even contributing a little. With regards to Intel iGPUs, I believe their users have less motivation for active monitoring anyway, so that can be delayed for now.

schaerfo commented 4 years ago

The amdgpu driver provides a hwmon interface (see https://www.kernel.org/doc/html/latest/gpu/amdgpu.html#hwmon-interfaces), which is also available via the sensors command. The GPU utilization percentage can be read from /sys/class/drm/cardX/device/gpu_busy_percent.
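
For reference, reading that file is just a one-line sysfs read; a minimal sketch (the helper name and card index are only illustrative, not bpytop code):

from pathlib import Path
from typing import Optional

def gpu_busy_percent(card: int = 0) -> Optional[int]:
    """Return the amdgpu load percentage for /sys/class/drm/card<N>, or None if unavailable."""
    path = Path(f"/sys/class/drm/card{card}/device/gpu_busy_percent")
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None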

aristocratos commented 4 years ago

That makes it a lot more viable, since https://github.com/Guillermo-C-A/Hwmon-python could be used to extract that information.

I will have an nvidia gpu to test with in about a month, but without an actual amd gpu it's unlikely bpytop will get amd support unless somebody else who can also test it writes that implementation.

drazil100 commented 4 years ago

I have an amd gpu in my machine. If you wanted to play around with GPU monitoring in a feature branch or something I would be happy to try it on my system. Wouldn't be able to write anything for it unfortunately.

My gpu is a little old though (Radeon HD 7730M). Not sure if that would make any kind of difference whatsoever.

My biggest concern is where you would even PUT gpu monitoring. The screen space is already so full of content and adding more would have to push something else into a smaller box. Everything is already so perfectly spaced though xD

aristocratos commented 4 years ago

If you wanted to play around with GPU monitoring in a feature branch or something I would be happy to try it on my system.

I appreciate the offer, but it would be very slow and annoying to not have live data to adapt to while coding it.

My biggest concern is where you would even PUT gpu monitoring

The idea for the design would be to add a gpu box sharing a third of the cpu box space, with a similar design to the cpu box except showing vram usage, clocks and temperatures in the box where the cores are shown for the cpu. Would also add a shortcut to toggle it easily.

jorge-barreto commented 3 years ago

hey @aristocratos, I have a Radeon RX 5xx, would love to see this feature in BpyTOP, and would love to take a hack at implementing the amdgpu side of things.

The idea for the design would be too add a gpu box sharing a third of the cpu box space

Have you been able to get very far with this? I'd like to not have to deal with design if I could help it 😅 I'm going to dive into the code soon, but would appreciate any pointers on where I should look

aristocratos commented 3 years ago

@schaerfo @jorge-barreto That would be great! I still haven't gotten my new gpu, so I haven't started writing it. Will probably be a couple of weeks till then.

But if you don't wanna have to deal with design I would recommend you wait till I have finished the nvidia portion and the draw functions (which should be agnostic to amd/nvidia and possibly intel cards).

If you want I can create a dummy collection function for amd when I create the nvidia function, then you could just fill in the missing pieces.

I'm not sure yet what the best way to collect amd stats is. https://github.com/Guillermo-C-A/Hwmon-python is one way, but maybe it's possible to collect the same info from /sys or /proc if the amd drivers populate those?

That way we could avoid adding more dependencies. I'm gonna investigate what info can be collected through /sys and /proc on the nvidia side when I get the new card.

schaerfo commented 3 years ago

@aristocratos

But if you don't wanna have to deal with design I would recommend you wait till I have finished the nvidia portion and the draw functions (which should be agnostic to amd/nvidia and possibly intel cards).

That would make things a lot simpler :+1:

I'm not sure yet though what the best way to collect amd stats is. https://github.com/Guillermo-C-A/Hwmon-python is one way, but maybe it's possible to collect the same info from /sys or /proc if the amd drivers populates those?

All hwmon information can also simply be read from /sys/class/drm/card*/device/hwmon/hwmon*
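
A rough sketch of what reading those files could look like in Python (which entries actually exist depends on the driver; names like temp1_input and power1_average follow the standard hwmon conventions):

import glob, os

def read_hwmon(card: int = 0) -> dict:
    """Collect raw hwmon values (e.g. temp1_input, power1_average, fan1_input) for one DRM card."""
    values = {}
    for hwmon in glob.glob(f"/sys/class/drm/card{card}/device/hwmon/hwmon*"):
        for entry in os.listdir(hwmon):
            if entry.endswith(("_input", "_average", "_label")):
                try:
                    with open(os.path.join(hwmon, entry)) as f:
                        values[entry] = f.read().strip()
                except OSError:
                    pass  # some nodes are write-only or otherwise unreadable
    return values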

aristocratos commented 3 years ago

All hwmon information can also simply be read from /sys/class/drm/card*/device/hwmon/hwmon*

That's great to hear, I'm hoping this is the same for nvidia.

Anyways, I will update this issue when I start working on the gpu functions.

jorge-barreto commented 3 years ago

@aristocratos I've gone ahead and done a mockup of the GPU box placement. Here's a picture: [screenshot: bpytop with the GPU box mockup]

I'll be working on the GpuCollector class this weekend. You can find my code here.

aristocratos commented 3 years ago

@jorge-barreto This placement would be problematic: the info box that pops up when pressing enter on a process takes up 7 lines in height, and with this placement a smaller terminal height would not leave any room for the process list. I think the only possible placement is in a split (maybe a third) of the cpu box. But it will get crowded at smaller terminal sizes wherever it is placed.

i'll be working on the GpuCollector class this weekend. you can find my code here.

Great! I'm guessing the prioritized stats to gather would be gpu usage, temp, core clock, vram size, vram usage, vram clock and maybe gpu model name; are there any other important ones you can think of?

I'm thinking, design-wise, of a clone of the cpu box, but with the previously mentioned stats in the smaller box instead of cores.

jorge-barreto commented 3 years ago

@aristocratos The picture above (and below) is in an 80x24 terminal, and I see what you mean about the process pop-up. The design is ultimately up to you, but I worry about how much of the CpuBox we will have left in a smaller terminal if we cut a third of its width. In the picture, you can see I lose some of my cores:

[screenshot: 80x24 terminal with the GPU box mockup, some CPU cores cut off]

Perhaps we could consider hiding the GpuBox when the process detail view is in effect?

Either way I'm not super concerned about the design. Though, I may consider keeping a personal build with the GpuBox in my preferred location.

-- As far as stats go, I currently had this list:

We can deprioritize whatever of those as the need for space dictates. I'm not totally sure on how to get the VRAM clock, or exactly what it is.

The GPU model name is an interesting one:

For me, lspci -v | grep VGA reports Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]. I guess we could truncate to Radeon RX? But given the number of VGA compatible controllers out there, I'm not totally sure how much useful parsing we could do. What're your thoughts on this?

aristocratos commented 3 years ago

@jorge-barreto

The design is ultimately up to you, but I worry about how much CpuBox we will have in the smaller terminal if we cut a third of its box's width. In the picture, you can see I lose some of my cores:

You might have a good point there; looking at the mockup, I also realized the alignment of the "Menu", "mode:", clock and battery meter would be a nightmare to get in a good place.

Actually I think I could make it so it automatically moves between the two placements depending on the size of the terminal.

I'm pretty sure MCLK is the VRAM clock but in amd naming.

And I'm guessing SCLK ≈ core clock for amd?

I'm not quite sure what GTT Usage is though?

Any stat could be added and then simply be shown/hidden by sizing constraints.

The GPU model name is an interesting one

You could possibly look through projects like https://pypi.org/project/gpuinfo/ or https://github.com/Ricks-Lab/amdgpu-utils to see what methods they use for the model name. If they aren't using C code for stuff like that, it could be a good reference.

Also, if you're going the route of reading the /sys subsystem manually to collect all the information, it could be good to compare the speed against something like https://github.com/Guillermo-C-A/Hwmon-python (I haven't checked whether they use C functions, though apparently they don't). But if the data collection impacts resource use a lot, it could be a good idea to consider whether there are any good pip-installable libraries to use.

jorge-barreto commented 3 years ago

@aristocratos So, it looks like radeontop reports MCLK as 'Memory Clock', and SCLK as 'Shader Clock'. I believe nvidia-smi reports SCLK as the 'Graphics (shader) Clock', judging by this man page. It also reports the 'Memory Clock' and the 'SM Clock', which might be an NVIDIA specific thing.

I'm not 100% on what GTT Usage is, either. Some searching suggests that GTT is the Graphics Translation Table, which "exposes a linear buffer from multiple RAM pages". I'm guessing that's not AMD specific? But you'll have to let me know what you find, because I can't really find any mentions of it in the context of NVIDIA.

I'll definitely look through those projects to see how it is they're grabbing the model name, and how they implement grabbing the actual info. I'll take a closer look at the TimeIt class to try and do some benchmarks, too. It's not immediately obvious to me how to use it, but I haven't really looked into it.

Update -- All of the linked projects simply use os.open, except for gpuinfo, which seems to directly use nvidia-smi. For what it's worth, I think nvidia-smi does ship with NVIDIA's Linux drivers, but the same is not clear for other platforms.

jorge-barreto commented 3 years ago

@aristocratos This GPUmodule seems to have accomplished most of what we might need, but is also a 2000+ line file. I am guessing we do not want to bring in that much bloat into BpyTOP?

Also, it looks like name parsing might be best accomplished by matching the ID found in /sys/class/drm/card0/subsystem_device to the relevant entry in the pci.ids file.
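
A sketch of how that lookup could work, matching on the vendor/device IDs and guessing at common pci.ids locations (purely illustrative, not bpytop code):

from pathlib import Path

# Common pci.ids locations; the exact path varies per distro (OSX/FreeBSD unverified).
PCI_IDS_PATHS = ("/usr/share/hwdata/pci.ids", "/usr/share/misc/pci.ids")

def gpu_model_name(card: int = 0) -> str:
    """Match a DRM card's PCI vendor/device IDs against pci.ids to get a readable model name."""
    dev = Path(f"/sys/class/drm/card{card}/device")
    vendor = dev.joinpath("vendor").read_text().strip()[2:].lower()   # e.g. "1002"
    device = dev.joinpath("device").read_text().strip()[2:].lower()   # e.g. "687f"
    ids_file = next((p for p in PCI_IDS_PATHS if Path(p).exists()), None)
    if ids_file is None:
        return "Unknown GPU"
    in_vendor = False
    with open(ids_file, encoding="utf-8", errors="ignore") as f:
        for line in f:
            if line.startswith("#") or not line.strip():
                continue
            if not line.startswith("\t"):                     # vendor line, e.g. "1002  Advanced Micro Devices..."
                in_vendor = line.lower().startswith(vendor)
            elif in_vendor and not line.startswith("\t\t"):   # device line under the matched vendor
                dev_id, _, name = line.strip().partition("  ")
                if dev_id.lower() == device:
                    return name.strip()
    return "Unknown GPU"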

aristocratos commented 3 years ago

@jorge-barreto

I'll take a closer look at the TimeIt class to try and do some benchmarks, too. It's not immediately obvious to me how to use it, but I haven't really looked into it.

Usage is just TimeIt.start("testing gpu collect") at the start, and then TimeIt.stop("testing gpu collect") will print 'testing gpu collect completed in X.XXXXX seconds' to error.log; you need to have the loglevel set to debug in the config though.
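
In other words, wrapping a collection pass would look roughly like this (collect_gpu() is just a stand-in for whatever the GpuCollector ends up doing):

# With log_level set to DEBUG in the config:
TimeIt.start("testing gpu collect")   # records the start time under this label
collect_gpu()                         # placeholder for the actual GPU data collection
TimeIt.stop("testing gpu collect")    # logs "testing gpu collect completed in X.XXXXX seconds" to error.log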

I'm not 100% on what GTT Usage is, either. Some searching suggests that GTT is the Graphics Translation Table, which "exposes a linear buffer from multiple RAM pages". I'm guessing that's not AMD specific?

We could probably skip that for now then and focus on stats that would exist for both amd and nvidia, which would also make the draw function a lot more straightforward. I'm guessing the top priority stats for most people would be gpu usage, temperature, fan speed, memory usage and core (shader)/memory clocks (for people overclocking).

Update -- All of the linked projects simply use os.open,

That's pretty telling that the direct approach shouldn't be an issue then.

For what it's worth, I think nvidia-smi does ship with NVIDIA's Linux drivers, but the same is not clear for other platforms.

I haven't really investigated if the nvidia drivers populate /sys or what other methods could be used, but would really like to avoid having to run an external binary at every update.

This GPUmodule seems to have accomplished most of what we might need, but is also a 2000+ line file. I am guessing we do not want to bring in that much bloat into BpyTOP?

Yeah, would like to avoid it if you're up for the challenge?

Also, it looks like name parsing might be best accomplished by matching the pid found in /sys/class/drm/card0/subsystem_device to the relevant entry in the pci.ids files.

Oh, neat. That could possibly be a separate function in the GpuCollector class then, since it would only need to be run once and would get the name of either amd or nvidia gpus.

jorge-barreto commented 3 years ago

@aristocratos

Usage is just TimeIt.start ...

Thank you!

We could probably ... focus on stats that would exist for both ...

I agree that it'd be simple to focus on what exists for both. I do wonder: do we want to give consideration to other vendors? And if so, how much? It'd be interesting to find out just how universal the hwmon stuff is.

Yeah, would like to avoid it if you're up for the challenge?

Yeah, absolutely! I do work business hours during the week, but I don't think it should take me too long to get something basic out that we can start testing.

That could possibly be a separate function in the GpuCollector class ...

That makes sense. We could do something similar to how the CPU name is set now.

Anyway, I'll update this when I have a working demo.

aristocratos commented 3 years ago

I do wonder: do we want to give consideration to other vendors?

Getting some basic support for built-in graphics would be nice. Looking at /sys/class/drm/card0/ for my intel graphics I did find gt_act_freq_mhz and others that could yield some stats, but maybe not as much as we would want. So if you find anything useful that might work for built-in graphics (amd or intel), that would be great. My thinking is still that the gpu box will be toggleable from the options, so if someone doesn't think the stats provided are useful, they can just turn it off.
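
For what it's worth, reading gt_act_freq_mhz is the same one-line file read as the amdgpu values; a sketch (i915 only, what other stats are available is an open question):

from pathlib import Path
from typing import Optional

def intel_gpu_freq_mhz(card: int = 0) -> Optional[int]:
    """Current GPU frequency in MHz as exposed by the i915 driver, or None if the node is missing."""
    try:
        return int(Path(f"/sys/class/drm/card{card}/gt_act_freq_mhz").read_text().strip())
    except (OSError, ValueError):
        return None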

jorge-barreto commented 3 years ago

@aristocratos Ok, parsing that pci.ids file was not as bad as I thought it'd be. One concern is that I am unsure if OSX has the relevant file, and if so, where it might be found. I was not able to find conclusive evidence one way or another. As a side note, it turns out that lspci only messes up my particular model's name because the pci.ids file actually has it mislabeled (lol). I've submitted a change request to their db.

I've got most of the collection done for Linux systems. At this point, I'm not sure how much of that will be portable over to OSX and FreeBSD. Hopefully, most of it :grin: Theoretically, I could do some testing with FreeBSD, but have no access to a mac for testing.

I'll be ironing out the GpuCollector class over the next couple of days. I've tried my best to be agnostic towards how the systems are laid out, but I'm sure we'll have to make some changes. Afterwards I'll dig into how the graphs and other nice visuals are made. Here's what we've got so far:

[screenshot from 2020-09-14: initial GpuCollector output in bpytop]

aristocratos commented 3 years ago

@jorge-barreto

Here's what we got so far

Great work!

I'm not sure how much of that will be portable over to OSX and FreeBSD

OSX is a problem I'm gonna have to look into, hopefully something like @hacker1024 suggested for cpu temp in #119 could be used for GPU also.

jorge-barreto commented 3 years ago

I wasn't able to get a ton done this weekend. Mostly fiddled around with making some meters and representing the rest of the information I had been able to fetch last time.

[screenshot from 2020-09-20: GPU box with meters for the collected stats]

I'd love some ideas on how to populate the larger views.

[screenshot from 2020-09-20: the larger GPU box view]

aristocratos commented 3 years ago

@jorge-barreto Haven't gotten my new gpu yet, was naively thinking I would be able to get an rtx 3080 at launch :( So I haven't started coding anything yet.

Regarding design I'm thinking something along the lines of:

If any unused space is left in the small version, the gpu usage graph could be added below all values and adapt to the available height.

bjtho08 commented 3 years ago

Any updates on this request? I'm anxiously awaiting this feature addition :)

aristocratos commented 3 years ago

@bjtho08 Still waiting for an rtx 3080 I ordered on release day; looks like it's gonna be a while. And I don't wanna start coding it before I can get live data and figure out the best collection method for nvidia.

hacker1024 commented 3 years ago

As for the macOS side, I have exams coming up so that might be a while (unless someone else does it of course).

Slaviusz commented 3 years ago

One note, maybe: on a Linux kernel with the proprietary binary NVIDIA driver there's no /sys/class/drm/cardX/device/hwmon entry. NVIDIA provides their own utility called nvidia-smi, which I assume reads this through a proprietary ABI.

aristocratos commented 3 years ago

@Slaviusz Would have liked to avoid calling an external executable but suspected this was the case. Looking at tools like https://github.com/pmav99/nvsmi it shouldn't be too much trouble, just a bit worried about possible slowdown since I haven't had the chance to test the speeds of nvidia-smi yet.

Slaviusz commented 3 years ago

@aristocratos I did an strace run of nvidia-smi and it seems some information can also be retrieved without nvidia-smi. Again, on a system with the NVIDIA binary proprietary driver, the list of cards can be found under /proc/driver/nvidia/gpus/<PCI_SLOT> (in the form [domain:]bus:device.function). Basic card information can be retrieved here:

$ cat /proc/driver/nvidia/gpus/<PCI_SLOT>/information
Model:           GeForce GTX 1070 Ti
IRQ:             80
GPU UUID:        GPU-c3c9859d-836d-5ed5-f300-0406fbb1dd6e
Video BIOS:      86.04.85.00.a4
Bus Type:        PCIe
DMA Size:        47 bits
DMA Mask:        0x7fffffffffff
Bus Location:    0000:2d:00.0
Device Minor:    0
Blacklisted:     No

Other than that it is all a bunch of IOCTL calls to NVIDIA kernel driver.
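
For the static part at least, those files are trivial to parse; a sketch based on the sample output above (whether every driver version formats the fields identically is an assumption):

import glob

def nvidia_proc_info() -> list:
    """Parse each /proc/driver/nvidia/gpus/<PCI_SLOT>/information file into a dict of its key/value lines."""
    gpus = []
    for path in sorted(glob.glob("/proc/driver/nvidia/gpus/*/information")):
        info = {}
        with open(path) as f:
            for line in f:
                key, _, value = line.partition(":")
                if value:
                    info[key.strip()] = value.strip()
        gpus.append(info)
    return gpus

# e.g. nvidia_proc_info()[0]["Model"] -> "GeForce GTX 1070 Ti"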

aristocratos commented 3 years ago

@Slaviusz I think the only relevant info from there would be the model name so nvidia-smi it is.

Would you mind running time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem a couple of times and see what the results are?

tomekziel commented 3 years ago

Pick me! Pick me!

--format=csv was required as well, here is the output

user@machine:~$ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce RTX 2080 Ti, 20 %, 11019 MiB, 0 MiB, 0 %, 60, 33.09 W, 1350 MHz, 7000 MHz

real    0m0,641s
user    0m0,000s
sys     0m0,581s
user@machine:~$ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce RTX 2080 Ti, 28 %, 11019 MiB, 0 MiB, 0 %, 60, 33.30 W, 1350 MHz, 7000 MHz

real    0m0,644s
user    0m0,000s
sys     0m0,584s
user@machine:~$ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce RTX 2080 Ti, 32 %, 11019 MiB, 0 MiB, 0 %, 60, 33.08 W, 1350 MHz, 7000 MHz

real    0m0,705s
user    0m0,001s
sys     0m0,645s
user@machine:~$ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce RTX 2080 Ti, 34 %, 11019 MiB, 0 MiB, 0 %, 60, 32.85 W, 1350 MHz, 7000 MHz

real    0m0,638s
user    0m0,005s
sys     0m0,575s
user@machine:~$ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce RTX 2080 Ti, 19 %, 11019 MiB, 0 MiB, 0 %, 58, 32.04 W, 1350 MHz, 7000 MHz

real    0m0,648s
user    0m0,000s
sys     0m0,585s

aristocratos commented 3 years ago

@tomekziel You're the chosen one :)

real 0m0,641s

That's not looking very hopeful. Would you mind testing other output formats, and if that's no different, testing the different queries separately to see if any particular query is taking more time?

tomekziel commented 3 years ago

Whatever command involving communication with the (idle) GPU I try, it does not go below 0.5 s of sys time.

Driver and tool version: NVIDIA System Management Interface -- v450.80.02

Uname 5.4.0-48-generic #52-Ubuntu SMP Thu Sep 10 10:58:49 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

aristocratos commented 3 years ago

@tomekziel

Whatever command involving communication with (idle) GPU I try, does not go below 0.5 sys time.

That's too bad; it means I'm gonna have to run it in a separate thread, which means the stats might lag behind and not be true realtime updates. Gonna do some more extensive testing when I get my gpu and start coding the collection functions.

Thanks for the help!

tomekziel commented 3 years ago

I will be happy with 1 sec updates. Still using glances gpu, waiting for bpytop with GPU support 💕

derpeter commented 3 years ago

Regarding nvidia-smi: you might want to look at https://github.com/wookayin/gpustat/. It's a Python implementation that also uses nvidia-smi to show a top-like interface. There is also a comment that running nvidia-smi in daemon mode speeds up the query.

time gpustat
[0] GeForce 940MX | 42'C, 0 % | 3 / 2004 MB | root(3M)

real    0m0,076s
user    0m0,055s
sys     0m0,017s

I guess using the same method this tool uses to query the data, or importing the code into this project, would make sense. The gpustat call above is without an smi daemon.

aristocratos commented 3 years ago

@derpeter Looks like gpustat is using https://pypi.org/project/pynvml instead of actually calling nvidia-smi, which would explain why it's a lot faster.

Using pynvml could definitely be a solution, but it would put the responsibility on the user to make sure the pynvml module is installed.

Or maybe importing gpustat would be a better approach, since it already has perfectly fine working collection methods for the gpu.
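
For comparison, a minimal pynvml polling sketch, treating pynvml as an optional import (API names as in the pynvml package; memory comes back in bytes and power in milliwatts):

import pynvml  # provided by the pynvml / nvidia-ml-py package, would be an optional dependency

def nvml_sample() -> list:
    """Poll basic stats for every NVIDIA GPU through NVML instead of spawning nvidia-smi."""
    pynvml.nvmlInit()
    try:
        stats = []
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            stats.append({
                "name": pynvml.nvmlDeviceGetName(handle),  # bytes or str depending on the package version
                "gpu_util_pct": util.gpu,
                "mem_used_mib": mem.used // (1024 ** 2),
                "mem_total_mib": mem.total // (1024 ** 2),
                "temp_c": pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU),
                "power_w": pynvml.nvmlDeviceGetPowerUsage(handle) / 1000,  # NVML reports milliwatts
            })
        return stats
    finally:
        pynvml.nvmlShutdown()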

tom-doerr commented 3 years ago

I just ran the nvidia-smi commands @tomekziel ran and they are much faster for me:

~ $  time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce GTX 1080 Ti, 29 %, 11175 MiB, 2856 MiB, 36 %, 60, 106.56 W, 1936 MHz, 5508 MHz
GeForce GTX 1060 6GB, 29 %, 6078 MiB, 4 MiB, 0 %, 49, 5.68 W, 139 MHz, 405 MHz

real    0m0,022s
user    0m0,001s
sys 0m0,011s
~ $  time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce GTX 1080 Ti, 29 %, 11175 MiB, 2856 MiB, 37 %, 60, 109.20 W, 1936 MHz, 5508 MHz
GeForce GTX 1060 6GB, 29 %, 6078 MiB, 4 MiB, 0 %, 48, 5.68 W, 139 MHz, 405 MHz

real    0m0,022s
user    0m0,001s
sys 0m0,011s
~ $  time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce GTX 1080 Ti, 29 %, 11175 MiB, 2856 MiB, 36 %, 60, 114.37 W, 1936 MHz, 5508 MHz
GeForce GTX 1060 6GB, 29 %, 6078 MiB, 4 MiB, 0 %, 49, 5.40 W, 139 MHz, 405 MHz

real    0m0,023s
user    0m0,001s
sys 0m0,012s

NVIDIA-SMI 435.21
Driver Version: 435.21
CUDA Version: 10.1

derpeter commented 3 years ago

I can confirm that

time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]

real    0m0,027s
user    0m0,004s
sys     0m0,019s

aristocratos commented 3 years ago

Could this be because of active gpu usage vs idle gpu maybe?

aristocratos commented 3 years ago

Would be interesting to see what result @tomekziel gets from running time gpustat; if those results are above 500 ms it might be easier to just use nvidia-smi instead.

tomekziel commented 3 years ago

It looks like the nvidia-smi daemon is responsible for such a difference:

user@machine:~$ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce RTX 2080 Ti, 15 %, 11019 MiB, 0 MiB, 0 %, 43, 29.26 W, 1350 MHz, 7000 MHz

real    0m0,646s
user    0m0,004s
sys     0m0,578s
user@machine:~$ time gpustat
machine    Sun Nov 15 16:44:38 2020  450.80.02
[0] GeForce RTX 2080 Ti | 43'C,   0 % |     0 / 11019 MB |

real    0m0,743s
user    0m0,100s
sys     0m0,582s

but then

user@machine:~$ sudo nvidia-smi daemon
user@machine:~$

and voila:

user@machine:~$ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce RTX 2080 Ti, 29 %, 11019 MiB, 0 MiB, 0 %, 45, 56.54 W, 1350 MHz, 7000 MHz

real    0m0,014s
user    0m0,000s
sys     0m0,009s

user@machine:~$ time gpustat
machine    Sun Nov 15 16:44:56 2020  450.80.02
[0] GeForce RTX 2080 Ti | 45'C,   0 % |     0 / 11019 MB |

real    0m0,115s
user    0m0,090s
sys     0m0,020s

tom-doerr commented 3 years ago

Just tested it under 100% GPU load to see if that influences performance. I get similar times as before:

~ $  time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce GTX 1080 Ti, 37 %, 11175 MiB, 10356 MiB, 100 %, 67, 280.72 W, 1885 MHz, 5508 MHz
GeForce GTX 1060 6GB, 40 %, 6078 MiB, 5479 MiB, 100 %, 64, 121.51 W, 1746 MHz, 3802 MHz

real    0m0,026s
user    0m0,001s
sys 0m0,014s
~ $  time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce GTX 1080 Ti, 42 %, 11175 MiB, 10356 MiB, 100 %, 68, 286.99 W, 1885 MHz, 5508 MHz
GeForce GTX 1060 6GB, 45 %, 6078 MiB, 5479 MiB, 100 %, 69, 119.61 W, 1733 MHz, 3802 MHz

real    0m0,021s
user    0m0,001s
sys 0m0,010s
~ $  time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce GTX 1080 Ti, 42 %, 11175 MiB, 10356 MiB, 100 %, 68, 275.81 W, 1873 MHz, 5508 MHz
GeForce GTX 1060 6GB, 45 %, 6078 MiB, 5479 MiB, 100 %, 69, 121.15 W, 1733 MHz, 3802 MHz

real    0m0,025s
user    0m0,000s
sys 0m0,010s

aristocratos commented 3 years ago

Looks like calling nvidia-smi directly is actually faster than gpustat, regardless of whether the nvidia-smi daemon is running or not. Will have to put in a notice about starting the nvidia-smi daemon for better gpu stat responsiveness, but the only actual side effect of not having it started would just be the stats lagging behind by one update interval.
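
For reference, a sketch of what a threaded collector could call each interval, using the same query string tested above plus --format=csv,noheader,nounits to simplify parsing (just an illustration, not the final bpytop implementation):

import subprocess

QUERY = ("gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,"
         "temperature.gpu,power.draw,clocks.gr,clocks.mem")

def poll_nvidia_smi() -> list:
    """Run one nvidia-smi query and return a list of per-GPU dicts (one per CSV output line)."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True, timeout=2,
    ).stdout
    keys = QUERY.split(",")
    return [dict(zip(keys, (v.strip() for v in line.split(","))))
            for line in out.strip().splitlines()]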

maxitromer commented 3 years ago

Hi @aristocratos !

Thanks for the amazing work with bpytop (and with bashtop too)!

I'm a loyal user; your app is way better in every aspect, period.

Can I ask if this will work for multiple GPUs?

I'm an AMD user with 3 GPUs in a multi-monitor setup and I would love to see all the info in bpytop.

Thanks in advance for your work!

aristocratos commented 3 years ago

@maxitromer

Can I ask if this will work for multiple GPUs?

I don't know if there is a good way to fit stats for multiple gpus at the same time, but it could possibly be made switchable between gpus (kinda like how the network interface switching works).

maxitromer commented 3 years ago

@aristocratos I think a "switchable between gpus" option is perfect.

I'd love some ideas on how to populate the larger views.

[quoted screenshot: the larger GPU view]

Maybe for this view you could show a list with all the GPUs.

Also, in that image I'm missing the Memory Temperature and the Memory Voltage, both of which are needed when doing overclocking and undervolting.

I don't know if you could get and display that info.

drazil100 commented 3 years ago

Out of curiosity do you have a plan on how to support the gpu section in themes? I realize it's a bit early to ask this but I am curious.

aristocratos commented 3 years ago

@drazil100 It will get a dedicated section in the theme files like the other boxes and will default to the values of the cpu box and graphs if it's missing.

jiwidi commented 3 years ago

Hi! Did this gpu support end up being implemented?