tom-doerr opened this issue 4 years ago
Nvidia cards can probably be worked in since https://github.com/wookayin/gpustat could be used to poll data.
AMD is probably possible too with https://github.com/Ricks-Lab/amdgpu-utils but requires more user setup.
I haven't found any good candidate for integrated intel graphics, which I would like to have for proper GPU monitoring support.
However, I don't have either an nvidia or amd card to test with, so it's a bit tricky doing it blind.
I would love to see this feature added! I thought I'd contribute gpu-ls output for my AMD card:
Detected GPUs: AMD: 1
AMD: amdgpu version: 19.1.0-2
AMD: Wattman features enabled: 0xffffffff
Warning: Error reading parameter: mem_loading, disabling for this GPU: 0
1 total GPUs, 1 rw, 0 r-only, 0 w-only
Card Number: 0
Vendor: AMD
Readable: True
Writable: True
Compute: False
GPU UID: 02151af2e48a2184
Device ID: {'device': '0x687f', 'subsystem_device': '0x2388', 'subsystem_vendor': '0x148c', 'vendor': '0x1002'}
Decoded Device ID: Vega 10 XL/XT [Radeon RX Vega 56/64]
Card Model: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] (rev c3)
Display Card Model: Vega 10 XL/XT [Radeon RX Vega 56/64]
PCIe ID: 0a:00.0
Link Speed: 8.0 GT/s PCIe
Link Width: 16
##################################################
Driver: amdgpu
vBIOS Version: 111
Compute Platform: None
GPU Type: PStates
HWmon: /sys/class/drm/card0/device/hwmon/hwmon3
Card Path: /sys/class/drm/card0/device
System Card Path: /sys/devices/pci0000:00/0000:00:03.1/0000:08:00.0/0000:09:00.0/0000:0a:00.0
##################################################
Current Power (W): 38.000
Power Cap (W): 185.000
Power Cap Range (W): [0, 277]
Fan Enable: 0
Fan PWM Mode: [2, 'Dynamic']
Fan Target Speed (rpm): 630
Current Fan Speed (rpm): 630
Current Fan PWM (%): 15
Fan Speed Range (rpm): [900, 3800]
Fan PWM Range (%): [0, 100]
##################################################
Current GPU Loading (%): 87
Current Memory Loading (%): None
Current GTT Memory Usage (%): 8.045
Current GTT Memory Used (GB): 0.642
Total GTT Memory (GB): 7.984
Current VRAM Usage (%): 20.886
Current VRAM Used (GB): 1.668
Total VRAM (GB): 7.984
Current Temps (C): {'edge': 52.0, 'junction': 57.0, 'mem': 52.0}
Critical Temps (C): {'edge': 85.0, 'junction': 105.0, 'mem': 95.0}
Current Voltages (V): {'vddgfx': 856}
Vddc Range: ['800mV', '1200mV']
Current Clk Frequencies (MHz): {'mclk': 167.0, 'sclk': 1359.0}
Current SCLK P-State: [0, '852Mhz']
SCLK Range: ['852MHz', '2400MHz']
Current MCLK P-State: [0, '167Mhz']
MCLK Range: ['167MHz', '1500MHz']
Power Profile Mode: 0-BOOTUP_DEFAULT
Power DPM Force Performance Level: auto
If @tom-doerr can contribute outputs for NV cards, I think it is doable. I certainly wouldn't mind testing an experimental branch for rickslab-gpu-utils-based monitoring support, and even contributing a little. With regard to Intel iGPUs, I believe their users have less motive for active monitoring anyway, so it can be delayed for now.
The amdgpu driver provides a hwmon interface (see https://www.kernel.org/doc/html/latest/gpu/amdgpu.html#hwmon-interfaces) which is also available via the sensors command. The GPU utilization percentage can be read from /sys/class/drm/cardX/device/gpu_busy_percent.
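As a minimal sketch of reading that sysfs file (assumptions: the amdgpu driver is loaded and the card is card0; the function name is illustrative, not bpytop's):

```python
# Read GPU utilization from the amdgpu sysfs file mentioned above.
# Returns None when the file is missing (no amdgpu card) or unreadable.
from pathlib import Path

def gpu_busy_percent(card: str = "card0"):
    path = Path(f"/sys/class/drm/{card}/device/gpu_busy_percent")
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None

if __name__ == "__main__":
    print(gpu_busy_percent())
```

On a system without an amdgpu card this simply returns None, so a collector could use it to auto-detect support.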
That makes it a lot more viable, since https://github.com/Guillermo-C-A/Hwmon-python could be used to extract that information.
I will have an nvidia gpu to test with in about a month, but without an actual amd gpu it's unlikely bpytop will get amd support, unless somebody else who can also test it writes that implementation.
I have an amd gpu in my machine. If you wanted to play around with GPU monitoring in a feature branch or something I would be happy to try it on my system. Wouldn't be able to write anything for it unfortunately.
My gpu is a little old though (Radeon HD 7730M). Not sure if that would make any kind of difference whatsoever.
My biggest concern is where you would even PUT gpu monitoring. The screen space is already so full of content and adding more would have to push something else into a smaller box. Everything is already so perfectly spaced though xD
If you wanted to play around with GPU monitoring in a feature branch or something I would be happy to try it on my system.
I appreciate the offer, but it would be very slow and annoying to not have live data to adapt to when coding it.
My biggest concern is where you would even PUT gpu monitoring
The idea for the design would be to add a gpu box sharing a third of the cpu box space, with a similar design to the cpu box except showing vram usage, clocks and temperatures in the box where cores are shown for the cpu. Would also add a shortcut to toggle it easily.
hey @aristocratos, I have a Radeon RX 5xx, would love to see this feature in BpyTOP, and would love to take a hack at implementing the amdgpu side of things.
The idea for the design would be to add a gpu box sharing a third of the cpu box space
Have you been able to get very far with this? I'd like to not have to deal with design if I could help it 😅 I'm going to dive into the code soon, but would appreciate any pointers on where I should look
@schaerfo @jorge-barreto That would be great! I still haven't gotten my new gpu yet, so I haven't started writing it. Will probably be a couple of weeks till then.
But if you don't wanna have to deal with design I would recommend you wait till I have finished the nvidia portion and the draw functions (which should be agnostic to amd/nvidia and possibly intel cards).
If you want I can create a dummy collection function for amd when I create the nvidia function, then you could just fill in the missing pieces.
I'm not sure yet though what the best way to collect amd stats is. https://github.com/Guillermo-C-A/Hwmon-python is one way, but maybe it's possible to collect the same info from /sys or /proc if the amd drivers populate those?
That way we could avoid adding more dependencies. I'm gonna investigate what info can be collected through /sys and /proc on the nvidia side when I get the new card.
@aristocratos
But if you don't wanna have to deal with design I would recommend you wait till I have finished the nvidia portion and the draw functions (which should be agnostic to amd/nvidia and possibly intel cards).
That would make things a lot simpler :+1:
I'm not sure yet though what the best way to collect amd stats is. https://github.com/Guillermo-C-A/Hwmon-python is one way, but maybe it's possible to collect the same info from /sys or /proc if the amd drivers populate those?
All hwmon information can also simply be read from /sys/class/drm/card*/device/hwmon/hwmon*
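A sketch of globbing those hwmon files (assumption: temp*_input files report millidegrees Celsius, as in the standard hwmon sysfs interface; function name is illustrative):

```python
# Collect temperatures from /sys/class/drm/card*/device/hwmon/hwmon*.
# Returns an empty dict if no such files exist (no supported GPU).
from glob import glob
from pathlib import Path

def drm_hwmon_temps():
    temps = {}
    for f in glob("/sys/class/drm/card*/device/hwmon/hwmon*/temp*_input"):
        p = Path(f)
        try:
            # Use the matching temp*_label file when present (e.g. "edge", "mem")
            label_file = p.with_name(p.name.replace("_input", "_label"))
            label = label_file.read_text().strip() if label_file.exists() else p.name
            temps[label] = int(p.read_text().strip()) / 1000.0  # millideg C -> C
        except (OSError, ValueError):
            continue
    return temps

if __name__ == "__main__":
    print(drm_hwmon_temps())
```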
That's great to hear, I'm hoping this is the same for nvidia.
Anyways, I will update this issue when I start working on the gpu functions.
@aristocratos i've gone ahead and done a mockup of the GPU box placement. here's a picture:
i'll be working on the GpuCollector class this weekend. you can find my code here.
@jorge-barreto This placement would be problematic: the info box that pops up when pressing enter on a process takes up 7 lines in height, and with this placement, in a smaller terminal it wouldn't leave any room for the process list. I think the only possible placement is in a split (maybe a third) of the cpu box. But it will get crowded on smaller terminal sizes wherever it is placed.
i'll be working on the GpuCollector class this weekend. you can find my code here.
Great! I'm guessing the prioritized stats to gather would be gpu usage, temp, core clock, vram size, vram usage, vram clock and maybe gpu model name. Are there any other important ones you can think of?
Design-wise, I'm thinking of a clone of the cpu box, but with the previously mentioned stats in the smaller box instead of cores.
@aristocratos The picture above (and below) is in a 80x24 terminal, and I see what you mean about the process pop up. The design is ultimately up to you, but I worry about how much CpuBox we will have in the smaller terminal if we cut a third of its box's width. In the picture, you can see I lose some of my cores:
Perhaps we could consider hiding the GpuBox when the process detail view is in effect?
Either way I'm not super concerned about the design. Though, I may consider keeping a personal build with the GpuBox in my preferred location.
-- As far as stats go, I currently had this list:
We can deprioritize whatever of those as the need for space dictates. I'm not totally sure on how to get the VRAM clock, or exactly what it is.
The GPU model name is an interesting one:
For me, lspci -v | grep VGA reports Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X/590]. I guess we could truncate to Radeon RX? But given the amount of VGA compatible controllers out there, I'm not totally sure how much useful parsing we could do. What're your thoughts on this?
@jorge-barreto
The design is ultimately up to you, but I worry about how much CpuBox we will have in the smaller terminal if we cut a third of its box's width. In the picture, you can see I lose some of my cores:
You might have a good point there, I also realized the alignment of the "Menu", "mode:", clock and battery meter would be a nightmare to get in a good place when looking at the mock up.
Actually I think I could make it so it automatically moves between the two placements depending on the size of the terminal.
I'm pretty sure MCLK is the VRAM clock but in amd naming.
And I'm guessing SCLK ≈ core clock for amd?
I'm not quite sure what GTT Usage is though?
Any stat could be added and then simply be shown/hidden by sizing constraints.
The GPU model name is an interesting one
You could possibly look through projects like https://pypi.org/project/gpuinfo/ or https://github.com/Ricks-Lab/amdgpu-utils to see what methods they use for the model name. If they aren't using C code for stuff like that it could be a good reference.
Also, if you're going the route of reading the /sys subsystem manually to collect all the information, it could be good to compare the speeds with something like https://github.com/Guillermo-C-A/Hwmon-python (I haven't fully checked, but apparently they aren't using C functions). If the data collection does impact the resource use a lot it could be a good idea to think about whether there are any good pip-available libraries to use.
@aristocratos
So, it looks like radeontop reports MCLK as 'Memory Clock', and SCLK as 'Shader Clock'. I believe nvidia-smi reports SCLK as the 'Graphics (shader) Clock', judging by this man page. It also reports the 'Memory Clock' and the 'SM Clock', which might be an NVIDIA specific thing.
I'm not 100% on what GTT Usage is, either. Some searching suggests that GTT is the Graphics Translation Table, which "exposes a linear buffer from multiple RAM pages". I'm guessing that's not AMD specific? But you'll have to let me know what you find, because I can't really find any mentions of it in the context of NVIDIA.
I'll definitely look through those projects to see how it is they're grabbing the model name, and how they implement grabbing the actual info. I'll take a closer look at the TimeIt class to try and do some benchmarks, too. It's not immediately obvious to me how to use it, but I haven't really looked into it.
Update -- All of the linked projects simply use os.open, except for gpuinfo, which seems to directly use nvidia-smi. For what it's worth, I think nvidia-smi does ship with NVIDIA's Linux drivers, but the same is not clear for other platforms.
@aristocratos This GPUmodule seems to have accomplished most of what we might need, but is also a 2000+ line file. I am guessing we do not want to bring in that much bloat into BpyTOP?
Also, it looks like name parsing might be best accomplished by matching the pid found in /sys/class/drm/card0/subsystem_device to the relevant entry in the pci.ids files.
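A hedged sketch of looking a device up in the pci.ids database. The file format has unindented vendor lines ("1002  Advanced Micro Devices...") and tab-indented device lines; a tiny embedded snippet is used here so the example is self-contained (on a real system the file typically lives at /usr/share/hwdata/pci.ids or /usr/share/misc/pci.ids, depending on the distro):

```python
# Resolve a (vendor, device) PCI ID pair to a human-readable model name
# by scanning pci.ids-formatted data. SAMPLE_PCI_IDS is a trimmed-down
# illustrative snippet, not the real database.
SAMPLE_PCI_IDS = """\
1002  Advanced Micro Devices, Inc. [AMD/ATI]
\t687f  Vega 10 XL/XT [Radeon RX Vega 56/64]
10de  NVIDIA Corporation
\t1b06  GP102 [GeForce GTX 1080 Ti]
"""

def lookup_device(vendor: str, device: str, data: str = SAMPLE_PCI_IDS):
    current_vendor = None
    for line in data.splitlines():
        if not line or line.startswith("#"):
            continue
        if not line.startswith("\t"):
            # Vendor line: first 4 hex chars are the vendor ID
            current_vendor = line[:4].lower()
        elif not line.startswith("\t\t") and current_vendor == vendor.lower():
            # Device line under the matching vendor
            entry = line.strip()
            if entry[:4].lower() == device.lower():
                return entry[4:].strip()
    return None
```

For example, `lookup_device("1002", "687f")` resolves to the Vega 10 name from the snippet; the sysfs IDs would come from files like /sys/class/drm/card0/device/vendor and /sys/class/drm/card0/device/device (an assumption worth verifying against the thread's subsystem_device suggestion).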
@jorge-barreto
I'll take a closer look at the TimeIt class to try and do some benchmarks, too. It's not immediately obvious to me how to use it, but I haven't really looked into it.
Usage is just TimeIt.start("testing gpu collect") at the start, and then TimeIt.stop("testing gpu collect") will print 'testing gpu collect completed in X.XXXXX seconds' to the error.log. You need to have the loglevel set to debug in the config though.
I'm not 100% on what GTT Usage is, either. Some searching suggests that GTT is the Graphics Translation Table, which "exposes a linear buffer from multiple RAM pages". I'm guessing that's not AMD specific?
We could probably skip that for now then and focus on stats that would exist for both amd and nvidia, would also make the draw function a lot more straight forward. I'm guessing the top priority stats for most people would be gpu usage, temperature, fan speed, memory usage and core(shader)/memory clocks (for people overclocking).
Update -- All of the linked projects simply use os.open,
That's pretty telling that the direct approach shouldn't be an issue then.
For what it's worth, I think nvidia-smi does ship with NVIDIA's Linux drivers, but the same is not clear for other platforms.
I haven't really investigated if the nvidia drivers populate /sys or what other methods could be used, but I would really like to avoid having to run an external binary at every update.
This GPUmodule seems to have accomplished most of what we might need, but is also a 2000+ line file. I am guessing we do not want to bring in that much bloat into BpyTOP?
Yeah, would like to avoid it if you're up for the challenge?
Also, it looks like name parsing might be best accomplished by matching the pid found in /sys/class/drm/card0/subsystem_device to the relevant entry in the pci.ids files.
Oh, neat. That could possibly be a separate function in the GpuCollector class then, since it would only need to be run once and would get the name of either amd or nvidia gpus.
@aristocratos
Usage is just TimeIt.start ...
Thank you!
We could probably ... focus on stats that would exist for both ...
I agree that it'd be simple to focus on what exists for both. I do wonder: do we want to give consideration to other vendors? And if so, how much? It'd be interesting to find out just how universal the hwmon stuff is.
Yeah, would like to avoid it if you're up for the challenge?
Yeah, absolutely! I do work business hours during the week, but I don't think it should take me too long to get something basic out that we can start testing.
That could possibly be a separate function in the GpuCollector class ...
That makes sense. We could do something similar to how the CPU name is set now.
Anyway, I'll update this when I have a working demo.
I do wonder: do we want to give consideration to other vendors?
Getting some basic support for builtin graphics would be nice. Looking at /sys/class/drm/card0/ for my intel graphics I did find gt_act_freq_mhz and others that could yield some stats, but maybe not as much as we would want. So if you find anything useful that might work for built-in graphics (amd or intel) that would be great. My thinking is still that the gpu box will be toggleable from the options, so if someone doesn't think the stats provided are useful, they can just turn it off.
@aristocratos
Ok, parsing that pci.ids file was not as bad as I thought it'd be. One concern is that I am unsure if OSX has the relevant file, and if so, where it might be found. I was not able to find conclusive evidence one way or another. As a side note, it turns out that lspci only messes up my particular model's name because the pci.ids file actually has it mislabeled (lol). I've submitted a change request to their db.
I've got most of the collection done for Linux systems. At this point, I'm not sure how much of that will be portable over to OSX and FreeBSD. Hopefully, most of it :grin: Theoretically, I could do some testing with FreeBSD, but have no access to a mac for testing.
I'll be ironing out the GpuCollector class over the next couple days. I've tried my best to be agnostic towards how the systems are laid out, but I'm sure we'll have to make some changes. Afterwards I'll dig into how the graphs and other nice visuals are made. Here's what we got so far:
@jorge-barreto
Here's what we got so far
Great work!
I'm not sure how much of that will be portable over to OSX and FreeBSD
OSX is a problem I'm gonna have to look into, hopefully something like @hacker1024 suggested for cpu temp in #119 could be used for GPU also.
I wasn't able to get a ton done this weekend. Mostly fiddled around with making some meters and representing the rest of the information I had been able to fetch last time.
I'd love some ideas on how to populate the larger views.
@jorge-barreto Haven't gotten my new gpu yet, was naively thinking I would be able to get a rtx 3080 at launch :( So haven't started coding anything yet.
Regarding design I'm thinking something along the lines of:
proc
cpu
taking up around 33% total width. If any unused space is left in the small version, the gpu usage graph could be added below all values and adapt to the available height.
Any updates on this request? I'm anxiously awaiting this feature addition :)
@bjtho08 Still waiting for a rtx 3080 I ordered on release day, looks like it's gonna be a while. And don't wanna start coding it before I can get live data and figure out the best collection method for nvidia.
As for the macOS side, I have exams coming up so that might be a while (unless someone else does it of course).
One note maybe. On Linux with the proprietary binary NVIDIA driver there's no /sys/class/drm/cardX/device/hwmon entry.
Typically NVIDIA provides their own utility called nvidia-smi, which I assume reads this using a proprietary ABI.
@Slaviusz
Would have liked to avoid calling an external executable, but I suspected this was the case.
Looking at tools like https://github.com/pmav99/nvsmi it shouldn't be too much trouble, just a bit worried about possible slowdown since I haven't had the chance to test the speed of nvidia-smi yet.
@aristocratos
I did an strace run of nvidia-smi and it seems some information can also be retrieved without nvidia-smi.
Again on a system with NVIDIA binary proprietary driver the list of cards can be found here:
/proc/driver/nvidia/gpus/<PCI_SLOT>
(in the form of [domain:]bus:device.function)
Basic card information can be retrieved here:
$ cat /proc/driver/nvidia/gpus/<PCI_SLOT>/information
Model: GeForce GTX 1070 Ti
IRQ: 80
GPU UUID: GPU-c3c9859d-836d-5ed5-f300-0406fbb1dd6e
Video BIOS: 86.04.85.00.a4
Bus Type: PCIe
DMA Size: 47 bits
DMA Mask: 0x7fffffffffff
Bus Location: 0000:2d:00.0
Device Minor: 0
Blacklisted: No
Other than that it is all a bunch of IOCTL calls to NVIDIA kernel driver.
@Slaviusz
I think the only relevant info from there would be the model name, so nvidia-smi it is.
Would you mind running time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem a couple of times and see what the results are?
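For reference, running that query from Python and parsing its CSV output could look roughly like this (a sketch, assuming nvidia-smi is on PATH; it returns an empty list when the tool is missing or fails, so a collector can fall back gracefully):

```python
# Run the nvidia-smi query discussed above and parse its CSV output
# into a list of dicts, one per GPU.
import subprocess

QUERY = ("gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,"
         "temperature.gpu,power.draw,clocks.gr,clocks.mem")

def parse_smi_csv(text: str):
    lines = [l for l in text.strip().splitlines() if l]
    if len(lines) < 2:
        return []  # need a header row plus at least one GPU row
    keys = [k.strip() for k in lines[0].split(",")]
    return [dict(zip(keys, (v.strip() for v in row.split(","))))
            for row in lines[1:]]

def query_nvidia_smi():
    try:
        out = subprocess.run(
            ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv"],
            capture_output=True, text=True, timeout=2, check=True).stdout
    except (OSError, subprocess.SubprocessError):
        return []  # nvidia-smi not installed, timed out, or errored
    return parse_smi_csv(out)
```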
Pick me! Pick me!
--format=csv was required as well, here is the output:
user@machine:~$ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce RTX 2080 Ti, 20 %, 11019 MiB, 0 MiB, 0 %, 60, 33.09 W, 1350 MHz, 7000 MHz
real 0m0,641s
user 0m0,000s
sys 0m0,581s
user@machine:~$ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce RTX 2080 Ti, 28 %, 11019 MiB, 0 MiB, 0 %, 60, 33.30 W, 1350 MHz, 7000 MHz
real 0m0,644s
user 0m0,000s
sys 0m0,584s
user@machine:~$ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce RTX 2080 Ti, 32 %, 11019 MiB, 0 MiB, 0 %, 60, 33.08 W, 1350 MHz, 7000 MHz
real 0m0,705s
user 0m0,001s
sys 0m0,645s
user@machine:~$ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce RTX 2080 Ti, 34 %, 11019 MiB, 0 MiB, 0 %, 60, 32.85 W, 1350 MHz, 7000 MHz
real 0m0,638s
user 0m0,005s
sys 0m0,575s
user@machine:~$ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce RTX 2080 Ti, 19 %, 11019 MiB, 0 MiB, 0 %, 58, 32.04 W, 1350 MHz, 7000 MHz
real 0m0,648s
user 0m0,000s
sys 0m0,585s
@tomekziel You're the chosen one :)
real 0m0,641s
That's not looking really hopeful. Would you mind testing other output formats, and if that's no different, testing the different queries separately to see if there is any particular query that is taking more time?
Whatever command involving communication with the (idle) GPU I try, sys time does not go below 0.5s.
Driver and tool version: NVIDIA System Management Interface -- v450.80.02
Uname 5.4.0-48-generic #52-Ubuntu SMP Thu Sep 10 10:58:49 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
@tomekziel
Whatever command involving communication with (idle) GPU I try, does not go below 0.5 sys time.
That's too bad; it means I'm gonna have to run it in a separate thread, which means the stats might lag behind and not be true realtime updates. Gonna do some more extensive testing when I get my gpu and start coding the collection functions.
Thanks for the help!
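The "separate thread" approach mentioned above could be sketched like this (names are illustrative, not bpytop's actual classes): a background poller keeps the latest stats in a shared slot, so the draw loop never blocks on a slow collector and at worst shows values one interval old.

```python
# Background poller: runs a potentially slow collector (e.g. nvidia-smi)
# on its own thread and exposes the most recent result without blocking.
import threading, time

class BackgroundPoller:
    def __init__(self, collect, interval=1.0):
        self._collect = collect          # any callable returning current stats
        self._interval = interval
        self._latest = None
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            value = self._collect()      # slow call happens off the draw loop
            with self._lock:
                self._latest = value
            self._stop.wait(self._interval)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def latest(self):
        with self._lock:
            return self._latest          # None until the first poll completes
```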
I will be happy with 1 sec updates. Still using glances gpu, waiting for bpytop with GPU support 💕
Regarding the nvidia SMI: you might want to look at https://github.com/wookayin/gpustat/ . It's a python implementation that also uses nvidia-smi to show a top-like interface. There is also a comment that running nvidia-smi in daemon mode speeds up the query.
time gpustat
[0] GeForce 940MX | 42'C, 0 % | 3 / 2004 MB | root(3M)
real 0m0,076s
user 0m0,055s
sys 0m0,017s
i guess using the same way this tool queries the data, or importing the code into this project, would make sense. the gpustat call above is without an smi daemon.
@derpeter
Looks like gpustat is using https://pypi.org/project/pynvml instead of actually calling nvidia-smi, which would explain why it's a lot faster.
Using pynvml could definitely be a solution but would put the responsibility on the user to make sure the pynvml module is installed.
Or maybe importing gpustat would be a better approach when it already has perfectly fine working collection methods for the gpu.
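Treating pynvml as an optional dependency could look like this sketch: import it lazily and report None when it isn't installed, so the collector can fall back to nvidia-smi. The NVML calls used are the standard pynvml API (nvmlInit, nvmlDeviceGetHandleByIndex, nvmlDeviceGetUtilizationRates).

```python
# Query GPU utilization via pynvml when available; None otherwise.
def nvml_gpu_utilization():
    try:
        import pynvml
    except ImportError:
        return None  # module not installed; caller falls back to nvidia-smi
    try:
        pynvml.nvmlInit()
        handle = pynvml.nvmlDeviceGetHandleByIndex(0)
        util = pynvml.nvmlDeviceGetUtilizationRates(handle).gpu
        pynvml.nvmlShutdown()
        return util
    except pynvml.NVMLError:
        return None  # no NVIDIA driver or device present
```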
I just ran the nvidia-smi commands @tomekziel ran and they are much faster for me:
~ $ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce GTX 1080 Ti, 29 %, 11175 MiB, 2856 MiB, 36 %, 60, 106.56 W, 1936 MHz, 5508 MHz
GeForce GTX 1060 6GB, 29 %, 6078 MiB, 4 MiB, 0 %, 49, 5.68 W, 139 MHz, 405 MHz
real 0m0,022s
user 0m0,001s
sys 0m0,011s
~ $ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce GTX 1080 Ti, 29 %, 11175 MiB, 2856 MiB, 37 %, 60, 109.20 W, 1936 MHz, 5508 MHz
GeForce GTX 1060 6GB, 29 %, 6078 MiB, 4 MiB, 0 %, 48, 5.68 W, 139 MHz, 405 MHz
real 0m0,022s
user 0m0,001s
sys 0m0,011s
~ $ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce GTX 1080 Ti, 29 %, 11175 MiB, 2856 MiB, 36 %, 60, 114.37 W, 1936 MHz, 5508 MHz
GeForce GTX 1060 6GB, 29 %, 6078 MiB, 4 MiB, 0 %, 49, 5.40 W, 139 MHz, 405 MHz
real 0m0,023s
user 0m0,001s
sys 0m0,012s
NVIDIA-SMI 435.21
Driver Version: 435.21
CUDA Version: 10.1
I can confirm that
time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
real 0m0,027s
user 0m0,004s
sys 0m0,019s
Could this be because of active gpu usage vs idle gpu maybe?
Would be interesting to see what result @tomekziel gets from running time gpustat; if those results are above 500ms it might be easier to just use nvidia-smi instead.
It looks like nvidia-smi daemon is responsible for such a difference:
user@machine:~$ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce RTX 2080 Ti, 15 %, 11019 MiB, 0 MiB, 0 %, 43, 29.26 W, 1350 MHz, 7000 MHz
real 0m0,646s
user 0m0,004s
sys 0m0,578s
user@machine:~$ time gpustat
machine Sun Nov 15 16:44:38 2020 450.80.02
[0] GeForce RTX 2080 Ti | 43'C, 0 % | 0 / 11019 MB |
real 0m0,743s
user 0m0,100s
sys 0m0,582s
but then
user@machine:~$ sudo nvidia-smi daemon
user@machine:~$
and voila:
user@machine:~$ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce RTX 2080 Ti, 29 %, 11019 MiB, 0 MiB, 0 %, 45, 56.54 W, 1350 MHz, 7000 MHz
real 0m0,014s
user 0m0,000s
sys 0m0,009s
user@machine:~$ time gpustat
machine Sun Nov 15 16:44:56 2020 450.80.02
[0] GeForce RTX 2080 Ti | 45'C, 0 % | 0 / 11019 MB |
real 0m0,115s
user 0m0,090s
sys 0m0,020s
Just tested it under 100% GPU load to see if that influences performance. I get similar times as before:
~ $ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce GTX 1080 Ti, 37 %, 11175 MiB, 10356 MiB, 100 %, 67, 280.72 W, 1885 MHz, 5508 MHz
GeForce GTX 1060 6GB, 40 %, 6078 MiB, 5479 MiB, 100 %, 64, 121.51 W, 1746 MHz, 3802 MHz
real 0m0,026s
user 0m0,001s
sys 0m0,014s
~ $ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce GTX 1080 Ti, 42 %, 11175 MiB, 10356 MiB, 100 %, 68, 286.99 W, 1885 MHz, 5508 MHz
GeForce GTX 1060 6GB, 45 %, 6078 MiB, 5479 MiB, 100 %, 69, 119.61 W, 1733 MHz, 3802 MHz
real 0m0,021s
user 0m0,001s
sys 0m0,010s
~ $ time nvidia-smi --query-gpu=gpu_name,fan.speed,memory.total,memory.used,utilization.gpu,temperature.gpu,power.draw,clocks.gr,clocks.mem --format=csv
name, fan.speed [%], memory.total [MiB], memory.used [MiB], utilization.gpu [%], temperature.gpu, power.draw [W], clocks.current.graphics [MHz], clocks.current.memory [MHz]
GeForce GTX 1080 Ti, 42 %, 11175 MiB, 10356 MiB, 100 %, 68, 275.81 W, 1873 MHz, 5508 MHz
GeForce GTX 1060 6GB, 45 %, 6078 MiB, 5479 MiB, 100 %, 69, 121.15 W, 1733 MHz, 3802 MHz
real 0m0,025s
user 0m0,000s
sys 0m0,010s
Looks like calling nvidia-smi directly is actually faster than gpustat, regardless of whether nvidia-smi daemon is running or not.
Will have to put in a notice about starting nvidia-smi daemon for better gpu stat responsiveness, but the only actual side effect of not having started it would just be the stats lagging behind one update interval.
Hi @aristocratos !
Thanks for the amazing work with bpytop (and with bashtop too)!
I'm a loyal user, your app is way better in any aspect, period.
Can I ask if this will work for multiple GPUs?
I'm an AMD user with 3 GPUs in a multi monitor setup and I would love to see all the info in bpytop.
Thanks in advance for your work!
@maxitromer
Can I ask if this will work for multiple GPUs?
I don't know if there is a good way to fit stats for multiple gpus at the same time, but it could possibly be switchable between gpus (kinda like how the network interface switching works).
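A tiny sketch of that "switchable between gpus" idea, in the spirit of the network interface switch (class and method names are illustrative, not bpytop's): keep an index into the detected-GPU list and cycle it on a key press.

```python
# Cycle a display index over a list of detected GPUs.
class GpuSwitcher:
    def __init__(self, gpus):
        self.gpus = list(gpus)
        self.index = 0

    def current(self):
        # The GPU currently shown, or None if none were detected
        return self.gpus[self.index] if self.gpus else None

    def cycle(self, step=1):
        # Move forward (or backward with step=-1), wrapping around
        if self.gpus:
            self.index = (self.index + step) % len(self.gpus)
        return self.current()
```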
@aristocratos I think a "switchable between gpus" option is perfect.
I'd love some ideas on how to populate the larger views.
Maybe for this view you could show a list with all the GPUs.
Also, on that image I'm missing the Memory Temperature and the Memory Voltage, both required when you do overclocking and undervolting.
I don't know if you could get and display that info.
Out of curiosity do you have a plan on how to support the gpu section in themes? I realize it's a bit early to ask this but I am curious.
@drazil100 It will get a dedicated section in the theme files like the other boxes and will default to the values of the cpu box and graphs if it's missing.
Hi! Did this gpu support end up being implemented?
Is your feature request related to a problem? Please describe.
I'm always frustrated when I have to open another window to monitor the usage of my GPUs.
Describe the solution you'd like
I would love to see the GPU utilization and memory usage in bpytop when bpytop detects dedicated GPUs. Users that made the effort to install a dedicated GPU or bought a computer with a dedicated GPU likely care a lot about the GPU usage.
Describe alternatives you've considered
An alternative would be to open nvtop or run watch nvidia-smi.