Closed jerome3o closed 1 year ago
--showmemuse works by reading a sysfs file. See /sys/class/drm/card0/device/mem_busy_percent
(or card1, and so on) on your system with an AMD GPU :)
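For anyone who wants to look at that value directly, here is a rough Python sketch that reads the same sysfs file for every card it can find. The function name and the drm_root parameter are made up for illustration, and it assumes the file holds a single integer percentage.

```python
from pathlib import Path


def read_mem_busy_percent(drm_root="/sys/class/drm"):
    """Return {card_name: percent} for every card exposing mem_busy_percent.

    Hypothetical helper: drm_root is a parameter only so the function can be
    pointed at a fake directory tree when testing.
    """
    readings = {}
    for path in sorted(Path(drm_root).glob("card*/device/mem_busy_percent")):
        card = path.parts[-3]  # e.g. "card0"
        try:
            # Assumption: the file contains one integer followed by a newline.
            readings[card] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Skip cards whose file is unreadable or not a plain integer.
            continue
    return readings


if __name__ == "__main__":
    for card, percent in read_mem_busy_percent().items():
        print(f"{card}: {percent}%")
```

On a machine without an AMD GPU the glob matches nothing and the function returns an empty dict.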
I don't think we can/should change that functionality completely.
Can you use rocm-smi --json --showmeminfo vram
and calculate the usage % from there? That would be roughly equivalent to your PR.
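Something like the following Python sketch would do that calculation. The two JSON key names are an assumption about what rocm-smi prints for --showmeminfo vram and may differ between versions, so check them against your own output. The function names are hypothetical.

```python
import json
import subprocess

# Assumed key names in the rocm-smi JSON output; verify on your version.
TOTAL_KEY = "VRAM Total Memory (B)"
USED_KEY = "VRAM Total Used Memory (B)"


def vram_usage_percent(meminfo_json):
    """Map each card in the JSON text to used/total VRAM as a percentage."""
    data = json.loads(meminfo_json)
    usage = {}
    for card, fields in data.items():
        # Ignore entries that are not per-card dicts or lack either key.
        if not isinstance(fields, dict):
            continue
        if TOTAL_KEY not in fields or USED_KEY not in fields:
            continue
        total = int(fields[TOTAL_KEY])
        used = int(fields[USED_KEY])
        if total > 0:
            usage[card] = 100.0 * used / total
    return usage


def query_vram_usage():
    """Run rocm-smi and compute usage; requires rocm-smi on PATH."""
    result = subprocess.run(
        ["rocm-smi", "--json", "--showmeminfo", "vram"],
        check=True,
        capture_output=True,
        text=True,
    )
    return vram_usage_percent(result.stdout)
```

Keeping the parsing separate from the subprocess call means the percentage logic can be checked against a saved JSON sample without a GPU present.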
Ref:
https://github.com/RadeonOpenCompute/rocm_smi_lib/blob/f8882d74d8749e2ad788184d624167cc326d4c2c/src/rocm_smi.cc#LL2733C40-L2733C40
https://github.com/RadeonOpenCompute/rocm_smi_lib/blob/f8882d74d8749e2ad788184d624167cc326d4c2c/src/rocm_smi_device.cc#LL119C37-L119C37
p.s. the prometheus exporter is neat!
This is a fix for getting accurate VRAM readings when the --json flag is used.
Before the change, when running
rocm-smi --json --showmemuse
I get:
After this fix I get:
I have only tested locally with two RX 6800s and: AMD ROCm System Management Interface | ROCM-SMI version: 1.4.1 | Kernel version: 5.18.13
If you're interested, I needed this to set up a Prometheus exporter in Python: https://github.com/jerome3o/rocm-prom-metrics
Let me know if there is anything I can do to help get this merged :)