Closed theofficialgman closed 2 years ago
Can you try the latest source code in the draft branch and share a screenshot if there is no error?
udev may be used for detecting device information but I do not know which details can be obtained on ARM systems.
Can you share the output of these commands?
udevadm info /dev/mmcblk0
udevadm info /sys/class/net/enx0050b62942ef
udevadm info /sys/class/net/usb0
udevadm info /sys/devices/card.0
udevadm info /sys/devices/card.0 --attribute-walk
I think everything works fine. do you still need the output from the commands?
btw, I really did prefer the deb releases of this software (since they fully handle the dependencies needed). especially since you now have to open the software once via the command line before the .desktop file is generated... I'd like to include this as part of pi-apps but that's a blocking issue. the .desktop file needs to be generated immediately on installation
udevadm info /dev/mmcblk0
P: /devices/sdhci-tegra.0/mmc_host/mmc0/mmc0:aaaa/block/mmcblk0
N: mmcblk0
S: disk/by-id/mmc-SN512_0xfff7bXXX
S: disk/by-path/platform-sdhci-tegra.0
E: DEVLINKS=/dev/disk/by-id/mmc-SN512_0xfff7bXXX /dev/disk/by-path/platform-sdhci-tegra.0
E: DEVNAME=/dev/mmcblk0
E: DEVPATH=/devices/sdhci-tegra.0/mmc_host/mmc0/mmc0:aaaa/block/mmcblk0
E: DEVTYPE=disk
E: ID_DRIVE_FLASH_SD=1
E: ID_DRIVE_MEDIA_FLASH_SD=1
E: ID_NAME=SN512
E: ID_PART_TABLE_TYPE=gpt
E: ID_PART_TABLE_UUID=109b0c2b-5858-9090-8081-82831011XXXX
E: ID_PATH=platform-sdhci-tegra.0
E: ID_PATH_TAG=platform-sdhci-tegra_0
E: ID_SERIAL=0xfff7b17b
E: MAJOR=179
E: MINOR=0
E: SUBSYSTEM=block
E: TAGS=:systemd:
E: USEC_INITIALIZED=6400810
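For reference, the `E:` property lines in `udevadm info` output like the above can be parsed generically; a minimal sketch (the `parse_udevadm_info` helper is hypothetical, not from the project, and the sample input is abbreviated from the output above):

```python
def parse_udevadm_info(text):
    """Parse `udevadm info` output into a dict of the E: properties."""
    props = {}
    for line in text.splitlines():
        # Property lines look like "E: KEY=VALUE"; other prefixes (P:, N:, S:) are skipped.
        if line.startswith("E: ") and "=" in line:
            key, _, value = line[3:].partition("=")
            props[key] = value
    return props

sample = """P: /devices/sdhci-tegra.0/mmc_host/mmc0/mmc0:aaaa/block/mmcblk0
E: DEVNAME=/dev/mmcblk0
E: ID_NAME=SN512
E: ID_DRIVE_FLASH_SD=1"""

info = parse_udevadm_info(sample)
print(info["ID_NAME"])  # SN512
```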
garrett@garrett-usb:~$ udevadm info /sys/class/net/enx0050b62942ef
P: /devices/70090000.xusb/usb1/1-1/1-1.3/1-1.3:1.0/net/enx0050b62942ef
E: DEVPATH=/devices/70090000.xusb/usb1/1-1/1-1.3/1-1.3:1.0/net/enx0050b62942ef
E: ID_BUS=usb
E: ID_MM_CANDIDATE=1
E: ID_MODEL=AX88179
E: ID_MODEL_ENC=AX88179
E: ID_MODEL_FROM_DATABASE=AX88179 Gigabit Ethernet
E: ID_MODEL_ID=1790
E: ID_NET_DRIVER=ax88179_178a
E: ID_NET_LINK_FILE=/lib/systemd/network/99-default.link
E: ID_NET_NAME_MAC=enx0050b62942ef
E: ID_OUI_FROM_DATABASE=GOOD WAY IND. CO., LTD.
E: ID_PATH=platform-70090000.xusb-usb-0:1.3:1.0
E: ID_PATH_TAG=platform-70090000_xusb-usb-0_1_3_1_0
E: ID_REVISION=0100
E: ID_SERIAL=ASIX_Elec._Corp._AX88179_000050B62942EF
E: ID_SERIAL_SHORT=000050B6294XXX
E: ID_TYPE=generic
E: ID_USB_CLASS_FROM_DATABASE=Vendor Specific Class
E: ID_USB_DRIVER=ax88179_178a
E: ID_USB_INTERFACES=:ffff00:
E: ID_USB_INTERFACE_NUM=00
E: ID_USB_SUBCLASS_FROM_DATABASE=Vendor Specific Subclass
E: ID_VENDOR=ASIX_Elec._Corp.
E: ID_VENDOR_ENC=ASIX\x20Elec.\x20Corp.
E: ID_VENDOR_FROM_DATABASE=ASIX Electronics Corp.
E: ID_VENDOR_ID=0b95
E: IFINDEX=18
E: INTERFACE=enx0050b62942ef
E: SUBSYSTEM=net
E: SYSTEMD_ALIAS=/sys/subsystem/net/devices/enx0050b62942ef /sys/subsystem/net/devices/enx0050b62942ef
E: TAGS=:systemd:
E: USEC_INITIALIZED=27614956460
garrett@garrett-usb:~$ udevadm info /sys/class/net/usb0
P: /devices/700d0000.xudc/gadget/net/usb0
E: DEVPATH=/devices/700d0000.xudc/gadget/net/usb0
E: DEVTYPE=gadget
E: ID_MM_CANDIDATE=1
E: ID_NET_DRIVER=g_ether
E: ID_NET_LINK_FILE=/lib/systemd/network/99-default.link
E: ID_PATH=platform-700d0000.xudc
E: ID_PATH_TAG=platform-700d0000_xudc
E: IFINDEX=8
E: INTERFACE=usb0
E: NM_UNMANAGED=1
E: SUBSYSTEM=net
E: SYSTEMD_ALIAS=/sys/subsystem/net/devices/usb0
E: TAGS=:systemd:
E: USEC_INITIALIZED=7531640
garrett@garrett-usb:~$ udevadm info /sys/devices/card.0
Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
garrett@garrett-usb:~$ udevadm info /sys/devices/card.0 --attribute-walk
Unknown device, absolute path in /dev/ or /sys expected.
I replaced a few numbers with XXX to hide any of my actual serial numbers/unique IDs
The vendor-model information bug for MMC devices is fixed.
The issue will be closed.
About deb packages: A debian branch is planned (in a week) for some application stores. But deb files will not be added; they can be generated by maintainers from the source code by using installation scripts. Is it useful for you? How does the application addition/update process of pi-apps work?
Update: pyudev will not be used for getting device vendor-model. The current method is faster.
Can you share the file list of this folder if you have an NVMe or M2 SSD on your computer (not the Switch)?:
/sys/class/block/[parent_disk_name]/device/device/
Note: There are two "device" folders.
Is there any modalias file?
> About deb packages: A debian branch is planned (in a week) for some application stores. But deb files will not be added; they can be generated by maintainers from the source code by using installation scripts. Is it useful for you? How does the application addition/update process of pi-apps work?
The implementation used for a lot of apps at pi-apps is automatic obtaining of .deb or .tar.gz files from the github releases of the software. We have an automated github action script that checks for new releases for our apps from their github repos; this creates a PR to pi-apps, which when merged goes out to users and prompts them to update their installed app. So this relies on a .deb already being available on the repo. This is the preferred method rather than just making an install script to build the application
it might be easier to just see via the original PR: https://github.com/Botspot/pi-apps/pull/1629/files. the version variable was automatically updated by github actions when a new release was made via github releases on this repo. now that no github releases are made, it's not as simple to update our script, which is how users get updates
> Can you share the file list of this folder if you have an NVMe or M2 SSD on your computer (not the Switch)?:
> /sys/class/block/[parent_disk_name]/device/device/
> Note: There are two "device" folders.
> Is there any modalias file?
ls -1 /sys/class/block/nvme0n1/device/device
aer_dev_correctable
aer_dev_fatal
aer_dev_nonfatal
ari_enabled
broken_parity_status
class
config
consistent_dma_mask_bits
current_link_speed
current_link_width
d3cold_allowed
device
dma_mask_bits
driver
driver_override
enable
firmware_node
irq
link
local_cpulist
local_cpus
max_link_speed
max_link_width
modalias
msi_bus
msi_irqs
numa_node
nvme
pools
power
power_state
remove
rescan
reset
reset_method
resource
resource0
revision
subsystem
subsystem_device
subsystem_vendor
uevent
vendor
cat /sys/class/block/nvme0n1/device/device/modalias
pci:v0000144Dd0000A808sv0000144Dsd0000A801bc01sc08i02
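For context, a PCI modalias string like the one above encodes the 16-bit vendor/device IDs that the udev hwdb lookup keys on. A minimal parsing sketch (`parse_pci_modalias` is a hypothetical helper, not the project's code):

```python
import re

def parse_pci_modalias(modalias):
    """Extract vendor/device/subsystem IDs from a PCI modalias string."""
    m = re.match(
        r"pci:v([0-9A-F]{8})d([0-9A-F]{8})sv([0-9A-F]{8})sd([0-9A-F]{8})",
        modalias,
    )
    if m is None:
        return None
    # The kernel zero-pads each ID to 8 hex digits; the usual 16-bit
    # PCI IDs are the low 4 digits.
    keys = ("vendor", "device", "subvendor", "subdevice")
    return {k: v[-4:].lower() for k, v in zip(keys, m.groups())}

ids = parse_pci_modalias("pci:v0000144Dd0000A808sv0000144Dsd0000A801bc01sc08i02")
print(ids)  # {'vendor': '144d', 'device': 'a808', 'subvendor': '144d', 'subdevice': 'a801'}
```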
I will try to prepare a deb package for the newer releases. There are other python packages which are packaged for debian-based systems. Their project structure (files/folders) is similar to the structure of this application. I will try to learn how this conversion (python package structure to deb package structure) is made for these deb packages.
I have updated the vendor-model detection code for all devices. It will be used in newer versions. It is more reliable, more devices are supported now, and maintenance is easier. Also a lot of ARM CPUs are supported, and the hwdata dependency is removed. After the code changes for ARM devices/USB network cards, some device model detection code did not work (for example: some devices on QEMU virtual machines). This is fixed.
I tested the code on several devices/virtual machines. Can you make a quick test to see if there are bugs for device vendors/models only (on the CPU, Disk, Network, GPU tabs)? https://github.com/hakandundar34coding/system-monitoring-center/archive/refs/heads/master.zip
Currently mesa-utils is the only dependency which is not a Python package. The other dependencies are already installed on many devices (such as dmidecode, iproute2, util-linux, etc.). This dependency may be removed if there is a way to get graphics card memory without it. Also GPU frequency, load, etc. will be added.
A separate issue may be used for these features if you want to provide additional information. You may be tired because of this bug/device information/feature traffic.
> I have updated the vendor-model detection code for all devices. It will be used in newer versions. It is more reliable, more devices are supported now, and maintenance is easier. Also a lot of ARM CPUs are supported, and the hwdata dependency is removed. After the code changes for ARM devices/USB network cards, some device model detection code did not work (for example: some devices on QEMU virtual machines). This is fixed.
my network card name and info no longer show up
Traceback (most recent call last):
File "/home/garrett/.local/lib/python3.6/site-packages/systemmonitoringcenter/src/Network.py", line 169, in network_initial_func
device_vendor_name, device_model_name, _, _ = Performance.performance_get_device_vendor_model_func(modalias_output)
File "/home/garrett/.local/lib/python3.6/site-packages/systemmonitoringcenter/src/Performance.py", line 396, in performance_get_device_vendor_model_func
with open(udev_hardware_database_dir + "20-usb-vendor-model.hwdb") as reader:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/lib/udev/hwdb.d/20-usb-vendor-model.hwdb'
> I tested the code on several devices/virtual machines. Can you make a quick test to see if there are bugs for device vendors/models only (on the CPU, Disk, Network, GPU tabs)? https://github.com/hakandundar34coding/system-monitoring-center/archive/refs/heads/master.zip
CPU, Disk, and GPU all still seem fine
this file exists but is located at a different directory on my system: /lib/udev/hwdb.d/20-usb-vendor-model.hwdb. all ubuntu and debian distros use this location. same goes for the pcie model file.
if I change the udev_hardware_database_dir variable location, all works fine. if your system has a symlink from /lib/udev/hwdb.d to /usr/lib/udev/hwdb.d/, I would just consider using the first one. or just check for both directories like you have done before
On Ubuntu 18 the directory is /lib/udev/hwdb.d/ but on several newer Linux distributions and Ubuntu 21 it is /usr/lib/udev/hwdb.d/.
The bug is fixed; the code is updated and tested on Ubuntu 18 on a virtual machine.
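The directory check could look like this minimal sketch (an illustration of the "check both directories" approach, not the project's actual code):

```python
import os

# Known udev hwdb locations: /usr/lib on usr-merged systems,
# /lib on older distributions such as Ubuntu 18.04.
HWDB_CANDIDATES = ["/usr/lib/udev/hwdb.d/", "/lib/udev/hwdb.d/"]

def find_hwdb_dir(candidates=HWDB_CANDIDATES):
    """Return the first existing hwdb directory, or None if none exists."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None
```

On a usr-merged system both candidates resolve to the same directory through the /lib symlink, so the order only matters on systems where exactly one of them exists.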
> On Ubuntu 18 the directory is /lib/udev/hwdb.d/ but on several newer Linux distributions and Ubuntu 21 it is /usr/lib/udev/hwdb.d/.
> The bug is fixed; the code is updated and tested on Ubuntu 18 on a virtual machine.
this is not correct. it is still /lib/udev/hwdb.d/ on newer ubuntu. what is different is on newer ubuntu there is an additional symlink from /lib to /usr/lib. you can see in the jammy files list for the udev package (where the 20-usb-vendor-model.hwdb file comes from) that the location is still /lib/udev/hwdb.d/20-usb-vendor-model.hwdb: https://packages.ubuntu.com/jammy/amd64/udev/filelist
I'm not aware of any distros (yet) that exclusively use /usr/lib/udev/hwdb.d/20-usb-vendor-model.hwdb without the /lib location.
your solution is fine though
> this is not correct.
Here is an explanation for the change and about the folder link for compatibility: https://www.freedesktop.org/wiki/Software/systemd/TheCaseForTheUsrMerge/
I tried to prepare a deb package but there are problems. Do you know how to prepare deb packages for Python packages?
> I tried to prepare a deb package but there are problems. Do you know how to prepare deb packages for Python packages?
Is there something wrong with the previous method you used for packaging the deb? I've never made one before, so no.
The project structure was changed for pip installation. There are some python packages which are packaged as deb packages by package maintainers. Some debian files are used for deb packaging without project structure modifications.
A separate branch may be prepared by using the old method, but code updates will have to be applied to this branch manually. This method may be used as a temporary solution.
A .deb package is prepared by using the source code on the deb_for_stores branch and it can be downloaded. There are minor modifications to the source code, the setup.py file and a number of files for deb packaging. The structure in this branch is very similar to the one which was used before the Python package structure.
A link for the releases page: https://github.com/hakandundar34coding/system-monitoring-center/releases
shouldn't you be adding back hwdata as a dependency as well for the debian package?
The same code and database are used for the deb package version. The code changes are for checking some files (shortcut and GUI image links) and starting the application. hwdata is not used for the Python package or deb package versions of the application.
I made code changes for the GPU tab (load/frequency, etc.). They are not uploaded yet. But nvidia-smi is not very fast for getting GPU information very frequently (<= 1 second).
nvidia-smi can be used on desktop systems without installing additional software if closed-source drivers are installed (tested on a 12-year-old GPU and a 1-2 year old GPU).
I do not know what the elapsed time for getting this information is on your N.Switch device.
#!/usr/bin/env python3
import subprocess
import time

start_time = time.time()
gpu_tool_command = ["nvidia-smi", "--query-gpu=gpu_name,gpu_bus_id,driver_version,utilization.gpu,utilization.memory,memory.total,memory.free,memory.used,temperature.gpu,clocks.current.graphics,clocks.max.graphics", "--format=csv"]
gpu_tool_output = subprocess.check_output(gpu_tool_command, shell=False).decode().strip().split("\n")
end_time = time.time()
print(end_time - start_time)
print(gpu_tool_output)
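If it helps, the CSV output from a `--format=csv` query like the one above (a header row plus one row per GPU) can be turned into dicts. A sketch with a made-up sample; the field values are illustrative, not from a real device:

```python
def parse_smi_csv(lines):
    """Parse `nvidia-smi --format=csv` output: header row + one row per GPU."""
    header = [h.strip() for h in lines[0].split(",")]
    return [
        dict(zip(header, (v.strip() for v in row.split(","))))
        for row in lines[1:]
    ]

sample = [
    "name, utilization.gpu [%], memory.total [MiB]",
    "GeForce GTX 1060, 37 %, 6144 MiB",
]
gpus = parse_smi_csv(sample)
print(gpus[0]["utilization.gpu [%]"])  # 37 %
```

Note the simple comma split assumes no field contains a comma, which holds for these query fields.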
GPU information can be obtained on systems with Intel and Nvidia GPUs. But I have no system with an AMD GPU to test the GPU information scripts.
Unfortunately nvidia-smi is only available on nvidia desktop gpus, not on the Nvidia Jetson/Tegra platforms (which the Nintendo Switch is one of).
I have a system with AMD and Intel graphics so I can test those for you.
There will be solutions for the N.Switch; nvidia-smi is used for desktop GPUs.
The GPU tab is redesigned. The mesa-utils dependency is removed and the glarea widget is removed (the code is not uploaded yet). There were some problems on some devices (for example some RB-Pi devices) when the GPU tab was opened.
Can you share the output of these commands on your system (AMD GPU)? You can change card0 to card1 if the AMD GPU has a different name.
grep . /sys/class/drm/card0/*
grep . /sys/class/drm/card0/device/*
You can also share the outputs of the same commands for the Intel GPU. The Intel GPU on my system is very old and very limited performance information is provided in the gpu folder.
my intel and amd gpus are on the same system. so intel is card0 and amd is card1
intel:
amd:
You shared a lot of useful information and feedback. You can send a pull request if you want to add your name to a contributors file.
There is not a big difference between the usage information of the Intel HD 3000 and your GPU.
A lot of information fields will be empty for Intel GPUs. There is a tool (intel_gpu_top) but root privileges are required to run it, and installation of this tool is required. It will not be used for getting information.
Support for AMD GPU load, frequency, etc. is added. The source code is not uploaded yet. There will be improvements for temperature and power information.
It will not be used, but are there any AMD GPU monitoring/information tools (like nvidia-smi) installed on your system by the driver? For example: rocm-smi.
Can you share the output of these commands? Currently some of them are not required but I do not want to disturb you again.
grep . /sys/class/drm/card0/metrics/*
grep . /sys/class/drm/card1/device/fw_version/*
grep . /sys/class/drm/card1/device/hwmon/*
and the sub-folders for temperature and power. For example: `grep . /sys/class/drm/card1/device/hwmon/hwmon1/*`. There are temp... files and power... files for these sensors.
---------------
I know you shared the file list of the gpu.0 folder for the N.Switch but this time the file contents will be obtained:
grep . /sys/devices/gpu.0/*
grep . /sys/devices/gpu.0/devfreq/57000000.gpu/*
> grep . /sys/class/drm/card0/metrics/*
> grep . /sys/class/drm/card1/device/fw_version/*
> grep . /sys/class/drm/card1/device/hwmon/*
/sys/class/drm is an empty folder, only containing one file: version
> grep . /sys/devices/gpu.0/*
> grep . /sys/devices/gpu.0/devfreq/57000000.gpu/*
grep . /sys/devices/gpu.0/*
/sys/devices/gpu.0/aelpg_enable:1
/sys/devices/gpu.0/aelpg_param:1000000 100 10000 2000 200
/sys/devices/gpu.0/allow_all:0
/sys/devices/gpu.0/blcg_enable:1
/sys/devices/gpu.0/comptag_mem_deduct:0
/sys/devices/gpu.0/counters:38523066 311638150
/sys/devices/gpu.0/counters_reset:38523066 311648246
/sys/devices/gpu.0/czf_bypass:0
grep: /sys/devices/gpu.0/devfreq: Is a directory
grep: /sys/devices/gpu.0/driver: Is a directory
/sys/devices/gpu.0/driver_override:(null)
/sys/devices/gpu.0/elcg_enable:1
/sys/devices/gpu.0/elpg_enable:1
/sys/devices/gpu.0/emc3d_ratio:750
/sys/devices/gpu.0/enable_3d_scaling:1
/sys/devices/gpu.0/fmax_at_vmin_safe:460800000
/sys/devices/gpu.0/force_idle:0
/sys/devices/gpu.0/freq_request:921600000
/sys/devices/gpu.0/gfxp_wfi_timeout_count:0
/sys/devices/gpu.0/gfxp_wfi_timeout_unit:sysclk
/sys/devices/gpu.0/gpu_powered_on:1
grep: /sys/devices/gpu.0/iommu_group: Is a directory
/sys/devices/gpu.0/is_railgated:no
/sys/devices/gpu.0/ldiv_slowdown_factor:0
/sys/devices/gpu.0/load:0
/sys/devices/gpu.0/max_timeslice_us:50000
/sys/devices/gpu.0/min_timeslice_us:1000
/sys/devices/gpu.0/modalias:of:NgpuT<NULL>Cnvidia,tegra210-gm20bCnvidia,gm20b
/sys/devices/gpu.0/mscg_enable:0
grep: /sys/devices/gpu.0/of_node: Is a directory
/sys/devices/gpu.0/pd_max_batches:0
grep: /sys/devices/gpu.0/power: Is a directory
/sys/devices/gpu.0/ptimer_ref_freq:31250000
/sys/devices/gpu.0/ptimer_scale_factor:1.644736
/sys/devices/gpu.0/ptimer_src_freq:19200000
/sys/devices/gpu.0/railgate_delay:500
/sys/devices/gpu.0/railgate_enable:1
/sys/devices/gpu.0/slcg_enable:1
grep: /sys/devices/gpu.0/subsystem: Is a directory
/sys/devices/gpu.0/tpc_fs_mask:0x3
/sys/devices/gpu.0/tpc_pg_mask:0
/sys/devices/gpu.0/uevent:DRIVER=gk20a
/sys/devices/gpu.0/uevent:OF_NAME=gpu
/sys/devices/gpu.0/uevent:OF_FULLNAME=/gpu
/sys/devices/gpu.0/uevent:OF_COMPATIBLE_0=nvidia,tegra210-gm20b
/sys/devices/gpu.0/uevent:OF_COMPATIBLE_1=nvidia,gm20b
/sys/devices/gpu.0/uevent:OF_COMPATIBLE_N=2
/sys/devices/gpu.0/uevent:MODALIAS=of:NgpuT<NULL>Cnvidia,tegra210-gm20bCnvidia,gm20b
/sys/devices/gpu.0/user:0
grep . /sys/devices/gpu.0/devfreq/57000000.gpu/*
/sys/devices/gpu.0/devfreq/57000000.gpu/available_frequencies:76800000 153600000 230400000 307200000 384000000 460800000 537600000 614400000 691200000 768000000 844800000 921600000
/sys/devices/gpu.0/devfreq/57000000.gpu/available_governors:wmark_active nvhost_podgov userspace simple_ondemand
/sys/devices/gpu.0/devfreq/57000000.gpu/cur_freq:76800000
grep: /sys/devices/gpu.0/devfreq/57000000.gpu/device: Is a directory
/sys/devices/gpu.0/devfreq/57000000.gpu/governor:nvhost_podgov
/sys/devices/gpu.0/devfreq/57000000.gpu/max_freq:768000000
/sys/devices/gpu.0/devfreq/57000000.gpu/min_freq:76800000
/sys/devices/gpu.0/devfreq/57000000.gpu/polling_interval:25
grep: /sys/devices/gpu.0/devfreq/57000000.gpu/power: Is a directory
grep: /sys/devices/gpu.0/devfreq/57000000.gpu/subsystem: Is a directory
/sys/devices/gpu.0/devfreq/57000000.gpu/target_freq:76800000
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat: From : To
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat: : 76800000 153600000 230400000 307200000 384000000 460800000 537600000 614400000 691200000 768000000 844800000 921600000 time(ms)
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:* 76800000: 0 38 0 0 101 0 0 0 0 0 0 0 323769
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat: 153600000: 70 0 47 0 0 3 0 0 0 0 0 0 143883
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat: 230400000: 50 28 0 23 0 0 1 0 0 0 0 0 167245
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat: 307200000: 17 0 11 0 3 0 0 0 0 0 0 0 79597
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat: 384000000: 1 54 44 5 0 3 0 0 0 0 0 0 14115
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat: 460800000: 1 0 0 3 2 0 0 0 0 0 0 0 7672
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat: 537600000: 0 0 0 0 1 0 0 0 0 0 0 0 58
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat: 614400000: 0 0 0 0 0 0 0 0 0 0 0 0 0
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat: 691200000: 0 0 0 0 0 0 0 0 0 0 0 0 0
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat: 768000000: 0 0 0 0 0 0 0 0 0 0 0 0 0
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat: 844800000: 0 0 0 0 0 0 0 0 0 0 0 0 0
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat: 921600000: 0 0 0 0 0 0 0 0 0 0 0 0 0
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:Total transition : 506
and before you ask, the power folder doesn't contain any useful info. there is no way to get power draw for this gpu.
and yes, load (/sys/devices/gpu.0/load) does scale from 0 to some value, I just am not doing anything right now. I'm not sure what the upper value is, it's higher than 256. it might be a fraction of the current emc3d_ratio
actually I think load is from 0 to 1000, 0 being 0%, 1000 being 100%. so it just gives a decimal place
I have updated the code for ARM and Tegra GPUs. GPU load will not be shown. A 0 value instead of 1-2% when idle is very interesting. It may be processed instead of being used directly. There is a command (sudo ~/tegrastats) and it may be used for checking against the value in the file by using watch -n 1 cat /sys/devices/gpu.0/load.
There are some comments about GPU information for Tegra devices.
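A minimal sketch of reading that sysfs file and scaling it to a percentage (assuming the 0-1000 range discussed above; the helper name and default path are illustrative):

```python
def read_tegra_gpu_load(path="/sys/devices/gpu.0/load"):
    """Read the Tegra GPU load file (an integer, 0-1000) and return percent."""
    with open(path) as f:
        # e.g. a file content of "872" corresponds to 87.2% load
        return int(f.read().strip()) / 10.0
```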
The first group of commands (in the drm folder) was for the AMD GPU. I remembered the content of the drm folder for the ARM device.
I updated the comment and used a dashed line to split it for AMD and Tegra GPUs.
I ask for command outputs very frequently. You can prefer not writing if it is tiring/boring for you. :) The sensor information for AMD GPUs will be fixed later. A contributor may share this information.
> I have updated the code for ARM and Tegra GPUs. GPU load will not be shown. A 0 value instead of 1-2% when idle is very interesting. It may be processed instead of being used directly. There is a command (sudo ~/tegrastats) and it may be used for checking against the value in the file by using watch -n 1 cat /sys/devices/gpu.0/load.
> There are some comments about GPU information for Tegra devices.
I'm already well aware of tegrastats and already cross-referenced the values in tegrastats output compared to gpu.0/load; they are the same (well, besides the 0 to 1000 scaling for load and 0% to 100% for tegrastats). GPU load of 0 is normal and expected because I was literally on a static image on the desktop; there is no load on the gpu in that scenario, its framebuffer is not changing, there are no animations, etc. For example, if tegrastats shows let's say 87% gpu load, the gpu.0/load value will be something like 872
> You can prefer not writing if it is tiring/boring for you. :)
nah don't worry, it's not tiring, I'm just very busy the next few weeks. I will get back to you later on the output of the drm folder of my amd gpu.
for amd gpu
garrett@garrett-desktop:~$ grep . /sys/class/drm/card1/metrics/*
grep: /sys/class/drm/card1/metrics/*: No such file or directory
garrett@garrett-desktop:~$ grep . /sys/class/drm/card1/device/fw_version/*
/sys/class/drm/card1/device/fw_version/asd_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/ce_fw_version:0x0000008c
/sys/class/drm/card1/device/fw_version/dmcu_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/mc_fw_version:0x03b4dc40
/sys/class/drm/card1/device/fw_version/mec2_fw_version:0x000002da
/sys/class/drm/card1/device/fw_version/mec_fw_version:0x000002da
/sys/class/drm/card1/device/fw_version/me_fw_version:0x000000a7
/sys/class/drm/card1/device/fw_version/pfp_fw_version:0x000000fe
/sys/class/drm/card1/device/fw_version/rlc_fw_version:0x0000011e
/sys/class/drm/card1/device/fw_version/rlc_srlc_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/rlc_srlg_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/rlc_srls_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/sdma2_fw_version:0x0000003a
/sys/class/drm/card1/device/fw_version/sdma_fw_version:0x0000003a
/sys/class/drm/card1/device/fw_version/smc_fw_version:0x00171100
/sys/class/drm/card1/device/fw_version/sos_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/ta_ras_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/ta_xgmi_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/uvd_fw_version:0x01821000
/sys/class/drm/card1/device/fw_version/vce_fw_version:0x351a0300
/sys/class/drm/card1/device/fw_version/vcn_fw_version:0x00000000
garrett@garrett-desktop:~$ grep . /sys/class/drm/card1/device/hwmon/*
grep: /sys/class/drm/card1/device/hwmon/hwmon2: Is a directory
garrett@garrett-desktop:~$ grep . /sys/class/drm/card1/device/hwmon/hwmon2/*
grep: /sys/class/drm/card1/device/hwmon/hwmon2/device: Is a directory
/sys/class/drm/card1/device/hwmon/hwmon2/fan1_enable:0
/sys/class/drm/card1/device/hwmon/hwmon2/fan1_input:881
/sys/class/drm/card1/device/hwmon/hwmon2/fan1_max:4500
/sys/class/drm/card1/device/hwmon/hwmon2/fan1_min:0
/sys/class/drm/card1/device/hwmon/hwmon2/fan1_target:881
/sys/class/drm/card1/device/hwmon/hwmon2/freq1_input:1233000000
/sys/class/drm/card1/device/hwmon/hwmon2/freq1_label:sclk
/sys/class/drm/card1/device/hwmon/hwmon2/freq2_input:300000000
/sys/class/drm/card1/device/hwmon/hwmon2/freq2_label:mclk
/sys/class/drm/card1/device/hwmon/hwmon2/in0_input:1018
/sys/class/drm/card1/device/hwmon/hwmon2/in0_label:vddgfx
/sys/class/drm/card1/device/hwmon/hwmon2/name:amdgpu
grep: /sys/class/drm/card1/device/hwmon/hwmon2/power: Is a directory
/sys/class/drm/card1/device/hwmon/hwmon2/power1_average:22143000
/sys/class/drm/card1/device/hwmon/hwmon2/power1_cap:120000000
/sys/class/drm/card1/device/hwmon/hwmon2/power1_cap_default:120000000
/sys/class/drm/card1/device/hwmon/hwmon2/power1_cap_max:120000000
/sys/class/drm/card1/device/hwmon/hwmon2/power1_cap_min:0
/sys/class/drm/card1/device/hwmon/hwmon2/power1_label:slowPPT
/sys/class/drm/card1/device/hwmon/hwmon2/pwm1:0
/sys/class/drm/card1/device/hwmon/hwmon2/pwm1_enable:2
/sys/class/drm/card1/device/hwmon/hwmon2/pwm1_max:255
/sys/class/drm/card1/device/hwmon/hwmon2/pwm1_min:0
grep: /sys/class/drm/card1/device/hwmon/hwmon2/subsystem: Is a directory
/sys/class/drm/card1/device/hwmon/hwmon2/temp1_crit:94000
/sys/class/drm/card1/device/hwmon/hwmon2/temp1_crit_hyst:-273150
/sys/class/drm/card1/device/hwmon/hwmon2/temp1_input:54000
/sys/class/drm/card1/device/hwmon/hwmon2/temp1_label:edge
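For reference, the hwmon files above use fixed units: temp* files are millidegrees Celsius, power* files are microwatts, in* files are millivolts, and freq* files are Hz. A small conversion sketch (the helper name is hypothetical, not from the project):

```python
def hwmon_to_human(name, raw):
    """Convert a raw hwmon integer to conventional units based on the file name."""
    raw = int(raw)
    if name.startswith("temp"):
        return raw / 1000.0         # millidegree C -> degree C
    if name.startswith("power"):
        return raw / 1_000_000.0    # microwatt -> watt
    if name.startswith("freq"):
        return raw / 1_000_000.0    # Hz -> MHz
    if name.startswith("in"):
        return raw / 1000.0         # millivolt -> volt
    return raw                      # fan, pwm, etc. are used as-is

print(hwmon_to_human("temp1_input", 54000))        # 54.0 (degrees C)
print(hwmon_to_human("power1_average", 22143000))  # 22.143 (watts)
```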
I will say, something seems to not be working correctly with gpu load in system monitoring center. I am loading the gpu and it does not report much, and it is prone to spikes.
compare that to radeontop where the load is constant
I looked at the code for radeontop (found on github) and it's not using anything I'm familiar with; it's definitely not using gpu_busy_percent to get the data, that is for certain
if I do a watch -n 0.1 cat /sys/class/drm/card1/device/gpu_busy_percent I see that the percentage changes rapidly between 0 and 100 many times per second. so maybe a rolling average of a few data points is all that is needed to get a better estimate for load
I also tried the 1.12 deb and I don't see any load or frequency output on the tegra.
ok I debugged the code and found an error in the vendor id code. if I add
print(self.device_vendor_id)
I see the output is Nvidia; it's not an id number. this means the later if does not get entered:
# If selected GPU vendor is NVIDIA and selected GPU is used on an ARM system.
if self.device_vendor_id == "v000010DE" and gpu_device_path.startswith("/sys/devices/") == True:
my modalias file has nvidia written directly in it (of:NgpuT<NULL>Cnvidia,tegra210-gm20bCnvidia,gm20b), so the vendor id returns Nvidia; you forgot to check for this when using it
It looks like the measurement for GPU load (in the file in the card folder) is performed over a very small time for the AMD GPU. Averaging is required. You also wrote this.
# If selected GPU vendor is NVIDIA and selected GPU is used on an ARM system.
if self.device_vendor_id in ["v000010DE", "Nvidia"] and gpu_device_path.startswith("/sys/devices/") == True:
this is enough to enter the if for both cases
however the gpu_device_path does not end with a / so there are a lot of errors. anyway, just fix this as well to look like this:
# Try to get GPU list from "/sys/devices/" folder which is used by some ARM systems with NVIDIA GPU.
for file in os.listdir("/sys/devices/"):
    if file.split(".")[0] == "gpu":
        self.gpu_list.append(file)
        self.gpu_device_path_list.append("/sys/devices/" + file + "/")
and then all is good
I'll just send the edited Gpu.py file just so it's all clear. There are some extra print statements for logging, so you will see that if you diff with your branch. Gpu.txt
Is average required for Tegra GPUs?
> Is average required for Tegra GPUs?
tegrastats uses the load number directly with no average. it won't hurt to have one, but that is up to you.
Are there any gpu monitoring tools installed with the GPU driver for AMD cards (on your system)?
no gpu monitoring tools come with the AMD mesa gpu driver. I installed radeontop separately (that's a project not affiliated with mesa). radeon-profile is also a separate project which I installed additionally.
I will update the code tomorrow. GPU load for AMD GPUs will be fixed. Also line numbers in the frequency information will be removed and GPU power information will be shown.
Can you run this Python file and compare the values with the values from radeontop? gpu_load_amd.txt
It reads the load file 50 times in a second and calculates the average of the values. You can increase the number of cycles (range(50)).
it's the same as how you have written it...
in my testing, on my system, the 50-read loop completes in 0.004618356 seconds, which is way too fast to get accurate readings.
i did some while true; do cat /sys/class/drm/card1/device/gpu_busy_percent; echo $EPOCHREALTIME; done and found that the values complete a cycle between 0 and 100 about once every 16 milliseconds... I should have expected this. the gpu load is in step with the frequency which the game runs at (60hz, which is 16 milliseconds). so if you sample the gpu load as much as you can over the whole 16 milliseconds, you will get a very accurate reading.
"a good enough solution" for this usecase is to simply sleep in between each reading and sample across the whole cycle period. gpu_load_amd.txt
if you want to be compatible with higher refresh rate games/higher frequency loading, you will need to lower the sleep time and increase the number of samples.
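The sampling idea can be sketched like this. It is a sketch under the assumptions above (a roughly 16 ms load cycle and the gpu_busy_percent path from this system); the function name and tuning values are illustrative:

```python
import time

def sample_gpu_load(path, samples=50, interval=0.001):
    """Average gpu_busy_percent over many sleep-spaced reads so the samples
    cover at least one full load cycle instead of a single instant."""
    total = 0
    for _ in range(samples):
        with open(path) as f:
            total += int(f.read().strip())
        time.sleep(interval)  # spread the samples across the ~16 ms cycle
    return total / samples

# e.g. sample_gpu_load("/sys/class/drm/card1/device/gpu_busy_percent")
```

With 50 samples at a 1 ms spacing the window is about 50 ms, i.e. roughly three 16 ms cycles; for higher-frequency loads the interval would be lowered and the sample count raised, as noted above.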
Several bugs have been fixed.
Can you share a screenshot of the GPU tab for the AMD GPU?
A new package (v1.12.1) will be installable today if there is no problem.
About GPU load: The number of cycles is set as 370, because there are some monitors which have a 360 Hz refresh rate. This is a quick fix. There may be a more detailed solution.
I do not know if the 60 Hz value which you wrote is the screen refresh rate or the refresh rate of the content (game, etc.). There may be a 165 Hz monitor but a game may have a 60 FPS limit. A game may run fullscreen or not. I do not know if this affects the load writing (gpu_busy_percent) frequency.
@hakandundar34coding actually gpu load, temperature, and refresh rate have disappeared on my amd card now
and yes, I have checked, the only difference is the sensor moved to hwmon4 from the hwmon2 last time
grep . /sys/class/drm/card1/device/hwmon/hwmon4/*
grep: /sys/class/drm/card1/device/hwmon/hwmon4/device: Is a directory
/sys/class/drm/card1/device/hwmon/hwmon4/fan1_enable:1
/sys/class/drm/card1/device/hwmon/hwmon4/fan1_input:748
/sys/class/drm/card1/device/hwmon/hwmon4/fan1_max:4500
/sys/class/drm/card1/device/hwmon/hwmon4/fan1_min:0
/sys/class/drm/card1/device/hwmon/hwmon4/fan1_target:748
/sys/class/drm/card1/device/hwmon/hwmon4/freq1_input:1233000000
/sys/class/drm/card1/device/hwmon/hwmon4/freq1_label:sclk
/sys/class/drm/card1/device/hwmon/hwmon4/freq2_input:300000000
/sys/class/drm/card1/device/hwmon/hwmon4/freq2_label:mclk
/sys/class/drm/card1/device/hwmon/hwmon4/in0_input:1018
/sys/class/drm/card1/device/hwmon/hwmon4/in0_label:vddgfx
/sys/class/drm/card1/device/hwmon/hwmon4/name:amdgpu
grep: /sys/class/drm/card1/device/hwmon/hwmon4/power: Is a directory
/sys/class/drm/card1/device/hwmon/hwmon4/power1_average:27068000
/sys/class/drm/card1/device/hwmon/hwmon4/power1_cap:120000000
/sys/class/drm/card1/device/hwmon/hwmon4/power1_cap_default:120000000
/sys/class/drm/card1/device/hwmon/hwmon4/power1_cap_max:120000000
/sys/class/drm/card1/device/hwmon/hwmon4/power1_cap_min:0
/sys/class/drm/card1/device/hwmon/hwmon4/power1_label:slowPPT
/sys/class/drm/card1/device/hwmon/hwmon4/pwm1:51
/sys/class/drm/card1/device/hwmon/hwmon4/pwm1_enable:1
/sys/class/drm/card1/device/hwmon/hwmon4/pwm1_max:255
/sys/class/drm/card1/device/hwmon/hwmon4/pwm1_min:0
grep: /sys/class/drm/card1/device/hwmon/hwmon4/subsystem: Is a directory
/sys/class/drm/card1/device/hwmon/hwmon4/temp1_crit:94000
/sys/class/drm/card1/device/hwmon/hwmon4/temp1_crit_hyst:-273150
/sys/class/drm/card1/device/hwmon/hwmon4/temp1_input:50000
/sys/class/drm/card1/device/hwmon/hwmon4/temp1_label:edge
also gpu_busy_percent is still there (/sys/class/drm/card1/device/gpu_busy_percent)... so I have no clue why your gui is not working anymore
I double checked, previous version 1.12.0 has working temperature, refresh rate, and gpu usage (except for not averaged)
actually refresh rate still works, it just doesn't work when using dual monitors in wayland
found the temperature bug: you recently added this line, it does not show if you have this
that else happens after a for loop... pretty sure that is not valid python syntax
I added some prints, and it's not starting the amd_gpu_load_func
try:
    self.amd_gpu_load_func()
    gpu_load = f'{(sum(self.amd_gpu_load_list) / len(self.amd_gpu_load_list)):.0f} %'
    print(gpu_load)
except Exception:
    gpu_load = "-"
    print(gpu_load)
"-" gets printed to the terminal each second... so the exception is being hit
alright, commented out the try, here is the bug
File "/usr/share/system-monitoring-center/src/Gpu.py", line 366, in gpu_load_memory_frequency_power_func
self.amd_gpu_load_func()
File "/usr/share/system-monitoring-center/src/Gpu.py", line 548, in amd_gpu_load_func
with open(gpu_device_path + "device/gpu_busy_percent") as reader:
NameError: name 'gpu_device_path' is not defined
you forgot to add
selected_gpu_number = self.selected_gpu_number
selected_gpu = self.gpu_list[selected_gpu_number]
gpu_device_path = self.gpu_device_path_list[selected_gpu_number]
gpu_device_sub_path = self.gpu_device_sub_path_list[selected_gpu_number]
to the amd_gpu_load_func
even with these changes... I now only get 0% load. if I add a print into that function, it does return non-zero gpu load values, so now the averaging part must be wrong
# Read file to get GPU load information. This information is calculated for a very small time (screen refresh rate or content (game, etc.) refresh rate?) and directly plotting this data gives spikes.
with open(gpu_device_path + "device/gpu_busy_percent") as reader:
    gpu_load = reader.read().strip()
print(gpu_load)
0
56
8
0
0
0
0
18
89
oh, lol easy fix, why are you dividing by 1000?
self.amd_gpu_load_list.append(float(gpu_load)/1000)
should just be
self.amd_gpu_load_list.append(float(gpu_load))
with all my above fixes implemented, now we have a working GUI
copying the relevant info over to this independent issue https://github.com/hakandundar34coding/system-monitoring-center/issues/40#issuecomment-1081259315:
this is a sandisk card, decoding the ID should come back to that.