hakandundar34coding / system-monitoring-center

Multi-featured system monitor
GNU General Public License v3.0

Improvements for emmc/sd devices, GPU tab and .deb packaging #45

Closed theofficialgman closed 2 years ago

theofficialgman commented 2 years ago

copying the relevant info over to this independent issue https://github.com/hakandundar34coding/system-monitoring-center/issues/40#issuecomment-1081259315:

Vendor information can be obtained for MMC devices from some files. Can you share the output of this command: grep . /sys/class/block/mmcblk0/device/*

```
grep . /sys/class/block/mmcblk0/device/*
grep: /sys/class/block/mmcblk0/device/block: Is a directory
/sys/class/block/mmcblk0/device/cid:035344534e35313280fff7b17b015700
/sys/class/block/mmcblk0/device/csd:400e0032db79000ee5b77f800a404000
/sys/class/block/mmcblk0/device/date:07/2021
grep: /sys/class/block/mmcblk0/device/driver: Is a directory
/sys/class/block/mmcblk0/device/dsr:0x404
/sys/class/block/mmcblk0/device/erase_size:512
/sys/class/block/mmcblk0/device/error_stats:0
/sys/class/block/mmcblk0/device/fwrev:0x0
/sys/class/block/mmcblk0/device/hwrev:0x8
/sys/class/block/mmcblk0/device/ios_timing:timing spec: 6 (sd uhs SDR104)
/sys/class/block/mmcblk0/device/manfid:0x000003
/sys/class/block/mmcblk0/device/name:SN512
/sys/class/block/mmcblk0/device/ocr:0x00200000
/sys/class/block/mmcblk0/device/oemid:0x5344
grep: /sys/class/block/mmcblk0/device/power: Is a directory
/sys/class/block/mmcblk0/device/preferred_erase_size:4194304
/sys/class/block/mmcblk0/device/scr:0245848700000000
/sys/class/block/mmcblk0/device/serial:0xfff7b17b
/sys/class/block/mmcblk0/device/speed_class:4
/sys/class/block/mmcblk0/device/ssr:0000000008000000040090000f05391e000800000002fc0003000000000000000000000000000000000000000000000000000000000000000000000000000000
grep: /sys/class/block/mmcblk0/device/subsystem: Is a directory
/sys/class/block/mmcblk0/device/type:SD
/sys/class/block/mmcblk0/device/uevent:DRIVER=mmcblk
/sys/class/block/mmcblk0/device/uevent:MMC_TYPE=SD
/sys/class/block/mmcblk0/device/uevent:MMC_NAME=SN512
/sys/class/block/mmcblk0/device/uevent:MODALIAS=mmc:block
```

this is a SanDisk card, so decoding the ID should come back to that.
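To illustrate what "decoding the ID" could look like, here is a minimal Python sketch that unpacks the `cid` value from the listing. It assumes the SD card CID layout (1-byte manufacturer ID, 2-byte OEM ID, 5-byte product name, 1-byte revision, 4-byte serial, 12-bit manufacturing date, then the CRC byte); eMMC devices use a different layout. `decode_sd_cid` is a hypothetical helper, not code from this project.

```python
def decode_sd_cid(cid_hex):
    """Decode a 128-bit SD card CID given as 32 hex characters (assumed SD layout)."""
    raw = bytes.fromhex(cid_hex.strip())
    if len(raw) != 16:
        raise ValueError("expected 32 hex characters")
    # Manufacturing date: 4 reserved bits, 8-bit year offset from 2000, 4-bit month.
    mdt = ((raw[13] & 0x0F) << 8) | raw[14]
    return {
        "manfid": raw[0],
        "oemid": raw[1:3].decode("ascii", errors="replace"),
        "name": raw[3:8].decode("ascii", errors="replace"),
        "hwrev": raw[8] >> 4,
        "fwrev": raw[8] & 0x0F,
        "serial": int.from_bytes(raw[9:13], "big"),
        "year": 2000 + (mdt >> 4),
        "month": mdt & 0x0F,
    }


# CID value copied from the sysfs listing in this comment.
info = decode_sd_cid("035344534e35313280fff7b17b015700")
print(info)
```

The decoded fields agree with the separate `manfid`, `oemid`, `name`, `hwrev`, `fwrev`, `serial` and `date` files in the same listing. Manufacturer ID 0x03 is commonly listed as SanDisk, which matches the card described here; a vendor lookup would still need an ID-to-name table.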

hakandundar34coding commented 2 years ago

Can you try the latest source code in the draft branch and share a screenshot if there is no error?


udev may be used for detecting device information, but I do not know which details are available on ARM systems.

Can you share the output of these commands?

udevadm info /dev/mmcblk0
udevadm info /sys/class/net/enx0050b62942ef
udevadm info /sys/class/net/usb0
udevadm info /sys/devices/card.0
udevadm info /sys/devices/card.0 --attribute-walk

theofficialgman commented 2 years ago

Screenshot from 2022-04-04 16-54-05

I think everything works fine. do you still need the output from the commands?

btw, I really did prefer the deb releases of this software (since they fully handle the dependencies needed), especially since you now have to open the software once via the command line before the .desktop file is generated... I'd like to include this as part of pi-apps but that's a blocking issue. the .desktop file needs to be generated immediately on installation

theofficialgman commented 2 years ago

udevadm info /dev/mmcblk0
P: /devices/sdhci-tegra.0/mmc_host/mmc0/mmc0:aaaa/block/mmcblk0
N: mmcblk0
S: disk/by-id/mmc-SN512_0xfff7bXXX
S: disk/by-path/platform-sdhci-tegra.0
E: DEVLINKS=/dev/disk/by-id/mmc-SN512_0xfff7bXXX /dev/disk/by-path/platform-sdhci-tegra.0
E: DEVNAME=/dev/mmcblk0
E: DEVPATH=/devices/sdhci-tegra.0/mmc_host/mmc0/mmc0:aaaa/block/mmcblk0
E: DEVTYPE=disk
E: ID_DRIVE_FLASH_SD=1
E: ID_DRIVE_MEDIA_FLASH_SD=1
E: ID_NAME=SN512
E: ID_PART_TABLE_TYPE=gpt
E: ID_PART_TABLE_UUID=109b0c2b-5858-9090-8081-82831011XXXX
E: ID_PATH=platform-sdhci-tegra.0
E: ID_PATH_TAG=platform-sdhci-tegra_0
E: ID_SERIAL=0xfff7b17b
E: MAJOR=179
E: MINOR=0
E: SUBSYSTEM=block
E: TAGS=:systemd:
E: USEC_INITIALIZED=6400810

garrett@garrett-usb:~$ udevadm info /sys/class/net/enx0050b62942ef
P: /devices/70090000.xusb/usb1/1-1/1-1.3/1-1.3:1.0/net/enx0050b62942ef
E: DEVPATH=/devices/70090000.xusb/usb1/1-1/1-1.3/1-1.3:1.0/net/enx0050b62942ef
E: ID_BUS=usb
E: ID_MM_CANDIDATE=1
E: ID_MODEL=AX88179
E: ID_MODEL_ENC=AX88179
E: ID_MODEL_FROM_DATABASE=AX88179 Gigabit Ethernet
E: ID_MODEL_ID=1790
E: ID_NET_DRIVER=ax88179_178a
E: ID_NET_LINK_FILE=/lib/systemd/network/99-default.link
E: ID_NET_NAME_MAC=enx0050b62942ef
E: ID_OUI_FROM_DATABASE=GOOD WAY IND. CO., LTD.
E: ID_PATH=platform-70090000.xusb-usb-0:1.3:1.0
E: ID_PATH_TAG=platform-70090000_xusb-usb-0_1_3_1_0
E: ID_REVISION=0100
E: ID_SERIAL=ASIX_Elec._Corp._AX88179_000050B62942EF
E: ID_SERIAL_SHORT=000050B6294XXX
E: ID_TYPE=generic
E: ID_USB_CLASS_FROM_DATABASE=Vendor Specific Class
E: ID_USB_DRIVER=ax88179_178a
E: ID_USB_INTERFACES=:ffff00:
E: ID_USB_INTERFACE_NUM=00
E: ID_USB_SUBCLASS_FROM_DATABASE=Vendor Specific Subclass
E: ID_VENDOR=ASIX_Elec._Corp.
E: ID_VENDOR_ENC=ASIX\x20Elec.\x20Corp.
E: ID_VENDOR_FROM_DATABASE=ASIX Electronics Corp.
E: ID_VENDOR_ID=0b95
E: IFINDEX=18
E: INTERFACE=enx0050b62942ef
E: SUBSYSTEM=net
E: SYSTEMD_ALIAS=/sys/subsystem/net/devices/enx0050b62942ef /sys/subsystem/net/devices/enx0050b62942ef
E: TAGS=:systemd:
E: USEC_INITIALIZED=27614956460

garrett@garrett-usb:~$ udevadm info /sys/class/net/usb0
P: /devices/700d0000.xudc/gadget/net/usb0
E: DEVPATH=/devices/700d0000.xudc/gadget/net/usb0
E: DEVTYPE=gadget
E: ID_MM_CANDIDATE=1
E: ID_NET_DRIVER=g_ether
E: ID_NET_LINK_FILE=/lib/systemd/network/99-default.link
E: ID_PATH=platform-700d0000.xudc
E: ID_PATH_TAG=platform-700d0000_xudc
E: IFINDEX=8
E: INTERFACE=usb0
E: NM_UNMANAGED=1
E: SUBSYSTEM=net
E: SYSTEMD_ALIAS=/sys/subsystem/net/devices/usb0
E: TAGS=:systemd:
E: USEC_INITIALIZED=7531640

garrett@garrett-usb:~$ udevadm info /sys/devices/card.0
Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
garrett@garrett-usb:~$ udevadm info /sys/devices/card.0 --attribute-walk
Unknown device, absolute path in /dev/ or /sys expected.

I replaced a few numbers with XXX to hide any of my actual serial numbers/unique IDs
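For anyone who wants to read these properties from a script, here is a minimal sketch that turns the `E:` lines of `udevadm info` output into a dictionary. `parse_udev_properties` is a hypothetical helper name, and this is not how the application itself reads device data.

```python
def parse_udev_properties(text):
    """Collect KEY=VALUE pairs from the 'E:' lines of `udevadm info` output."""
    properties = {}
    for line in text.splitlines():
        if line.startswith("E: ") and "=" in line:
            # Split on the first '=' only, since values may contain '='.
            key, _, value = line[3:].partition("=")
            properties[key] = value
    return properties


# A few lines taken from the mmcblk0 output in this comment.
sample = """P: /devices/sdhci-tegra.0/mmc_host/mmc0/mmc0:aaaa/block/mmcblk0
N: mmcblk0
E: DEVNAME=/dev/mmcblk0
E: ID_NAME=SN512
E: ID_SERIAL=0xfff7b17b"""

props = parse_udev_properties(sample)
print(props["ID_NAME"], props["ID_SERIAL"])
```

On a live system the text could come from running `udevadm info /dev/mmcblk0` through `subprocess`, with the caveat seen here that some paths are simply unknown to udev.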

hakandundar34coding commented 2 years ago

Vendor-model information bug for MMC devices is fixed.

The issue will be closed.

About deb packages: A debian branch is planned (in a week) for some application stores, but deb files will not be added. Maintainers can generate them from the source code by using installation scripts. Is that useful for you? How does the application addition/update process of pi-apps work?

Update: pyudev will not be used for getting device vendor-model information. The current method is faster.

Can you share the file list of this folder if you have an NVMe or M.2 SSD in your computer (not the Switch)?: /sys/class/block/[parent_disk_name]/device/device/ Note: There are two device folders.

Is there any modalias file?

theofficialgman commented 2 years ago

About deb packages: A debian branch is planned (in a week) for some application stores, but deb files will not be added. Maintainers can generate them from the source code by using installation scripts. Is that useful for you? How does the application addition/update process of pi-apps work?

Many apps in pi-apps are installed by automatically fetching .deb or .tar.gz files from the software's GitHub releases. We have an automated GitHub Actions script that checks for new releases of our apps in their GitHub repos. It creates a PR to pi-apps which, when merged, goes out to users and prompts them to update their installed app. So this relies on a .deb already being available on the repo. This is preferred over just writing an install script that builds the application.

It might be easier to just look at the original PR (https://github.com/Botspot/pi-apps/pull/1629/files): the version variable was automatically updated by GitHub Actions whenever a new release was published via GitHub releases on this repo. Now that no GitHub releases are made, it's not as simple to update our script, which is how users get updates.
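As a rough illustration of why a published .deb matters for this kind of automation, the sketch below picks the first `.deb` asset out of a release object shaped like the GitHub REST API "latest release" response (`tag_name`, `assets[].name`, `assets[].browser_download_url`). It is not the pi-apps script; `find_deb_asset` and the sample data are made up for the example.

```python
def find_deb_asset(release):
    """Return (tag, download URL) of the first .deb asset, or None if there is none."""
    for asset in release.get("assets", []):
        if asset.get("name", "").endswith(".deb"):
            return release.get("tag_name"), asset.get("browser_download_url")
    return None


# Made-up release data shaped like the API response; not a real release.
example_release = {
    "tag_name": "v0.0.0",
    "assets": [
        {"name": "example.tar.gz",
         "browser_download_url": "https://example.invalid/example.tar.gz"},
        {"name": "example.deb",
         "browser_download_url": "https://example.invalid/example.deb"},
    ],
}

print(find_deb_asset(example_release))
```

When a release has no `.deb` asset the function returns `None`, which is the situation described here: the updater has nothing to point users at.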

theofficialgman commented 2 years ago

Can you share the file list of this folder if you have an NVMe or M.2 SSD in your computer (not the Switch)?: /sys/class/block/[parent_disk_name]/device/device/ Note: There are two device folders.

Is there any modalias file?

ls -1 /sys/class/block/nvme0n1/device/device
aer_dev_correctable
aer_dev_fatal
aer_dev_nonfatal
ari_enabled
broken_parity_status
class
config
consistent_dma_mask_bits
current_link_speed
current_link_width
d3cold_allowed
device
dma_mask_bits
driver
driver_override
enable
firmware_node
irq
link
local_cpulist
local_cpus
max_link_speed
max_link_width
modalias
msi_bus
msi_irqs
numa_node
nvme
pools
power
power_state
remove
rescan
reset
reset_method
resource
resource0
revision
subsystem
subsystem_device
subsystem_vendor
uevent
vendor
cat /sys/class/block/nvme0n1/device/device/modalias 
pci:v0000144Dd0000A808sv0000144Dsd0000A801bc01sc08i02
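The modalias string packs the PCI IDs into fixed-width uppercase hex fields (`v` vendor, `d` device, `sv`/`sd` subsystem vendor and device, `bc`/`sc`/`i` class, subclass and programming interface). Here is a minimal sketch of splitting it apart; `parse_pci_modalias` is a hypothetical helper and not the function used by the application.

```python
import re

# Field widths follow the format of PCI modalias strings as shown in this thread.
PCI_MODALIAS = re.compile(
    r"pci:v(?P<vendor>[0-9A-F]{8})d(?P<device>[0-9A-F]{8})"
    r"sv(?P<subvendor>[0-9A-F]{8})sd(?P<subdevice>[0-9A-F]{8})"
    r"bc(?P<base_class>[0-9A-F]{2})sc(?P<sub_class>[0-9A-F]{2})"
    r"i(?P<interface>[0-9A-F]{2})"
)


def parse_pci_modalias(modalias):
    """Return the numeric ID fields of a PCI modalias, or None if it does not match."""
    match = PCI_MODALIAS.fullmatch(modalias.strip())
    if match is None:
        return None
    return {name: int(value, 16) for name, value in match.groupdict().items()}


ids = parse_pci_modalias("pci:v0000144Dd0000A808sv0000144Dsd0000A801bc01sc08i02")
print(hex(ids["vendor"]), hex(ids["device"]))
```

The vendor and device numbers are what a lookup in a hardware database would be keyed on. Non-PCI strings such as the `mmc:block` modalias seen earlier return `None` and would need their own handling.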
hakandundar34coding commented 2 years ago

I will try to prepare a deb package for the newer releases. There are other Python packages that are packaged for Debian-based systems, and their project structure (files/folders) is similar to the structure of this application. I will try to learn how this conversion (Python package structure to deb package structure) is done for those deb packages.


I have updated the vendor-model detection code for all devices. It will be used in newer versions. It is more reliable, supports more devices, and is easier to maintain. A lot of ARM CPUs are also supported. The hwdata dependency is removed. After the code changes for ARM devices/USB network cards, some device model detection code did not work (for example, for some devices on QEMU virtual machines). This is fixed.

I tested the code on several devices/virtual machines. Can you run a quick test to check for bugs in device vendor/model information only (on the CPU, Disk, Network, and GPU tabs)? https://github.com/hakandundar34coding/system-monitoring-center/archive/refs/heads/master.zip


Currently mesa-utils is the only dependency that is not a Python package. The other dependencies are already installed on many devices (such as dmidecode, iproute2, util-linux, etc.). This dependency may be removed if there is a way to get graphics card memory without it. GPU frequency, load, etc. will also be added.

A separate issue may be used for these features if you want to provide additional information. You may be tired of all this bug/device information/feature traffic.

theofficialgman commented 2 years ago

I have updated the vendor-model detection code for all devices. It will be used in newer versions. It is more reliable, supports more devices, and is easier to maintain. A lot of ARM CPUs are also supported. The hwdata dependency is removed. After the code changes for ARM devices/USB network cards, some device model detection code did not work (for example, for some devices on QEMU virtual machines). This is fixed.

my network card name and info no longer show up

Traceback (most recent call last):
  File "/home/garrett/.local/lib/python3.6/site-packages/systemmonitoringcenter/src/Network.py", line 169, in network_initial_func
    device_vendor_name, device_model_name, _, _ = Performance.performance_get_device_vendor_model_func(modalias_output)
  File "/home/garrett/.local/lib/python3.6/site-packages/systemmonitoringcenter/src/Performance.py", line 396, in performance_get_device_vendor_model_func
    with open(udev_hardware_database_dir + "20-usb-vendor-model.hwdb") as reader:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/lib/udev/hwdb.d/20-usb-vendor-model.hwdb'

I tested the code on several devices/virtual machines. Can you run a quick test to check for bugs in device vendor/model information only (on the CPU, Disk, Network, and GPU tabs)? https://github.com/hakandundar34coding/system-monitoring-center/archive/refs/heads/master.zip

CPU, Disk, and GPU all still seem fine

theofficialgman commented 2 years ago

this file exists but is located in a different directory on my system: /lib/udev/hwdb.d/20-usb-vendor-model.hwdb. all ubuntu and debian distros use this location. same goes for the pcie model file

if I change the udev_hardware_database_dir variable location all works fine. if your system has a symlink from /lib/udev/hwdb.d to /usr/lib/udev/hwdb.d/ I would just consider using the first one. or just check for both directories like you have done before
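The "check for both directories" suggestion can be sketched as follows. This is only an illustration of the idea, not the fix that was committed; `find_hwdb_dir` is a hypothetical helper, and the two candidate paths are the ones named in this thread.

```python
import os


def find_hwdb_dir(candidates=("/usr/lib/udev/hwdb.d/", "/lib/udev/hwdb.d/")):
    """Return the first candidate directory that exists, or None if none do."""
    for directory in candidates:
        if os.path.isdir(directory):
            return directory
    return None


udev_hardware_database_dir = find_hwdb_dir()
print(udev_hardware_database_dir)
```

Because `os.path.isdir` follows symlinks, either order works on a system where one path links to the other. Returning `None` lets the caller fall back to showing no vendor/model name instead of raising `FileNotFoundError` as in the traceback.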

hakandundar34coding commented 2 years ago

On Ubuntu 18 the directory is /lib/udev/hwdb.d/, but on several newer Linux distributions and on Ubuntu 21 it is /usr/lib/udev/hwdb.d/.

The bug is fixed; the code is updated and was tested on Ubuntu 18 in a virtual machine.

theofficialgman commented 2 years ago

On Ubuntu 18 the directory is /lib/udev/hwdb.d/, but on several newer Linux distributions and on Ubuntu 21 it is /usr/lib/udev/hwdb.d/.

The bug is fixed; the code is updated and was tested on Ubuntu 18 in a virtual machine.

this is not correct. it is still /lib/udev/hwdb.d/ on newer ubuntu. what is different is that on newer ubuntu there is an additional symlink from /lib to /usr/lib. you can see in the jammy file list for the udev package (where the 20-usb-vendor-model.hwdb file comes from) that the location is still /lib/udev/hwdb.d/20-usb-vendor-model.hwdb: https://packages.ubuntu.com/jammy/amd64/udev/filelist

I'm not aware of any distros (yet) that exclusively use /usr/lib/udev/hwdb.d/20-usb-vendor-model.hwdb without the /lib location

your solution is fine though

hakandundar34coding commented 2 years ago

this is not correct.

Here is an explanation of the change and of the folder link kept for compatibility: https://www.freedesktop.org/wiki/Software/systemd/TheCaseForTheUsrMerge/


I tried to prepare a deb package but there are problems. Do you know how to prepare deb packages for Python packages?

theofficialgman commented 2 years ago

I tried to prepare a deb package but there are problems. Do you know how to prepare deb packages for Python packages?

Is there something wrong with the previous method you used for packaging the deb? I've never made one before so no

hakandundar34coding commented 2 years ago

The project structure was changed for pip installation. There are some Python packages that package maintainers ship as deb packages. They use some debian files for deb packaging without modifying the project structure.

A separate branch could be prepared using the old method, but code updates would have to be applied to this branch manually. This could be used as a temporary solution.

hakandundar34coding commented 2 years ago

A .deb package has been prepared from the source code on the deb_for_stores branch and it can be downloaded. There are minor modifications to the source code, the setup.py file, and the number of files for deb packaging. The structure in this branch is very similar to the one used before the switch to the Python package structure.

A link for the releases page: https://github.com/hakandundar34coding/system-monitoring-center/releases

theofficialgman commented 2 years ago

shouldn't you be adding back hwdata as a dependency as well for the debian package?

hakandundar34coding commented 2 years ago

The same code and database are used for the deb package version. The code changes are for checking some files (shortcut and GUI image links) and for starting the application. hwdata is not used by either the Python package or the deb package version of the application.

hakandundar34coding commented 2 years ago

I made code changes for the GPU tab (load, frequency, etc.). They are not uploaded yet. However, nvidia-smi is not very fast when GPU information is queried very frequently (every second or less).

nvidia-smi can be used on desktop systems without installing additional software if the closed-source drivers are installed (tested on a 12-year-old GPU and a 1-2-year-old GPU).

I do not know how long it takes to get this information on your N.Switch device.

#!/usr/bin/env python3

# Measure how long a single nvidia-smi query takes.
import subprocess
import time

start_time = time.time()
gpu_tool_command = ["nvidia-smi", "--query-gpu=gpu_name,gpu_bus_id,driver_version,utilization.gpu,utilization.memory,memory.total,memory.free,memory.used,temperature.gpu,clocks.current.graphics,clocks.max.graphics", "--format=csv"]
# check_output raises FileNotFoundError if nvidia-smi is not installed.
gpu_tool_output = subprocess.check_output(gpu_tool_command, shell=False).decode().strip().split("\n")
end_time = time.time()
print(end_time - start_time)  # elapsed time in seconds
print(gpu_tool_output)  # CSV header line followed by one line per GPU

GPU information can be obtained on systems with Intel and Nvidia GPUs, but I have no system with an AMD GPU to test the GPU information scripts.

theofficialgman commented 2 years ago

Unfortunately nvidia-smi is only available on nvidia desktop gpus, not on the Nvidia Jetson/tegra platforms (which the Nintendo switch is one of).

I have a system with AMD and Intel graphics so I can test those for you.

hakandundar34coding commented 2 years ago

There will be solutions for the N.Switch. nvidia-smi is used for desktop GPUs.

The GPU tab is redesigned. The mesa-utils dependency and the glarea widget are removed (the code is not uploaded yet). There were some problems on some devices (for example, some RB-Pi devices) when the GPU tab was opened.

Can you share the output of these commands on your system (AMD GPU)? You can change card0 to card1 if the AMD GPU has a different name.

grep . /sys/class/drm/card0/*
grep . /sys/class/drm/card0/device/*

You can also share the outputs of the same commands for the Intel GPU. Intel GPU on my system is very old and very limited performance information is provided in the gpu folder.

theofficialgman commented 2 years ago

my intel and amd gpu's are on the same system. so intel is card0 and amd is card1

intel:

```
grep . /sys/class/drm/card0/*
grep: /sys/class/drm/card0/card0-DP-3: Is a directory
grep: /sys/class/drm/card0/card0-DP-4: Is a directory
grep: /sys/class/drm/card0/card0-HDMI-A-1: Is a directory
grep: /sys/class/drm/card0/card0-HDMI-A-2: Is a directory
grep: /sys/class/drm/card0/card0-HDMI-A-3: Is a directory
/sys/class/drm/card0/dev:226:0
grep: /sys/class/drm/card0/device: Is a directory
grep: /sys/class/drm/card0/engine: Is a directory
grep: /sys/class/drm/card0/error: Permission denied
/sys/class/drm/card0/gt_act_freq_mhz:350
/sys/class/drm/card0/gt_boost_freq_mhz:1150
/sys/class/drm/card0/gt_cur_freq_mhz:350
/sys/class/drm/card0/gt_max_freq_mhz:1150
/sys/class/drm/card0/gt_min_freq_mhz:350
/sys/class/drm/card0/gt_RP0_freq_mhz:1150
/sys/class/drm/card0/gt_RP1_freq_mhz:350
/sys/class/drm/card0/gt_RPn_freq_mhz:350
grep: /sys/class/drm/card0/metrics: Is a directory
grep: /sys/class/drm/card0/power: Is a directory
grep: /sys/class/drm/card0/subsystem: Is a directory
/sys/class/drm/card0/uevent:MAJOR=226
/sys/class/drm/card0/uevent:MINOR=0
/sys/class/drm/card0/uevent:DEVNAME=dri/card0
/sys/class/drm/card0/uevent:DEVTYPE=drm_minor
user@user:~$ grep . /sys/class/drm/card0/device/*
/sys/class/drm/card0/device/ari_enabled:0
/sys/class/drm/card0/device/broken_parity_status:0
/sys/class/drm/card0/device/class:0x038000
grep: /sys/class/drm/card0/device/config: binary file matches
/sys/class/drm/card0/device/consistent_dma_mask_bits:39
grep: /sys/class/drm/card0/device/consumer:pci:0000:00:1f.3: Is a directory
/sys/class/drm/card0/device/current_link_speed:Unknown
/sys/class/drm/card0/device/current_link_width:0
/sys/class/drm/card0/device/d3cold_allowed:1
/sys/class/drm/card0/device/device:0x3e92
/sys/class/drm/card0/device/dma_mask_bits:39
grep: /sys/class/drm/card0/device/driver: Is a directory
/sys/class/drm/card0/device/driver_override:(null)
grep: /sys/class/drm/card0/device/drm: Is a directory
/sys/class/drm/card0/device/enable:1
grep: /sys/class/drm/card0/device/firmware_node: Is a directory
grep: /sys/class/drm/card0/device/graphics: Is a directory
grep: /sys/class/drm/card0/device/i2c-1: Is a directory
grep: /sys/class/drm/card0/device/i2c-2: Is a directory
grep: /sys/class/drm/card0/device/i2c-3: Is a directory
/sys/class/drm/card0/device/index:1
/sys/class/drm/card0/device/irq:136
/sys/class/drm/card0/device/label:Onboard - Video
grep: /sys/class/drm/card0/device/link: Is a directory
/sys/class/drm/card0/device/local_cpulist:0-5
/sys/class/drm/card0/device/local_cpus:3f
/sys/class/drm/card0/device/max_link_speed:Unknown
/sys/class/drm/card0/device/max_link_width:255
/sys/class/drm/card0/device/modalias:pci:v00008086d00003E92sv00001462sd00007B48bc03sc80i00
/sys/class/drm/card0/device/msi_bus:1
grep: /sys/class/drm/card0/device/msi_irqs: Is a directory
/sys/class/drm/card0/device/numa_node:-1
grep: /sys/class/drm/card0/device/power: Is a directory
/sys/class/drm/card0/device/power_state:D0
grep: /sys/class/drm/card0/device/remove: Permission denied
grep: /sys/class/drm/card0/device/rescan: Permission denied
grep: /sys/class/drm/card0/device/reset: Permission denied
/sys/class/drm/card0/device/reset_method:flr pm
/sys/class/drm/card0/device/resource:0x00000000de000000 0x00000000deffffff 0x0000000000140204
/sys/class/drm/card0/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card0/device/resource:0x00000000b0000000 0x00000000bfffffff 0x000000000014220c
/sys/class/drm/card0/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card0/device/resource:0x000000000000f000 0x000000000000f03f 0x0000000000040101
/sys/class/drm/card0/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card0/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card0/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card0/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card0/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card0/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card0/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card0/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
grep: /sys/class/drm/card0/device/resource0: Permission denied
grep: /sys/class/drm/card0/device/resource2: Permission denied
grep: /sys/class/drm/card0/device/resource2_wc: Permission denied
grep: /sys/class/drm/card0/device/resource4: Permission denied
/sys/class/drm/card0/device/revision:0x00
grep: /sys/class/drm/card0/device/subsystem: Is a directory
/sys/class/drm/card0/device/subsystem_device:0x7b48
/sys/class/drm/card0/device/subsystem_vendor:0x1462
/sys/class/drm/card0/device/uevent:DRIVER=i915
/sys/class/drm/card0/device/uevent:PCI_CLASS=38000
/sys/class/drm/card0/device/uevent:PCI_ID=8086:3E92
/sys/class/drm/card0/device/uevent:PCI_SUBSYS_ID=1462:7B48
/sys/class/drm/card0/device/uevent:PCI_SLOT_NAME=0000:00:02.0
/sys/class/drm/card0/device/uevent:MODALIAS=pci:v00008086d00003E92sv00001462sd00007B48bc03sc80i00
/sys/class/drm/card0/device/vendor:0x8086
```

amd:

```
grep . /sys/class/drm/card1/*
grep: /sys/class/drm/card1/card1-DP-1: Is a directory
grep: /sys/class/drm/card1/card1-DP-2: Is a directory
grep: /sys/class/drm/card1/card1-DP-5: Is a directory
grep: /sys/class/drm/card1/card1-DVI-D-1: Is a directory
grep: /sys/class/drm/card1/card1-HDMI-A-4: Is a directory
/sys/class/drm/card1/dev:226:1
grep: /sys/class/drm/card1/device: Is a directory
grep: /sys/class/drm/card1/power: Is a directory
grep: /sys/class/drm/card1/subsystem: Is a directory
/sys/class/drm/card1/uevent:MAJOR=226
/sys/class/drm/card1/uevent:MINOR=1
/sys/class/drm/card1/uevent:DEVNAME=dri/card1
/sys/class/drm/card1/uevent:DEVTYPE=drm_minor
user@user:~$ grep . /sys/class/drm/card1/device/*
/sys/class/drm/card1/device/aer_dev_correctable:RxErr 0
/sys/class/drm/card1/device/aer_dev_correctable:BadTLP 0
/sys/class/drm/card1/device/aer_dev_correctable:BadDLLP 0
/sys/class/drm/card1/device/aer_dev_correctable:Rollover 0
/sys/class/drm/card1/device/aer_dev_correctable:Timeout 0
/sys/class/drm/card1/device/aer_dev_correctable:NonFatalErr 0
/sys/class/drm/card1/device/aer_dev_correctable:CorrIntErr 0
/sys/class/drm/card1/device/aer_dev_correctable:HeaderOF 0
/sys/class/drm/card1/device/aer_dev_correctable:TOTAL_ERR_COR 0
/sys/class/drm/card1/device/aer_dev_fatal:Undefined 0
/sys/class/drm/card1/device/aer_dev_fatal:DLP 0
/sys/class/drm/card1/device/aer_dev_fatal:SDES 0
/sys/class/drm/card1/device/aer_dev_fatal:TLP 0
/sys/class/drm/card1/device/aer_dev_fatal:FCP 0
/sys/class/drm/card1/device/aer_dev_fatal:CmpltTO 0
/sys/class/drm/card1/device/aer_dev_fatal:CmpltAbrt 0
/sys/class/drm/card1/device/aer_dev_fatal:UnxCmplt 0
/sys/class/drm/card1/device/aer_dev_fatal:RxOF 0
/sys/class/drm/card1/device/aer_dev_fatal:MalfTLP 0
/sys/class/drm/card1/device/aer_dev_fatal:ECRC 0
/sys/class/drm/card1/device/aer_dev_fatal:UnsupReq 0
/sys/class/drm/card1/device/aer_dev_fatal:ACSViol 0
/sys/class/drm/card1/device/aer_dev_fatal:UncorrIntErr 0
/sys/class/drm/card1/device/aer_dev_fatal:BlockedTLP 0
/sys/class/drm/card1/device/aer_dev_fatal:AtomicOpBlocked 0
/sys/class/drm/card1/device/aer_dev_fatal:TLPBlockedErr 0
/sys/class/drm/card1/device/aer_dev_fatal:PoisonTLPBlocked 0
/sys/class/drm/card1/device/aer_dev_fatal:TOTAL_ERR_FATAL 0
/sys/class/drm/card1/device/aer_dev_nonfatal:Undefined 0
/sys/class/drm/card1/device/aer_dev_nonfatal:DLP 0
/sys/class/drm/card1/device/aer_dev_nonfatal:SDES 0
/sys/class/drm/card1/device/aer_dev_nonfatal:TLP 0
/sys/class/drm/card1/device/aer_dev_nonfatal:FCP 0
/sys/class/drm/card1/device/aer_dev_nonfatal:CmpltTO 0
/sys/class/drm/card1/device/aer_dev_nonfatal:CmpltAbrt 0
/sys/class/drm/card1/device/aer_dev_nonfatal:UnxCmplt 0
/sys/class/drm/card1/device/aer_dev_nonfatal:RxOF 0
/sys/class/drm/card1/device/aer_dev_nonfatal:MalfTLP 0
/sys/class/drm/card1/device/aer_dev_nonfatal:ECRC 0
/sys/class/drm/card1/device/aer_dev_nonfatal:UnsupReq 0
/sys/class/drm/card1/device/aer_dev_nonfatal:ACSViol 0
/sys/class/drm/card1/device/aer_dev_nonfatal:UncorrIntErr 0
/sys/class/drm/card1/device/aer_dev_nonfatal:BlockedTLP 0
/sys/class/drm/card1/device/aer_dev_nonfatal:AtomicOpBlocked 0
/sys/class/drm/card1/device/aer_dev_nonfatal:TLPBlockedErr 0
/sys/class/drm/card1/device/aer_dev_nonfatal:PoisonTLPBlocked 0
/sys/class/drm/card1/device/aer_dev_nonfatal:TOTAL_ERR_NONFATAL 0
/sys/class/drm/card1/device/ari_enabled:0
/sys/class/drm/card1/device/boot_vga:1
/sys/class/drm/card1/device/broken_parity_status:0
/sys/class/drm/card1/device/class:0x030000
grep: /sys/class/drm/card1/device/config: binary file matches
/sys/class/drm/card1/device/consistent_dma_mask_bits:40
grep: /sys/class/drm/card1/device/consumer:pci:0000:01:00.1: Is a directory
/sys/class/drm/card1/device/current_link_speed:8.0 GT/s PCIe
/sys/class/drm/card1/device/current_link_width:16
/sys/class/drm/card1/device/d3cold_allowed:1
/sys/class/drm/card1/device/device:0x67df
/sys/class/drm/card1/device/dma_mask_bits:40
grep: /sys/class/drm/card1/device/driver: Is a directory
/sys/class/drm/card1/device/driver_override:(null)
grep: /sys/class/drm/card1/device/drm: Is a directory
/sys/class/drm/card1/device/enable:1
grep: /sys/class/drm/card1/device/firmware_node: Is a directory
grep: /sys/class/drm/card1/device/fw_version: Is a directory
/sys/class/drm/card1/device/gpu_busy_percent:0
grep: /sys/class/drm/card1/device/graphics: Is a directory
grep: /sys/class/drm/card1/device/hwmon: Is a directory
grep: /sys/class/drm/card1/device/i2c-10: Is a directory
grep: /sys/class/drm/card1/device/i2c-4: Is a directory
grep: /sys/class/drm/card1/device/i2c-5: Is a directory
grep: /sys/class/drm/card1/device/i2c-8: Is a directory
grep: /sys/class/drm/card1/device/i2c-9: Is a directory
/sys/class/drm/card1/device/irq:139
grep: /sys/class/drm/card1/device/link: Is a directory
/sys/class/drm/card1/device/local_cpulist:0-5
/sys/class/drm/card1/device/local_cpus:3f
/sys/class/drm/card1/device/max_link_speed:8.0 GT/s PCIe
/sys/class/drm/card1/device/max_link_width:16
/sys/class/drm/card1/device/mem_busy_percent:6
/sys/class/drm/card1/device/mem_info_gtt_total:8589934592
/sys/class/drm/card1/device/mem_info_gtt_used:112996352
/sys/class/drm/card1/device/mem_info_preempt_used:0
/sys/class/drm/card1/device/mem_info_vis_vram_total:268435456
/sys/class/drm/card1/device/mem_info_vis_vram_used:61882368
/sys/class/drm/card1/device/mem_info_vram_total:8589934592
/sys/class/drm/card1/device/mem_info_vram_used:1069596672
/sys/class/drm/card1/device/mem_info_vram_vendor:unknown
/sys/class/drm/card1/device/modalias:pci:v00001002d000067DFsv00001462sd0000341Bbc03sc00i00
/sys/class/drm/card1/device/msi_bus:1
grep: /sys/class/drm/card1/device/msi_irqs: Is a directory
/sys/class/drm/card1/device/numa_node:-1
/sys/class/drm/card1/device/pcie_bw:603 61 256
/sys/class/drm/card1/device/pcie_replay_count:0
grep: /sys/class/drm/card1/device/power: Is a directory
/sys/class/drm/card1/device/power_dpm_force_performance_level:auto
/sys/class/drm/card1/device/power_dpm_state:performance
/sys/class/drm/card1/device/power_state:D0
/sys/class/drm/card1/device/pp_cur_state:1
/sys/class/drm/card1/device/pp_dpm_mclk:0: 300Mhz *
/sys/class/drm/card1/device/pp_dpm_mclk:1: 1000Mhz
/sys/class/drm/card1/device/pp_dpm_mclk:2: 1750Mhz
/sys/class/drm/card1/device/pp_dpm_pcie:0: 2.5GT/s, x8
/sys/class/drm/card1/device/pp_dpm_pcie:1: 8.0GT/s, x16 *
/sys/class/drm/card1/device/pp_dpm_sclk:0: 300Mhz
/sys/class/drm/card1/device/pp_dpm_sclk:1: 588Mhz
/sys/class/drm/card1/device/pp_dpm_sclk:2: 976Mhz
/sys/class/drm/card1/device/pp_dpm_sclk:3: 1065Mhz
/sys/class/drm/card1/device/pp_dpm_sclk:4: 1130Mhz
/sys/class/drm/card1/device/pp_dpm_sclk:5: 1192Mhz
/sys/class/drm/card1/device/pp_dpm_sclk:6: 1233Mhz *
/sys/class/drm/card1/device/pp_dpm_sclk:7: 1268Mhz
/sys/class/drm/card1/device/pp_mclk_od:0
/sys/class/drm/card1/device/pp_num_states:states: 2
/sys/class/drm/card1/device/pp_num_states:0 boot
/sys/class/drm/card1/device/pp_num_states:1 performance
/sys/class/drm/card1/device/pp_power_profile_mode:NUM MODE_NAME SCLK_UP_HYST SCLK_DOWN_HYST SCLK_ACTIVE_LEVEL MCLK_UP_HYST MCLK_DOWN_HYST MCLK_ACTIVE_LEVEL
/sys/class/drm/card1/device/pp_power_profile_mode: 0 BOOTUP_DEFAULT: - - - - - -
/sys/class/drm/card1/device/pp_power_profile_mode: 1 3D_FULL_SCREEN *: 0 100 30 10 60 25
/sys/class/drm/card1/device/pp_power_profile_mode: 2 POWER_SAVING: 10 0 30 - - -
/sys/class/drm/card1/device/pp_power_profile_mode: 3 VIDEO: - - - 10 16 31
/sys/class/drm/card1/device/pp_power_profile_mode: 4 VR: 0 11 50 0 100 10
/sys/class/drm/card1/device/pp_power_profile_mode: 5 COMPUTE: 0 5 30 - - -
/sys/class/drm/card1/device/pp_power_profile_mode: 6 CUSTOM: - - - - - -
/sys/class/drm/card1/device/pp_sclk_od:0
grep: /sys/class/drm/card1/device/pp_table: binary file matches
grep: /sys/class/drm/card1/device/remove: Permission denied
grep: /sys/class/drm/card1/device/rescan: Permission denied
grep: /sys/class/drm/card1/device/reset: Permission denied
/sys/class/drm/card1/device/reset_method:bus
/sys/class/drm/card1/device/resource:0x00000000c0000000 0x00000000cfffffff 0x000000000014220c
/sys/class/drm/card1/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card1/device/resource:0x00000000d0000000 0x00000000d01fffff 0x000000000014220c
/sys/class/drm/card1/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card1/device/resource:0x000000000000e000 0x000000000000e0ff 0x0000000000040101
/sys/class/drm/card1/device/resource:0x00000000df300000 0x00000000df33ffff 0x0000000000040200
/sys/class/drm/card1/device/resource:0x00000000000c0000 0x00000000000dffff 0x0000000000000212
/sys/class/drm/card1/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card1/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card1/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card1/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card1/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
/sys/class/drm/card1/device/resource:0x0000000000000000 0x0000000000000000 0x0000000000000000
grep: /sys/class/drm/card1/device/resource0: Permission denied
grep: /sys/class/drm/card1/device/resource0_wc: Permission denied
grep: /sys/class/drm/card1/device/resource2: Permission denied
grep: /sys/class/drm/card1/device/resource2_wc: Permission denied
grep: /sys/class/drm/card1/device/resource4: Permission denied
grep: /sys/class/drm/card1/device/resource5: Permission denied
/sys/class/drm/card1/device/revision:0xef
grep: /sys/class/drm/card1/device/rom: Permission denied
grep: /sys/class/drm/card1/device/subsystem: Is a directory
/sys/class/drm/card1/device/subsystem_device:0x341b
/sys/class/drm/card1/device/subsystem_vendor:0x1462
/sys/class/drm/card1/device/thermal_throttling_logging:0000:01:00.0: thermal throttling logging enabled, with interval 60 seconds
/sys/class/drm/card1/device/uevent:DRIVER=amdgpu
/sys/class/drm/card1/device/uevent:PCI_CLASS=30000
/sys/class/drm/card1/device/uevent:PCI_ID=1002:67DF
/sys/class/drm/card1/device/uevent:PCI_SUBSYS_ID=1462:341B
/sys/class/drm/card1/device/uevent:PCI_SLOT_NAME=0000:01:00.0
/sys/class/drm/card1/device/uevent:MODALIAS=pci:v00001002d000067DFsv00001462sd0000341Bbc03sc00i00
/sys/class/drm/card1/device/vbios_version:113V34122-F3
/sys/class/drm/card1/device/vendor:0x1002
```

hakandundar34coding commented 2 years ago

You shared a lot of useful information and feedback. You can send a pull request if you want to add your name to a contributors file.

There is no big difference between the usage information of the Intel HD 3000 and your GPU.

A lot of information fields will be empty for Intel GPUs. There is a tool (intel_gpu_top), but it requires root privileges and has to be installed separately, so it will not be used for getting information.

Support for AMD GPU load, frequency, etc. has been added. The source code is not uploaded yet. There will be improvements for temperature and power information.

It will not be used but are there any AMD GPU monitoring/information tools (like nvidia-smi) installed on your system by the driver? For example: rocm-smi.

Can you share the output of these commands? Some of them are not required right now, but I do not want to disturb you again.

grep . /sys/class/drm/card0/metrics/*
grep . /sys/class/drm/card1/device/fw_version/*
grep . /sys/class/drm/card1/device/hwmon/*

and the sub-folders for temperature and power. For example: `grep . /sys/class/drm/card1/device/hwmon/hwmon1/*`. There are temp... files and power... files for these sensors.

‐--------------

I know you shared the file list of the gpu.0 folder for the Nintendo Switch, but this time the file contents are needed:

grep . /sys/devices/gpu.0/*
grep . /sys/devices/gpu.0/devfreq/57000000.gpu/*
theofficialgman commented 2 years ago
grep . /sys/class/drm/card0/metrics/*
grep . /sys/class/drm/card1/device/fw_version/*
grep . /sys/class/drm/card1/device/hwmon/*

/sys/class/drm is an otherwise empty folder containing only one file, version

grep . /sys/devices/gpu.0/*
grep . /sys/devices/gpu.0/devfreq/57000000.gpu/*
grep . /sys/devices/gpu.0/*
/sys/devices/gpu.0/aelpg_enable:1
/sys/devices/gpu.0/aelpg_param:1000000 100 10000 2000 200
/sys/devices/gpu.0/allow_all:0
/sys/devices/gpu.0/blcg_enable:1
/sys/devices/gpu.0/comptag_mem_deduct:0
/sys/devices/gpu.0/counters:38523066 311638150
/sys/devices/gpu.0/counters_reset:38523066 311648246
/sys/devices/gpu.0/czf_bypass:0
grep: /sys/devices/gpu.0/devfreq: Is a directory
grep: /sys/devices/gpu.0/driver: Is a directory
/sys/devices/gpu.0/driver_override:(null)
/sys/devices/gpu.0/elcg_enable:1
/sys/devices/gpu.0/elpg_enable:1
/sys/devices/gpu.0/emc3d_ratio:750
/sys/devices/gpu.0/enable_3d_scaling:1
/sys/devices/gpu.0/fmax_at_vmin_safe:460800000
/sys/devices/gpu.0/force_idle:0
/sys/devices/gpu.0/freq_request:921600000
/sys/devices/gpu.0/gfxp_wfi_timeout_count:0
/sys/devices/gpu.0/gfxp_wfi_timeout_unit:sysclk
/sys/devices/gpu.0/gpu_powered_on:1
grep: /sys/devices/gpu.0/iommu_group: Is a directory
/sys/devices/gpu.0/is_railgated:no
/sys/devices/gpu.0/ldiv_slowdown_factor:0
/sys/devices/gpu.0/load:0
/sys/devices/gpu.0/max_timeslice_us:50000
/sys/devices/gpu.0/min_timeslice_us:1000
/sys/devices/gpu.0/modalias:of:NgpuT<NULL>Cnvidia,tegra210-gm20bCnvidia,gm20b
/sys/devices/gpu.0/mscg_enable:0
grep: /sys/devices/gpu.0/of_node: Is a directory
/sys/devices/gpu.0/pd_max_batches:0
grep: /sys/devices/gpu.0/power: Is a directory
/sys/devices/gpu.0/ptimer_ref_freq:31250000
/sys/devices/gpu.0/ptimer_scale_factor:1.644736
/sys/devices/gpu.0/ptimer_src_freq:19200000
/sys/devices/gpu.0/railgate_delay:500
/sys/devices/gpu.0/railgate_enable:1
/sys/devices/gpu.0/slcg_enable:1
grep: /sys/devices/gpu.0/subsystem: Is a directory
/sys/devices/gpu.0/tpc_fs_mask:0x3
/sys/devices/gpu.0/tpc_pg_mask:0
/sys/devices/gpu.0/uevent:DRIVER=gk20a
/sys/devices/gpu.0/uevent:OF_NAME=gpu
/sys/devices/gpu.0/uevent:OF_FULLNAME=/gpu
/sys/devices/gpu.0/uevent:OF_COMPATIBLE_0=nvidia,tegra210-gm20b
/sys/devices/gpu.0/uevent:OF_COMPATIBLE_1=nvidia,gm20b
/sys/devices/gpu.0/uevent:OF_COMPATIBLE_N=2
/sys/devices/gpu.0/uevent:MODALIAS=of:NgpuT<NULL>Cnvidia,tegra210-gm20bCnvidia,gm20b
/sys/devices/gpu.0/user:0
grep . /sys/devices/gpu.0/devfreq/57000000.gpu/*
/sys/devices/gpu.0/devfreq/57000000.gpu/available_frequencies:76800000 153600000 230400000 307200000 384000000 460800000 537600000 614400000 691200000 768000000 844800000 921600000
/sys/devices/gpu.0/devfreq/57000000.gpu/available_governors:wmark_active nvhost_podgov userspace simple_ondemand
/sys/devices/gpu.0/devfreq/57000000.gpu/cur_freq:76800000
grep: /sys/devices/gpu.0/devfreq/57000000.gpu/device: Is a directory
/sys/devices/gpu.0/devfreq/57000000.gpu/governor:nvhost_podgov
/sys/devices/gpu.0/devfreq/57000000.gpu/max_freq:768000000
/sys/devices/gpu.0/devfreq/57000000.gpu/min_freq:76800000
/sys/devices/gpu.0/devfreq/57000000.gpu/polling_interval:25
grep: /sys/devices/gpu.0/devfreq/57000000.gpu/power: Is a directory
grep: /sys/devices/gpu.0/devfreq/57000000.gpu/subsystem: Is a directory
/sys/devices/gpu.0/devfreq/57000000.gpu/target_freq:76800000
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:     From  :   To
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:           :  76800000 153600000 230400000 307200000 384000000 460800000 537600000 614400000 691200000 768000000 844800000 921600000   time(ms)
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:*  76800000:         0        38         0         0       101         0         0         0         0         0         0         0    323769
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:  153600000:        70         0        47         0         0         3         0         0         0         0         0         0    143883
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:  230400000:        50        28         0        23         0         0         1         0         0         0         0         0    167245
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:  307200000:        17         0        11         0         3         0         0         0         0         0         0         0     79597
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:  384000000:         1        54        44         5         0         3         0         0         0         0         0         0     14115
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:  460800000:         1         0         0         3         2         0         0         0         0         0         0         0      7672
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:  537600000:         0         0         0         0         1         0         0         0         0         0         0         0        58
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:  614400000:         0         0         0         0         0         0         0         0         0         0         0         0         0
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:  691200000:         0         0         0         0         0         0         0         0         0         0         0         0         0
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:  768000000:         0         0         0         0         0         0         0         0         0         0         0         0         0
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:  844800000:         0         0         0         0         0         0         0         0         0         0         0         0         0
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:  921600000:         0         0         0         0         0         0         0         0         0         0         0         0         0
/sys/devices/gpu.0/devfreq/57000000.gpu/trans_stat:Total transition : 506

and before you ask, the power folder doesn't contain any useful info. there is no way to get power draw for this gpu.

and yes, /sys/devices/gpu.0/load does scale from 0 to some value, I'm just not doing anything right now. I'm not sure what the upper value is, but it's higher than 256. it might be a fraction of the current emc3d_ratio

theofficialgman commented 2 years ago

actually I think load is from 0 to 1000, 0 being 0%, 1000 being 100% so it just gives a decimal place

hakandundar34coding commented 2 years ago

I have updated the code for ARM and Tegra GPUs. GPU load will not be shown. A value of 0 instead of 1-2% when idle is very interesting. It may need to be processed instead of being used directly. The command sudo ~/tegrastats can be used to check the value in the file, which can be watched with watch -n 1 cat /sys/devices/gpu.0/load

There are some comments about GPU information about Tegra devices.


The first group of commands (in drm folder) was for the AMD GPU. I remember the content of the drm folder for the ARM device.

I updated the comment and used dashed line to split it for AMD and Tegra GPUs.

I want some command outputs very frequently. You can prefer not writing if it is tiring/boring for you. :) The sensor information for AMD GPUs will be fixed later. A contributor may share this information.

theofficialgman commented 2 years ago

I have updated the code for ARM and Tegra GPUs. GPU load will not be shown. A value of 0 instead of 1-2% when idle is very interesting. It may need to be processed instead of being used directly. The command sudo ~/tegrastats can be used to check the value in the file, which can be watched with watch -n 1 cat /sys/devices/gpu.0/load

There are some comments about GPU information about Tegra devices.

I'm already well aware of tegrastats and have already cross-referenced the values in the tegrastats output against gpu.0/load: they are the same (apart from the 0 to 1000 scaling for load versus 0% to 100% for tegrastats). A GPU load of 0 is normal and expected because I was literally on a static image on the desktop. There is no load on the gpu in that scenario: its framebuffer is not changing, there are no animations, etc. For example, if tegrastats shows, let's say, 87% gpu load, the gpu.0/load value will be something like 872
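
To make the scaling concrete, here is a minimal sketch of the conversion. It assumes, as observed in this thread and not confirmed from driver documentation, that /sys/devices/gpu.0/load ranges from 0 to 1000. The function name is hypothetical and not taken from Gpu.py.

```python
def tegra_load_to_percent(raw):
    """Convert a gpu.0/load reading (assumed range 0..1000) to a percentage."""
    value = int(raw.strip())
    # Clamp in case the driver reports something outside the assumed range.
    value = max(0, min(value, 1000))
    return value / 10.0


def read_tegra_load(path="/sys/devices/gpu.0/load"):
    """Read the load file and return the load in percent."""
    with open(path) as reader:
        return tegra_load_to_percent(reader.read())
```

With the example above, a raw value of 872 comes out as 87.2, which matches the roughly 87% that tegrastats would show.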

You can prefer not writing if it is tiring/boring for you. :

nah don't worry, it's not tiring, I'm just very busy the next few weeks. I will get back to you later with the output of the drm folder for my amd gpu.

theofficialgman commented 2 years ago

for amd gpu

garrett@garrett-desktop:~$ grep . /sys/class/drm/card1/metrics/*
grep: /sys/class/drm/card1/metrics/*: No such file or directory
garrett@garrett-desktop:~$ grep . /sys/class/drm/card1/device/fw_version/*
/sys/class/drm/card1/device/fw_version/asd_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/ce_fw_version:0x0000008c
/sys/class/drm/card1/device/fw_version/dmcu_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/mc_fw_version:0x03b4dc40
/sys/class/drm/card1/device/fw_version/mec2_fw_version:0x000002da
/sys/class/drm/card1/device/fw_version/mec_fw_version:0x000002da
/sys/class/drm/card1/device/fw_version/me_fw_version:0x000000a7
/sys/class/drm/card1/device/fw_version/pfp_fw_version:0x000000fe
/sys/class/drm/card1/device/fw_version/rlc_fw_version:0x0000011e
/sys/class/drm/card1/device/fw_version/rlc_srlc_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/rlc_srlg_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/rlc_srls_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/sdma2_fw_version:0x0000003a
/sys/class/drm/card1/device/fw_version/sdma_fw_version:0x0000003a
/sys/class/drm/card1/device/fw_version/smc_fw_version:0x00171100
/sys/class/drm/card1/device/fw_version/sos_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/ta_ras_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/ta_xgmi_fw_version:0x00000000
/sys/class/drm/card1/device/fw_version/uvd_fw_version:0x01821000
/sys/class/drm/card1/device/fw_version/vce_fw_version:0x351a0300
/sys/class/drm/card1/device/fw_version/vcn_fw_version:0x00000000
garrett@garrett-desktop:~$ grep . /sys/class/drm/card1/device/hwmon/*
grep: /sys/class/drm/card1/device/hwmon/hwmon2: Is a directory
garrett@garrett-desktop:~$ grep . /sys/class/drm/card1/device/hwmon/hwmon2/*
grep: /sys/class/drm/card1/device/hwmon/hwmon2/device: Is a directory
/sys/class/drm/card1/device/hwmon/hwmon2/fan1_enable:0
/sys/class/drm/card1/device/hwmon/hwmon2/fan1_input:881
/sys/class/drm/card1/device/hwmon/hwmon2/fan1_max:4500
/sys/class/drm/card1/device/hwmon/hwmon2/fan1_min:0
/sys/class/drm/card1/device/hwmon/hwmon2/fan1_target:881
/sys/class/drm/card1/device/hwmon/hwmon2/freq1_input:1233000000
/sys/class/drm/card1/device/hwmon/hwmon2/freq1_label:sclk
/sys/class/drm/card1/device/hwmon/hwmon2/freq2_input:300000000
/sys/class/drm/card1/device/hwmon/hwmon2/freq2_label:mclk
/sys/class/drm/card1/device/hwmon/hwmon2/in0_input:1018
/sys/class/drm/card1/device/hwmon/hwmon2/in0_label:vddgfx
/sys/class/drm/card1/device/hwmon/hwmon2/name:amdgpu
grep: /sys/class/drm/card1/device/hwmon/hwmon2/power: Is a directory
/sys/class/drm/card1/device/hwmon/hwmon2/power1_average:22143000
/sys/class/drm/card1/device/hwmon/hwmon2/power1_cap:120000000
/sys/class/drm/card1/device/hwmon/hwmon2/power1_cap_default:120000000
/sys/class/drm/card1/device/hwmon/hwmon2/power1_cap_max:120000000
/sys/class/drm/card1/device/hwmon/hwmon2/power1_cap_min:0
/sys/class/drm/card1/device/hwmon/hwmon2/power1_label:slowPPT
/sys/class/drm/card1/device/hwmon/hwmon2/pwm1:0
/sys/class/drm/card1/device/hwmon/hwmon2/pwm1_enable:2
/sys/class/drm/card1/device/hwmon/hwmon2/pwm1_max:255
/sys/class/drm/card1/device/hwmon/hwmon2/pwm1_min:0
grep: /sys/class/drm/card1/device/hwmon/hwmon2/subsystem: Is a directory
/sys/class/drm/card1/device/hwmon/hwmon2/temp1_crit:94000
/sys/class/drm/card1/device/hwmon/hwmon2/temp1_crit_hyst:-273150
/sys/class/drm/card1/device/hwmon/hwmon2/temp1_input:54000
/sys/class/drm/card1/device/hwmon/hwmon2/temp1_label:edge
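
A note on units when reading these files: the hwmon sysfs interface reports temperatures in millidegrees Celsius, power in microwatts, voltages in millivolts and frequencies in Hz. A minimal sketch of the scaling follows. The helper name is hypothetical, and it expects numeric files only (not the *_label or name files).

```python
def scale_hwmon(file_name, raw):
    """Scale a raw hwmon reading to a display unit based on its file name."""
    value = int(raw.strip())
    if file_name.startswith("temp"):
        return value / 1000         # millidegrees Celsius -> degrees Celsius
    if file_name.startswith("power"):
        return value / 1_000_000    # microwatts -> watts
    if file_name.startswith("freq"):
        return value / 1_000_000    # Hz -> MHz
    if file_name.startswith("in"):
        return value / 1000         # millivolts -> volts
    return float(value)             # fan RPM, PWM duty, etc. are left unscaled
```

Applied to the output above, temp1_input 54000 is 54 degrees Celsius and power1_average 22143000 is about 22.1 W.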
theofficialgman commented 2 years ago

I will say, something seems to not be working correctly with gpu load in System Monitoring Center. I am loading the gpu, and it does not report much load and is prone to spikes.

compare that to radeontop, where the load is constant [image]

I looked at the code for radeontop (found on github) and it's not using anything I'm familiar with. it's definitely not using gpu_busy_percent to get the data, that is for certain

theofficialgman commented 2 years ago

if I do a watch -n 0.1 cat /sys/class/drm/card1/device/gpu_busy_percent I see that the percentage changes rapidly between 0 and 100 many times per second. so maybe a rolling average of a few data points is all that is needed to get a better estimate for load
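
A rolling average like that could be sketched as follows. This is only an illustration with a hypothetical class name, not code from the project; the window size of 10 is an arbitrary assumption.

```python
from collections import deque


class RollingAverage:
    """Keep the last `size` samples and report their mean."""

    def __init__(self, size=10):
        # A deque with maxlen drops the oldest sample automatically.
        self._samples = deque(maxlen=size)

    def add(self, value):
        """Add a sample and return the mean of the current window."""
        self._samples.append(value)
        return sum(self._samples) / len(self._samples)
```

Each value read from gpu_busy_percent would be passed to add(), and the returned mean would be plotted instead of the raw reading.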

theofficialgman commented 2 years ago

I also tried the 1.12 deb and I don't see any load or frequency output on the tegra. [Screenshot from 2022-04-22 13-42-12]

theofficialgman commented 2 years ago

ok I debugged the code and found an error in the vendor id code

If I add print(self.device_vendor_id), I see the output is Nvidia; it's not an id number.

this means the later if does not get entered:

        # If selected GPU vendor is NVIDIA and selected GPU is used on an ARM system.
        if self.device_vendor_id == "v000010DE" and gpu_device_path.startswith("/sys/devices/") == True:

my modalias file has nvidia written directly in it (of:NgpuT<NULL>Cnvidia,tegra210-gm20bCnvidia,gm20b), so the vendor id comes back as Nvidia. you forgot to check for this when using it
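
To illustrate the two modalias shapes seen in this thread, here is a rough sketch that handles both the PCI form (pci:v00001002d...) and the device-tree form (of:N...C<vendor>,<model>). The function names are hypothetical, and the parsing is an assumption based only on the two strings shown here, not on how Gpu.py actually derives the vendor.

```python
def vendor_from_modalias(modalias):
    """Return a vendor token from a modalias string, or "" if unrecognized."""
    modalias = modalias.strip()
    if modalias.startswith("pci:"):
        # PCI form: "pci:v00001002d000067DF..." -> "v00001002"
        return modalias[4:13]
    if modalias.startswith("of:") and "C" in modalias:
        # Device-tree form: the first compatible entry follows the first "C",
        # e.g. "Cnvidia,tegra210-gm20b" -> "nvidia"
        compatible = modalias.split("C", 1)[1]
        return compatible.split(",", 1)[0]
    return ""


def is_nvidia(modalias):
    """True for both the PCI vendor id and the device-tree vendor name."""
    return vendor_from_modalias(modalias).upper() in {"V000010DE", "NVIDIA"}
```

Checking both spellings in one place means the later if can stay a single condition.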

hakandundar34coding commented 2 years ago

It looks like the GPU load value (in the file in the card folder) is measured over a very short time for the AMD GPU. Averaging is required, as you also wrote.

theofficialgman commented 2 years ago
        # If selected GPU vendor is NVIDIA and selected GPU is used on an ARM system.
        if self.device_vendor_id in ["v000010DE", "Nvidia"] and gpu_device_path.startswith("/sys/devices/") == True:

this is enough to enter the if for both cases. however, the gpu_device_path does not end with a /, so there are a lot of errors

theofficialgman commented 2 years ago

anyway, just fix this as well to look like this:

        # Try to get GPU list from "/sys/devices/" folder which is used by some ARM systems with NVIDIA GPU.
        for file in os.listdir("/sys/devices/"):

            if file.split(".")[0] == "gpu":
                self.gpu_list.append(file)
                self.gpu_device_path_list.append("/sys/devices/" + file + "/")

and then all is good
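
An alternative sketch that avoids depending on a trailing slash at all is to build every path with os.path.join. The function names below are hypothetical and the base folder is a parameter only so the idea can be tried outside a Tegra system.

```python
import os


def find_gpu_dirs(base="/sys/devices"):
    """Return full paths of entries named like "gpu.0" under base."""
    try:
        entries = sorted(os.listdir(base))
    except OSError:
        return []
    return [os.path.join(base, name) for name in entries
            if name.split(".")[0] == "gpu"]


def gpu_file(device_path, file_name):
    """Join a device path and a file name, with or without a trailing slash."""
    return os.path.join(device_path, file_name)
```

With this, gpu_file("/sys/devices/gpu.0", "load") and gpu_file("/sys/devices/gpu.0/", "load") give the same path.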

theofficialgman commented 2 years ago

I'll just send the edited Gpu.py file so it's all clear. There are some extra print statements for logging, so you will see those if you diff with your branch. Gpu.txt

[Screenshot from 2022-04-22 14-12-34]

hakandundar34coding commented 2 years ago

Is average required for Tegra GPUs?

theofficialgman commented 2 years ago

Is average required for Tegra GPUs?

tegrastats uses the load number directly with no average. it won't hurt to have one, but that is up to you.

hakandundar34coding commented 2 years ago

Are there any gpu monitoring tools installed with GPU driver for AMD cards (on your system)?

theofficialgman commented 2 years ago

no gpu monitoring tools come with the AMD mesa gpu driver.

I installed radeontop separately (thats a project not affiliated with mesa). radeon profile is also a separate project which I installed additionally.

hakandundar34coding commented 2 years ago

I will update the code tomorrow. GPU load for AMD GPUs will be fixed. Also line numbers in the frequency information will be removed and GPU power information will be shown.
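
For the frequency lines, a possible way to drop the level numbers is to parse the pp_dpm_sclk content (lines such as "6: 1233Mhz *", where the asterisk marks the active level). This is only a sketch with hypothetical function names, assuming the line format shown in the grep output earlier in this thread.

```python
import re

# One level per line: "<index>: <value>Mhz", with "*" on the active level.
_LEVEL = re.compile(r"^\s*(\d+):\s*(\d+)\s*mhz\s*(\*?)\s*$", re.IGNORECASE)


def parse_dpm_levels(text):
    """Return a list of (frequency_mhz, is_active) tuples."""
    levels = []
    for line in text.splitlines():
        match = _LEVEL.match(line)
        if match:
            levels.append((int(match.group(2)), match.group(3) == "*"))
    return levels


def current_frequency_mhz(text):
    """Return the active frequency in MHz, or None if no level is marked."""
    for mhz, active in parse_dpm_levels(text):
        if active:
            return mhz
    return None
```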

hakandundar34coding commented 2 years ago

Can you run this Python file and compare the values with the values from radeontop? gpu_load_amd.txt

It reads the load file 50 times in a second and calculates the average of the values. You can increase the number of cycles (range(50)).

theofficialgman commented 2 years ago

its the same how you have written it...

in my testing, on my system, the 50 loop completes in 0.004618356 seconds, which is way too fast to get accurate readings.

i did some while true; do cat /sys/class/drm/card1/device/gpu_busy_percent; echo $EPOCHREALTIME; done and found that the values complete a cycle between 0 and 100 about once every 16 milliseconds.... I should have expected this. the gpu load is in step with the frequency which the game runs at (60hz, which is 16 milliseconds). so if you sample the gpu load as much as you can over the whole 16 milliseconds, you will get a very accurate reading.

"a good enough solution" for this use case is to simply sleep in between each reading and sample across the whole cycle period. gpu_load_amd.txt

if you want to be compatible with higher refresh rate games/higher frequency loading. you will need to lower the sleep time and increase the number of samples.
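
The idea can be sketched like this. It is not the attached gpu_load_amd.txt; the function names, the 16 samples and the 16 ms window are assumptions chosen to match the 60 Hz case described above.

```python
import time


def average_load(read_sample, samples=16, window_s=0.016):
    """Average `samples` readings spread evenly across `window_s` seconds."""
    interval = window_s / samples
    total = 0.0
    for _ in range(samples):
        total += read_sample()
        # Sleeping between reads spreads the samples over the whole cycle
        # instead of taking them all in the first few milliseconds.
        time.sleep(interval)
    return total / samples


def read_gpu_busy_percent(path="/sys/class/drm/card1/device/gpu_busy_percent"):
    """Read the instantaneous AMD GPU load in percent."""
    with open(path) as reader:
        return float(reader.read().strip())
```

A call such as average_load(read_gpu_busy_percent) would then cover roughly one 60 Hz frame. Note that time.sleep is not precise at sub-millisecond intervals, so the real window will usually be somewhat longer than requested.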

hakandundar34coding commented 2 years ago

Several bugs have been fixed.

Can you share a screenshot of the GPU tab for the AMD GPU?

A new package (v1.12.1) will be installable today if there is no problem.

About GPU load: The number of cycles is set to 370 because some monitors have a 360 Hz refresh rate. This is a quick fix. There may be a more detailed solution.

I do not know if the 60 Hz value which you wrote is the screen refresh rate or the refresh rate of the content (game, etc.). There may be a 165 Hz monitor, but a game may have a 60 FPS limit. The game may be run fullscreen or not. I do not know if this affects how often the load value (gpu_busy_percent) is updated.

theofficialgman commented 2 years ago

@hakandundar34coding actually gpu load, temperature, and refresh rate have disappeared on my amd card now

[image]

theofficialgman commented 2 years ago

and yes, I have checked, the only difference is the sensor moved to hwmon4 from the hwmon2 last time

grep . /sys/class/drm/card1/device/hwmon/hwmon4/*
grep: /sys/class/drm/card1/device/hwmon/hwmon4/device: Is a directory
/sys/class/drm/card1/device/hwmon/hwmon4/fan1_enable:1
/sys/class/drm/card1/device/hwmon/hwmon4/fan1_input:748
/sys/class/drm/card1/device/hwmon/hwmon4/fan1_max:4500
/sys/class/drm/card1/device/hwmon/hwmon4/fan1_min:0
/sys/class/drm/card1/device/hwmon/hwmon4/fan1_target:748
/sys/class/drm/card1/device/hwmon/hwmon4/freq1_input:1233000000
/sys/class/drm/card1/device/hwmon/hwmon4/freq1_label:sclk
/sys/class/drm/card1/device/hwmon/hwmon4/freq2_input:300000000
/sys/class/drm/card1/device/hwmon/hwmon4/freq2_label:mclk
/sys/class/drm/card1/device/hwmon/hwmon4/in0_input:1018
/sys/class/drm/card1/device/hwmon/hwmon4/in0_label:vddgfx
/sys/class/drm/card1/device/hwmon/hwmon4/name:amdgpu
grep: /sys/class/drm/card1/device/hwmon/hwmon4/power: Is a directory
/sys/class/drm/card1/device/hwmon/hwmon4/power1_average:27068000
/sys/class/drm/card1/device/hwmon/hwmon4/power1_cap:120000000
/sys/class/drm/card1/device/hwmon/hwmon4/power1_cap_default:120000000
/sys/class/drm/card1/device/hwmon/hwmon4/power1_cap_max:120000000
/sys/class/drm/card1/device/hwmon/hwmon4/power1_cap_min:0
/sys/class/drm/card1/device/hwmon/hwmon4/power1_label:slowPPT
/sys/class/drm/card1/device/hwmon/hwmon4/pwm1:51
/sys/class/drm/card1/device/hwmon/hwmon4/pwm1_enable:1
/sys/class/drm/card1/device/hwmon/hwmon4/pwm1_max:255
/sys/class/drm/card1/device/hwmon/hwmon4/pwm1_min:0
grep: /sys/class/drm/card1/device/hwmon/hwmon4/subsystem: Is a directory
/sys/class/drm/card1/device/hwmon/hwmon4/temp1_crit:94000
/sys/class/drm/card1/device/hwmon/hwmon4/temp1_crit_hyst:-273150
/sys/class/drm/card1/device/hwmon/hwmon4/temp1_input:50000
/sys/class/drm/card1/device/hwmon/hwmon4/temp1_label:edge

also /sys/class/drm/card1/device/gpu_busy_percent is still there... so I have no clue why your gui is not working anymore

I double checked, previous version 1.12.0 has working temperature, refresh rate, and gpu usage (except that it is not averaged)
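
Since the hwmon index moved from hwmon2 to hwmon4 between boots, one way to stay independent of the number is to look the folder up by the content of its name file. A minimal sketch, with a hypothetical function name; the device path is passed in so nothing is hard-coded.

```python
import glob
import os


def find_hwmon_dir(device_path, wanted_name="amdgpu"):
    """Return the hwmon sub-folder whose "name" file matches, or None."""
    pattern = os.path.join(device_path, "hwmon", "hwmon*")
    for candidate in sorted(glob.glob(pattern)):
        try:
            with open(os.path.join(candidate, "name")) as reader:
                if reader.read().strip() == wanted_name:
                    return candidate
        except OSError:
            # Unreadable or missing name file: try the next folder.
            continue
    return None
```

For this system, find_hwmon_dir("/sys/class/drm/card1/device") would be expected to return the hwmon4 folder now and the hwmon2 folder in the earlier output.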

theofficialgman commented 2 years ago

actually refresh rate still works, it just doesn't work when using dual monitors in wayland

theofficialgman commented 2 years ago

found the temperature bug: you recently added this line, and the temperature does not show when it is present

https://github.com/hakandundar34coding/system-monitoring-center/blob/4c6eb8dd3e65c2f70eca48bb12c5387879ad49f1/src/Gpu.py#L399-L400

that else happens after a for loop... pretty sure that is not valid python syntax
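
For reference, an else clause on a for loop is valid Python: it runs only when the loop finishes without hitting break. So if those lines misbehave, the cause would be the logic (for example a missing break) rather than the syntax. A small sketch with hypothetical names, not taken from Gpu.py:

```python
def find_sensor_value(sensors, wanted_label="edge"):
    """Return the value for wanted_label, or None if no sensor matches."""
    for label, value in sensors:
        if label == wanted_label:
            result = value
            break
    else:
        # Reached only when the loop ended without break, i.e. no match.
        result = None
    return result
```

Without the break, the else branch would always run and overwrite the value that was found.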

theofficialgman commented 2 years ago

I added some prints, and its not starting the amd_gpu_load_func

            try:
                self.amd_gpu_load_func()
                gpu_load = f'{(sum(self.amd_gpu_load_list) / len(self.amd_gpu_load_list)):.0f} %'
                print(gpu_load)
            except Exception:
                gpu_load = "-"
                print(gpu_load)

"-" gets printed to the terminal each second... so the exception is being raised
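
A bare except Exception that only sets a fallback hides the actual error. One possible pattern is to keep the fallback for the GUI but still print the traceback, so bugs like this show up in the terminal. The helper name is hypothetical:

```python
import traceback


def call_with_fallback(func, fallback="-"):
    """Call func(); on any exception, log the traceback and return fallback."""
    try:
        return func()
    except Exception:
        # Keep the GUI alive, but do not swallow the reason for the failure.
        traceback.print_exc()
        return fallback
```

Wrapping the load calculation this way would still show "-" in the GUI while the terminal reports what went wrong.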

theofficialgman commented 2 years ago

alright, commented out the try, here is the bug

  File "/usr/share/system-monitoring-center/src/Gpu.py", line 366, in gpu_load_memory_frequency_power_func
    self.amd_gpu_load_func()
  File "/usr/share/system-monitoring-center/src/Gpu.py", line 548, in amd_gpu_load_func
    with open(gpu_device_path + "device/gpu_busy_percent") as reader:
NameError: name 'gpu_device_path' is not defined

you forgot to add

        selected_gpu_number = self.selected_gpu_number
        selected_gpu = self.gpu_list[selected_gpu_number]
        gpu_device_path = self.gpu_device_path_list[selected_gpu_number]
        gpu_device_sub_path = self.gpu_device_sub_path_list[selected_gpu_number]

to the amd_gpu_load_func

even with these changes... I now only get 0% load

if I add a print into that function, it does return non-zero gpu load values, so now the averaging part must be wrong

        # Read file to get GPU load information. This information is calculated for a very small time (screen refresh rate or content (game, etc.) refresh rate?) and directly plotting this data gives spikes.
        with open(gpu_device_path + "device/gpu_busy_percent") as reader:
            gpu_load = reader.read().strip()
            print(gpu_load)
0
56
8
0
0
0
0
18
89
theofficialgman commented 2 years ago

oh, lol easy fix, why are you dividing by 1000?

        self.amd_gpu_load_list.append(float(gpu_load)/1000)

should just be

        self.amd_gpu_load_list.append(float(gpu_load))

with all my above fixes implemented, now we have a working GUI [image]