Open bbklopfer opened 9 months ago
Have similar experience with v1.29 and current v1.33 Desktop on Opi5-Plus. The load avg is >1 even at idle for a significant time (>20 min); not sure whether it would go down to <1 if the system were left idle longer.
Installed joshua's kernel 5.10.160-28 or 5.10.160-30 on Archlinux: the same high load avg >1 exists at idle. With the vendor's kernel 5.10.110-2 or 5.10.160, the load avg is much lower at idle.
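To compare kernels on equal terms, it can help to log the idle load average at fixed intervals rather than eyeballing it. A minimal sketch, assuming a Linux system where /proc/loadavg is readable; the function names, sample count and interval are hypothetical choices, not something taken from this thread:

```python
import time
from pathlib import Path


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into (1m, 5m, 15m) floats."""
    fields = text.split()
    return float(fields[0]), float(fields[1]), float(fields[2])


def sample_loadavg(samples=5, interval=60.0, path="/proc/loadavg"):
    """Collect `samples` readings, `interval` seconds apart.

    The defaults (5 readings, one minute apart) are arbitrary; pick
    whatever covers the idle period you want to compare.
    """
    readings = []
    for i in range(samples):
        readings.append(parse_loadavg(Path(path).read_text()))
        if i < samples - 1:
            time.sleep(interval)
    return readings
```

Calling `sample_loadavg(samples=30, interval=60)` on each kernel after boot would give a 30-minute idle trace that can be compared side by side.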
Kind of funny to see the very same bug over the years across various SoCs (not only Rockchip). And yes, I experience this as well. However, I did not test against vendor images but against mainline 6.8-rc, which seems fine.
Anyway, just a guess. Does the load disappear when the NVMe is removed and booted from SD or eMMC?
Booting from NVMe and from SD card gives the same high load avg with Joshua's kernel. In both cases the NVMe and eMMC (256GB but empty) are already installed and were not removed when booting from the SD card.
Joshua image is stable and so far very good experience on Opi5-Plus.
Did some tests myself. It doesn't seem to be related to the type of storage used. Tried NVMe, eMMC and SD card in all combinations, either as rootfs or just plainly installed. Load always rises to >=1. Bummer...
@EvilOlaf how was your experience with the 6.8-rc kernels, and where did you grab them from (looks like Armbian offers 6.8-rc1, or did you just roll your own)?
My whole reason for wanting to try a new kernel was due to this issue I was experiencing.
Was from Armbian.
Mainline Linux 6.8 just got HDMI introduced and not all of the hardware is working, specifically the GPU and VPU. You will not be able to run Jellyfin with hardware acceleration if you plan on going down this road. Even when GPU or VPU support comes into mainline Linux, I would expect there to be many issues, as this is bleeding-edge software.
This forces most users to use the crappy 5.10 Android kernel. I likely will not look into the load average issue as the kernel is a mess and it's way too much work on a kernel that will likely be dead in a year from now.
Got it --- thanks for chiming in!
I will keep this open but add a wont fix tag, as it's a valid issue.
@Joshua-Riek just curious. What's your opinion of rkr7.1 (5.10.198 I think?) or the 6.1 bsp? I noticed you played with the former just a bit and abandoned it.
I think rkr 7.1 is fine and see no breaking changes, I may bump to this kernel in the future for legacy reasons. As for 6.1, I still do not have the release tag for it. I've started to do some work on the 6.1 kernel from an old snapshot I got back in late October, but I really want a release tag before spending a lot of time on it.
Gotcha.
As far as I know, JeffyCN's kernel-6.1-2024_01_02 tag is the first release of 6.1 bsp. OrangePi also updated their kernel tree not long ago, which also confirmed this.
https://github.com/orangepi-xunlong/linux-orangepi/tree/orange-pi-6.1-rk35xx
I would still like to see a release tag, but this looks good. I will likely create a fork from this point and start to rebase stuff.
I dropped WiFi patches, LCD panel patches, and some changes for the Khadas Edge. Because I went through about 200 patches with a ton of merge conflicts, I could have made a few mistakes. But here is the current progress, should be an OK starting point.
https://github.com/Joshua-Riek/linux-rockchip/commits/rockchip-6.1/
Some non-essential peripherals should have lower priority if they cannot be easily ported to 6.1.
Btw I dropped the r8125 out-of-tree driver. The original one is a bit outdated.
Hey @nyanmisaka, do you have gnome wayland working with the 6.1 kernel? I just finished some testing and only X11 would start :thinking:
I haven't tried panfork on the 6.1 kernel. But I know that libmali can provide Wayland support for Gnome on Ubuntu 23.10 mantic.
So might be worth going the noble route directly?
The problem may be whether panfork itself is compatible with the updated panfrost kernel mode driver in 6.1 and the new mali csf firmware, rather than the distro version.
I just tested Noble and it seems to use llvmpipe sadly, I'll need to try with your 6.1 fork directly with Armbian mantic. Does glmark2 use hw accel in your OS?
glmark2-wayland requires full OpenGL but libmali only provides GLES. glmark2-es2-wayland works. And the desktop is still accelerated by kworker/u17:1-mali_kbase_csf_sync_upd
Applications requiring full OpenGL will not be accelerated.
https://github.com/tsukumijima/libmali-rockchip/releases/tag/v1.9-1-b5d7972
I did test panfork and wayland did not work as mentioned before, then crashed a bit later with the below logs, I've not done much debugging yet:
Feb 7 21:27:53 ubuntu-desktop kernel: [ 24.302826] mali fb000000.gpu: Loading Mali firmware 0x1010000
Feb 7 21:27:53 ubuntu-desktop kernel: [ 24.305300] mali fb000000.gpu: Mali firmware git_sha: ee476db42870778306fa8d559a605a73f13e455c
Feb 7 21:27:53 ubuntu-desktop kernel: [ 24.737056] mali fb000000.gpu: Invalid CPU access to UMM memory for ctx 1227_0
Feb 7 21:31:20 ubuntu-desktop kernel: [ 232.566709] mali fb000000.gpu: Invalid CPU access to UMM memory for ctx 1272_1
Feb 7 21:31:21 ubuntu-desktop kernel: [ 234.137685] mali fb000000.gpu: Invalid CPU access to UMM memory for ctx 3244_19
Apparently this is Mali bifrost in the kernel complaining, and panfork doesn't work well with it. You can try downgrading it from g21p0 to g18p0.
https://github.com/JeffyCN/mirrors/commits/kernel-6.1-2024_01_02/drivers/gpu/arm/bifrost
Yeah, the 6.1 kernel does not like panfork very much. I think it may be better to work on backporting panthor.
We tried it last year on 6.1.25 (snapshot from October) but panthor was still unstable at that time. I think there is no need to waste any more time until panthor and Mesa PR are merged into the mainline.
Just some feedback. The kernel https://repo.bredos.org/rkr6/linux-rockchip-rkr6-5.10.160-6-aarch64.pkg.tar.zst running in a Plasma Wayland or Plasma X11 session also has a similarly high load average, greater than 1 even when idling. So it is not unique to Joshua's kernel-5.10.160.
I'm not sure, maybe we can upgrade the mali driver firmware mali_csffw.bin instead of downgrading the kernel driver from g21p0 to g18p0.
g21p0 breaks wayland sadly, I did try to revert the commit and use g18p0 on 6.1.
Oh, because I use KDE, which sets X11 as the default, I didn't notice the poor Wayland performance.
Operating System: Arch Linux ARM
KDE Plasma Version: 5.27.10
KDE Frameworks Version: 5.115.0
Qt Version: 5.15.12
Kernel Version: 6.1.43-2-rkbsp (64-bit)
Graphics Platform: X11
Processors: 4 × ARM Cortex-A55, 4 × ARM Cortex-A76
Memory: 15.6 GiB of RAM
Graphics Processor: Mali-G610
Product Name: Orange Pi 5 Plus
I tried to use g21p0 with kernel 5.10, but they do not seem to be compatible with each other. So sorry, it didn't help.
Hi @wyf9661, Joshua has already got Panfork to work in a Wayland session. Hope you can update your linux-rkbsp-6.1.43 for Arch Linux too.
I will be using the below branch for my 6.1 progress. I think there may be a mpp issue as chrome is not using the GPU properly.
https://github.com/Joshua-Riek/linux-rockchip/tree/rk-6.1-rkr1
linux-rkbsp-6.1.43 follows the rk kernel upstream with only the necessary patches. I don't think Wayland will work well on KDE until KDE 6 comes out. If needed, you can build it yourself using the git version that follows Joshua's commits.
Hi @wyf9661, Just checking whether you will restart building/release "linux-rkbsp-joshua-git-6.1" as Joshua is now actively developing bsp-kernel-6.1.
I maintain this pkgbuild and update it, but do not build and release it.
This is getting pretty off-topic. I suggest moving the conversation about 6.1 into a separate issue or discussion.
Is the high power consumption due to the kernel's Wi-Fi driver?
I don't know. How do you recommend investigating deeper? top certainly doesn't cut it.
Unfortunately the loadavg issue is still present in 6.1.y kernel.
Compared with 5.10 bsp, this time the 6.1 bsp is already a relatively clean Android kernel. I suspect a specific device driver or hack is causing this problem. It may require some tracing.
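One cheap check before reaching for tracing: on Linux the load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, so a steady load of about 1 with an idle CPU often points to a kernel thread parked in D state. A minimal sketch that lists such threads by reading /proc; the function names are hypothetical, and this only shows which tasks are counted, not why a driver leaves them there:

```python
from pathlib import Path


def parse_stat(stat_line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line.

    comm may itself contain spaces or parentheses, so split on the
    first '(' and the last ')'.
    """
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx])
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc="/proc"):
    """List (tid, comm) for every thread currently in state D."""
    found = []
    for stat_path in Path(proc).glob("[0-9]*/task/[0-9]*/stat"):
        try:
            tid, comm, state = parse_stat(stat_path.read_text())
        except (OSError, ValueError, IndexError):
            continue  # thread exited between listing and reading
        if state == "D":
            found.append((tid, comm))
    return found
```

Running `uninterruptible_tasks()` a few times on an idle board and seeing the same thread name each time would give a concrete driver to look at.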
Doesn't seem to be wifi related. I built and installed a custom kernel with all wifi and bt stuff disabled. Also disabled bt and wifi in the device tree.
I suspect this may be related to the desktop environment. I've been using the rk3588 as a headless server (no monitor attached) and the load average is close to 0 when idle.
nyanmisaka@nanopct6:~$ uname -a
Linux nanopct6 6.1.43-legacy-rk35xx #45 SMP Sun Jan 7 06:29:26 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
nyanmisaka@nanopct6:~$ neofetch
nyanmisaka@nanopct6
-------------------
█ █ █ █ █ █ █ █ █ █ █ OS: Armbian (23.11.0-trunk) aarch64
███████████████████████ Host: FriendlyElec NanoPC-T6
▄▄██ ██▄▄ Kernel: 6.1.43-legacy-rk35xx
▄▄██ ███████████ ██▄▄ Uptime: 7 hours, 9 mins
▄▄██ ██ ██ ██▄▄ Packages: 1615 (dpkg), 9 (snap)
▄▄██ ██ ██ ██▄▄ Shell: bash 5.2.15
▄▄██ ██ ██ ██▄▄ Terminal: /dev/pts/0
▄▄██ █████████████ ██▄▄ CPU: (8) @ 1.800GHz
▄▄██ ██ ██ ██▄▄ Memory: 906MiB / 15716MiB
▄▄██ ██ ██ ██▄▄
▄▄██ ██ ██ ██▄▄
▄▄██ ██▄▄
███████████████████████
█ █ █ █ █ █ █ █ █ █ █
Negative. I tried with preinstalled-server image and the issue persists.
Seems like so far only the OPi5+ is affected. The 5 is not, and your nanopc-t6 isn't either.
Weird. I can't see correlations between the circuit board design and this issue.
Just an observation: Mainline does not have this issue. 6.7.y and 6.8-rc are fine.
Can this also be reproduced in the OEM image provided by orangepi? It's best to report it directly to them.
I tested Orangepi's latest OPIOS-Arch (2024.01) on Opi5-Plus with kernels-linux-rk35xx-legacy-5.10.160-1 (https://mirror.orangepi.dev/archlinux/stable/aarch64/opios-core/linux-rk35xx-legacy-5.10.160-1-aarch64.pkg.tar.zst).
It does not seem to have the high loadavg issue.
I just did the same with my fresh Opi5+, just running the preinstalled server image from the SD card: the CPU and the NVMe SSD are hot and the load is high.
I fail to understand the current status of this. Is it a wont-fix, but still open? What blocks us from trying to solve the issue?
Because this is not causing a major issue such as a system crash, I really do not want to spend a few days digging into the kernel to fix it. I have many other ubuntu related tasks and improvements that are being worked on.
Hi,
Booted ubuntu-22.04.3-preinstalled-server-arm64-orangepi-5-plus on my 5 Plus. Noticed a few things (in comparison to the Orange Pi issued Debian image):
- Temperatures (per sensors) are about ~8C higher, which qualitatively makes sense given the power consumption
- load average: 1.00, 1.00, 0.80 (with the 15m average slowly creeping up)
I am booting from the SD card; the eMMC is connected to the system and has the Orange Pi Debian image. I tried running the ubuntu-rockchip kernel with the Orange Pi Debian userspace, and I get the same power/thermal/load average results, so it seems (?) like it's a kernel issue.
System has an NVME SSD installed, no wifi card.
Happy to provide any additional info. Thanks!
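The reported 1.00, 1.00, 0.80 with a slowly rising 15m value is the pattern one would expect if exactly one task is being counted continuously since boot. Linux load averages are exponentially damped averages over 1, 5 and 15 minute windows, updated roughly every 5 seconds. A minimal sketch of that arithmetic, where simulate_load and its parameters are hypothetical names and the kernel's fixed-point math is approximated with floats:

```python
import math

SAMPLE_SECONDS = 5  # approximate interval between kernel load updates


def simulate_load(active_tasks, seconds, window_seconds, start=0.0):
    """Approximate the load average after `seconds` of constant activity.

    active_tasks: number of tasks counted at every update (runnable
    or in uninterruptible sleep), assumed constant here.
    window_seconds: 60, 300 or 900 for the 1m, 5m and 15m averages.
    """
    load = start
    decay = math.exp(-SAMPLE_SECONDS / window_seconds)
    for _ in range(int(seconds // SAMPLE_SECONDS)):
        load = load * decay + active_tasks * (1.0 - decay)
    return load
```

Under these assumptions the value after t seconds is 1 - exp(-t / window), so with one constantly counted task the 15m figure reaches roughly 0.8 after about 24 minutes while the 1m and 5m figures already sit near 1.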