raspberrypi / linux

Kernel source tree for Raspberry Pi-provided kernel builds. Issues unrelated to the linux kernel should be posted on the community forum at https://forums.raspberrypi.com/

Kernel 5.0.0-rc8: high cpu load #2881

Closed mkreisl closed 4 years ago

mkreisl commented 5 years ago

Describe the bug: The load average is always higher than 6 at idle. With kernel 4.20 and below, the load average is less than 0.2 (CPU is > 98% idle).

To reproduce: Boot kernel 5.0.0-rc8.

I know kernel 5.0 is still an RC, but it cannot be long before the final version is released.

lategoodbye commented 5 years ago

Do you see anything suspicious? Does this load depend on the Ethernet link?

mkreisl commented 5 years ago

> Do you see anything suspicious? Does this load depend on the Ethernet link?

No, everything else seems to be normal. I have not tested without Ethernet, because it is needed: the root fs is on an iSCSI target.

Will test it on a Pi3B today

mkreisl commented 5 years ago

Another test, Pi3B, no network, no sd-card (dtoverlay=sdtweak,poll_once in /boot/config.txt), root fs on usb disk:

Average load is exactly 4.00. Hmmm, 4 cores == 4.00 ???

pelwell commented 5 years ago

FYI I'm seeing the same effect, also with a load of 4 - no theory yet as to what is going on.

lategoodbye commented 5 years ago

Just to clarify: do you take the load average as the first number (left to right) from top? How high is the CPU usage?

mkreisl commented 5 years ago

> Just to clarify: do you take the load average as the first number (left to right) from top? How high is the CPU usage?

As already written in my first post: CPU is > 98% idle
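The two numbers being compared here come from different kernel counters. As a minimal sketch (not part of the original report), the following parses the text format of `/proc/loadavg` and of the first `cpu` line of `/proc/stat`, so that load average and idle percentage can be checked side by side. The function names and the sample strings are assumptions made for illustration.

```python
def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def idle_percent(cpu_line_before, cpu_line_after):
    """Idle share (in percent) between two samples of the 'cpu' line of /proc/stat.

    The first eight counters are user, nice, system, idle, iowait, irq,
    softirq and steal; only the 'idle' counter is treated as idle here.
    """
    before = [int(x) for x in cpu_line_before.split()[1:9]]
    after = [int(x) for x in cpu_line_after.split()[1:9]]
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total == 0:
        raise ValueError("the two samples are identical")
    return 100.0 * deltas[3] / total


# Made-up sample values resembling the situation described in this issue:
# a load average of about 4 while the CPU is almost completely idle.
sample_loadavg = "4.00 4.01 3.98 1/123 4567\n"
sample_before = "cpu  100 0 100 9800 0 0 0 0 0 0"
sample_after = "cpu  110 0 110 19780 0 0 0 0 0 0"

print(parse_loadavg(sample_loadavg)[0], idle_percent(sample_before, sample_after))
```

On a real system the inputs would be read from `/proc/loadavg` and from two reads of `/proc/stat` taken a short interval apart.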

pelwell commented 5 years ago

The idle load average drops to 3 if three VCHIQ commits are reverted, but sadly this isn't the magic bullet I thought it might be.

pelwell commented 5 years ago

The idea was correct, it just didn't go far enough. The problem is caused by a switch from _interruptible to _killable waits, which doesn't increase the "busy"ness of the processor but does affect the load average accounting. ps aux will show 4 vchiq threads in a 'D' state - they are the cause of your 4.0.
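To check for the 'D' state tasks mentioned above, a minimal sketch could filter the output of `ps -eo state,comm`. The helper name and the sample listing (including the thread names in it) are illustrative assumptions, not output captured from an affected system.

```python
def uninterruptible_tasks(ps_output):
    """Return the command names of tasks whose state starts with 'D'.

    Expects the output of `ps -eo state,comm`: one header row, then one
    row per task with the state letter followed by the command name.
    """
    names = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].startswith("D"):
            names.append(parts[1].strip())
    return names


# Illustrative sample only; real thread names may differ.
sample = """S COMMAND
S systemd
D vchiq-slot/0
D vchiq-recy/0
D vchiq-sync/0
D vchiq-keep/0
R ps
"""

blocked = uninterruptible_tasks(sample)
print(len(blocked), blocked)
```

Each task listed this way adds 1 to the load average even though it uses no CPU time, which matches the idle load of 4.0 reported earlier.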

I'm in the middle of a messy revert, but I should have something to test later today or tomorrow.

lategoodbye commented 5 years ago

Here are my current test results for the following scenario: boot a Raspberry Pi 3B with Ethernet connected and Wifi enabled but not connected, then wait until 2 min of uptime:

linux-5.0.0, multi_v7_defconfig: load 3.04
linux-5.0.0, bcm2835_defconfig: load 3.80
linux-4.19.23, multi_v7_defconfig: load 0.12
linux-4.19.23, bcm2835_defconfig: load 0.25

This would exclude a configuration change in 5.0.0 (no relevant changes in bcm2835_defconfig to 5.0.0 AFAIK). It looks like this regression was introduced in mainline. Maybe I'm in the mood for bisecting ...

pelwell commented 5 years ago

It's the three patches from Nicolas and the one from Arnd (the messy reversion, due to the removal of the typedefs).

lategoodbye commented 5 years ago

@vianpl Could you please take a look at this?

pelwell commented 5 years ago

I've pushed the four reversions to rpi-5.0.y, and it has had the desired effect:

pi@raspberrypi:~ $ uname -a
Linux raspberrypi 5.0.0-rc8-v7+ #1073 SMP Wed Mar 6 21:22:25 GMT 2019 armv7l GNU/Linux
pi@raspberrypi:~ $ uptime
 21:30:49 up 6 min,  3 users,  load average: 0.00, 0.04, 0.02

vianpl commented 5 years ago

@lategoodbye Thanks for pointing it out, I'll look into it.

jcberthon commented 5 years ago

Thank you very much @pelwell

It seems to be resolved (I also had the problem with rc8):

$ uptime
 22:21:29 up 2 min,  1 user,  load average: 1.20, 0.74, 0.29

vianpl commented 5 years ago

> The idea was correct, it just didn't go far enough. The problem is caused by a switch from _interruptible to _killable waits, which doesn't increase the "busy"ness of the processor but does affect the load average accounting. ps aux will show 4 vchiq threads in a 'D' state - they are the cause of your 4.0.

I agree, the issue is not so much a performance regression as how Linux accounts its usage statistics. One could argue it's a cosmetic thing.

We actually went for that family of wait functions as we were trying to mimic the original down_interruptible implementation in vchiq, which was implementing a custom _killable (see vchiq_killable.h):

#define SHUTDOWN_SIGS   (sigmask(SIGKILL) | sigmask(SIGINT) | sigmask(SIGQUIT) | sigmask(SIGTRAP) | sigmask(SIGSTOP) | sigmask(SIGCONT))

static inline int down_interruptible(struct semaphore *sem)
{
[...]
        siginitsetinv(&blocked, SHUTDOWN_SIGS);
        sigprocmask(SIG_SETMASK, &blocked, &oldset);
        down_interruptible(sem);
[...]
}

In the case of vchiq's implementation the issue didn't exist, as the task was marked as TASK_INTERRUPTIBLE even though it more or less behaved like a killable one (usually marked as TASK_UNINTERRUPTIBLE, which triggers the D state).

Also, I had a look at what really wakes up a Linux killable task and it's just SIGKILL (see __fatal_signal_pending() in signal.h). So we were not mimicking the exact same behavior.
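The difference between the three wait styles can be summarised in a toy model. This is a sketch in Python, not kernel code: the style names and both functions are hypothetical, the signal set is copied from SHUTDOWN_SIGS in the snippet above, and the "interruptible" case is simplified to "any signal" (in reality, any signal that is neither blocked nor ignored).

```python
import signal

# Signals the old vchiq wrapper left unblocked (SHUTDOWN_SIGS above).
SHUTDOWN_SIGS = {
    signal.SIGKILL, signal.SIGINT, signal.SIGQUIT,
    signal.SIGTRAP, signal.SIGSTOP, signal.SIGCONT,
}


def wakes(wait_style, sig):
    """Whether a pending signal ends the wait in this simplified model."""
    if wait_style == "interruptible":
        return True                    # simplified: any deliverable signal
    if wait_style == "vchiq_custom":
        return sig in SHUTDOWN_SIGS    # interruptible wait, other signals blocked
    if wait_style == "killable":
        return sig == signal.SIGKILL   # only a fatal signal
    raise ValueError(f"unknown wait style: {wait_style}")


def counts_toward_load(wait_style):
    """Only the killable wait sleeps in an uninterruptible ('D') state."""
    return wait_style == "killable"


for style in ("interruptible", "vchiq_custom", "killable"):
    print(style, wakes(style, signal.SIGINT), counts_toward_load(style))
```

The model shows both points made above: the killable wait reacts to fewer signals than the old custom wrapper did, and it is the only one of the three that is counted in the load average.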

I think the next step would be to revamp these patches using the interruptible family of functions. I already tested it in the past and there were no obvious issues. If there were any in the long run, we could go back to killable, or something in the middle, but with a proper justification for it. That would be nice to have if the driver is ever to come out of staging with strange concurrency primitives.

That said, I'd like to hear your opinion first :wink: .

pelwell commented 5 years ago

I'm fairly pragmatic - I'd be happy with anything that satisfied the kernel maintainers' quest for cleaner code without compromising user experience.

lategoodbye commented 5 years ago

@vianpl Any progress on this?

vianpl commented 5 years ago

@lategoodbye I'll try to move things forward tomorrow.

Xaositek commented 5 years ago

Currently running 5.0.0-1006-raspi2 on my Ubuntu 19.04 Beta system and seeing this - I apologize for the newbie question but when is this fix expected to be available? Or if available how can I get it?

popcornmix commented 5 years ago

It is just a cosmetic issue. The high load reported doesn't actually consume any CPU.

Xaositek commented 5 years ago

The splash screen is slightly affected, but understood, it is not performance-impacting...

Welcome to Ubuntu 19.04 (GNU/Linux 5.0.0-1006-raspi2 armv7l)
....
System information disabled due to load higher than 4.0

pelwell commented 5 years ago

The commits responsible are already reverted in the downstream rpi-5.0.y tree, but I think Ubuntu builds from upstream.

westonmyers commented 5 years ago

Arch Linux ARM builds from upstream, and Debian does as well.

In my opinion this isn't "cosmetic", as others have called it. Monitoring software, k8s orchestrators, etc. are coded such that this would be tripping alarms/messing with scaling 24/7. It just goes against general Linux administration. E.g. a quad-core system at a load of >4.0 indicates a fully loaded system (which simply isn't the case here). So I've had to compile and patch this reversion in for my boxes.
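The kind of check that trips on this can be sketched as follows. It is a hypothetical rule, not taken from any particular monitoring tool: the function names and the threshold of 1.0 per core are assumptions.

```python
import os


def load_per_core(load, cores):
    """Normalise a load average by the number of CPU cores."""
    return load / cores


def looks_overloaded(load, cores, threshold=1.0):
    """Naive alarm rule: at or above `threshold` runnable tasks per core."""
    return load_per_core(load, cores) >= threshold


def check_now(threshold=1.0):
    """Apply the rule to the 1 minute load average of the running system."""
    one_minute = os.getloadavg()[0]
    return looks_overloaded(one_minute, os.cpu_count() or 1, threshold)


# An idle quad-core Pi reporting a load of 4.0 because of the vchiq threads
# trips the alarm; the same board on 4.19 (load around 0.2) does not.
print(looks_overloaded(4.0, 4), looks_overloaded(0.2, 4))
```

A rule like this cannot tell four blocked kernel threads from four busy cores, which is why the inflated value matters to automated tooling even though no CPU time is consumed.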

Anyways, to help some of the non-contributors out here; below is a link to the discussion that has occurred in upstream showing their thought processes.

(Use "Next message (by thread):" to progress through.) [PATCH 0/3] staging: vchiq: use interruptible waits

lategoodbye commented 5 years ago

I hope nobody gets offended by my reply, but such messages make me grumpy. First, I apologize that I usually don't look at the load during tests; mostly I'm happy if linux-next still boots on BCM2835 (no irony or joke). The responsible commit was merged on 12.12.2018 with a lot of other vchiq stuff in STAGING. Until now there was no complaint about this on linux-rpi-kernel. After the release of Linux 5.0 it is becoming harder to get things fixed, and I cannot promise that Nicolas' series will be applied to stable. Linux development doesn't work without testers.

So here are my wishes to the upstream users for the future:

Thanks for using the upstream kernel

Xaositek commented 5 years ago

Have you gotten any traction on moving this forward into upstream kernel?

Xaositek commented 5 years ago

Appears we did get a kernel update in the repo but it doesn't have this fix integrated yet.

$ uname -a
Linux ubuntu-pi 5.0.0-1008-raspi2 #8-Ubuntu SMP Tue May 7 08:47:05 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux

lategoodbye commented 5 years ago

The fixes are now in staging-next and will be scheduled for Linux 5.3.

pelwell commented 5 years ago

See the reversions:
https://kernel.googlesource.com/pub/scm/linux/kernel/git/gregkh/staging/+/061ca1401f96c254e7f179bf97a1fc5c7f47e1e1
https://kernel.googlesource.com/pub/scm/linux/kernel/git/gregkh/staging/+/086efbabdc04563268372aaef4d66039d85ee76c
and the fix:
https://kernel.googlesource.com/pub/scm/linux/kernel/git/gregkh/staging/+/77cf3f5dcf35c8f547f075213dbc15146d44cc76

mritd commented 5 years ago

Is there any safe way to roll back to the 4.x kernel?

popcornmix commented 5 years ago

sudo apt install --reinstall raspberrypi-kernel raspberrypi-bootloader

should get you back to stable firmware/kernel (currently 4.19).

mritd commented 5 years ago

Unfortunately, I use arch os... I have not found a way to safely roll back to the old version of the kernel (I just installed the latest version of arch os, there is no cache of the previous version of the kernel installation package) 😂

popcornmix commented 5 years ago

I think you'll need to ask the arch maintainers for the recommended way.

You can use rpi-update <hash> to go back to a specific kernel version, or you can build your own, but I don't know if arch builds with additional config options or requires a specific kernel version for functionality, so I can't recommend that.

westonmyers commented 5 years ago

Again, your issue is distribution specific. That said, I'm a fellow Arch user so:

You're wanting the functionality of the Arch Linux Archive. The archlinuxarm project doesn't run an archive service though. Luckily, a kind community member does.

I would highly suggest treading carefully when dealing with package version mismatch on Arch though. It's very easy to end up in a non-booting state. So, you may want to focus on the "How to restore all packages to a specific date" part of that wiki page to get a scope of what all may be different involving that kernel/libs at X time.

Any more than this and you really need to be seeking out the Arch community at large as you wanting to move to a different kernel version isn't this repo's forte/what-have-you.

mritd commented 5 years ago

@westonmyers Thanks for your reply, I think I should wait for the 5.3 release of the kernel; or use the dd command to back up my arch system (I feel that there is a high probability of rollback failure😂)...

ropil commented 5 years ago

@westonmyers you are correct, it's not purely a cosmetic issue; the device is running hot, implying that perhaps the GPU, or something else computational, is active.

I haven't found any way to get the GPU load on the rpi; got vcgencmd to compile, but the tool didn't help me - perhaps someone more knowledgeable would be able to find out more? See nezticle - VideoCore-Tools and elinux.org

sensors report a stable temperature of >60C at idle (with 4.0 load from top), whilst other sources state that idle temperature at corresponding ambient should be <50C; clearly the device is computing, generating heat, and thus the issue is not at all cosmetic. See rpi 3 temperatures @ stackexchange.com

ropil commented 5 years ago

After a downgrade to a 4.19.x kernel, the issues with load and bad IO performance are resolved, but the perceived heat abnormality remains; this suggests that either the old consensus on an idle temperature of <50C does not apply in this case, or that there are invisible/unintentional operations active on the device at idle, even on 4.19.x kernels.

Experiments suggest that the VCHIQ issue is interfering with kernel internals or IO operations. I experimented with setting up a raid5 LVM on USB 2.0 attached pen-drives and exporting it via NFS over LAN, reaching write speeds of a mere 0.03 MB/s and problems with slow synchronization on the 5.x kernel. On the 4.19.x kernel, the same setup has write speeds exceeding 1.0 MB/s and the LVM raid5 synchronization is seamless, suggesting the VCHIQ issues are hampering either kernel internal operations or USB IO.

But even with the better performance, and 0 loads upon idle, the idle temperature is >60C, which is far above the historical (2016-2017) idle temperatures (<50C) reported elsewhere for Raspberry Pi 3. Now, given that my device is a Raspberry Pi 3B+, my idle temperatures might be higher due to differences in hardware. Contrary to this, my private logs state temperatures at <50C in January/February 2019, although I have yet to reproduce these idle temperatures. Only upon reproducing lower idle temperatures on the Raspberry Pi 3B+ can one be sure whether there are invisible/unintentional operations on the board.

$ /opt/vc/bin/vcgencmd measure_temp
temp=61.8'C
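For logging such readings over time, a minimal sketch could parse the `temp=...'C` line shown above. The function name is an assumption, and the only format handled is the one visible in this comment.

```python
import re


def parse_measure_temp(output):
    """Extract the temperature in degrees Celsius from a line like temp=61.8'C."""
    match = re.fullmatch(r"temp=([0-9]+(?:\.[0-9]+)?)'C", output.strip())
    if match is None:
        raise ValueError(f"unexpected measure_temp output: {output!r}")
    return float(match.group(1))


print(parse_measure_temp("temp=61.8'C\n"))
```

Raising on anything unexpected, rather than returning a default, keeps a logging script from silently recording wrong temperatures.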

The operating system used is the ArchLinuxARM distribution, two versions, provided from

sedlund commented 5 years ago

After testing a 5.2 Kernel from the latest Manjaro I can see with all 4 cores at 100% that my PI3B+ will not draw more than 2.77 watts (it idles at 2.46 watts) and not go over 55 deg C. CPU Benchmarks are much slower than a 4.19 kernel in Raspbian. In Raspbian with 4 cores at 100% it will pull 5.3 watts and reach temps over 65 deg C.

I am wondering if the false CPU load average is affecting the kernel scheduler allowing it to ramp up.

OpenSSL 1.1.1c, openssl speed -evp aes-128-cbc:
Manjaro 5.2: 27574.04k
Raspbian Buster 4.19.57-v7+: 56836.10k

~2-fold reduction in OpenSSL speed
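The arithmetic behind the "~2-fold" figure, using only the two numbers quoted above (the variable names are chosen here for illustration):

```python
# Throughput figures as reported by `openssl speed -evp aes-128-cbc` above.
manjaro_5_2 = 27574.04
raspbian_4_19 = 56836.10

slowdown = raspbian_4_19 / manjaro_5_2       # how many times faster 4.19 is
reduction = 1 - manjaro_5_2 / raspbian_4_19  # fraction of throughput lost on 5.2

print(f"{slowdown:.2f}x slower, {reduction:.0%} less throughput")
```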

ropil commented 5 years ago

> After testing a 5.2 Kernel from the latest Manjaro I can see with all 4 cores at 100% that my PI3B+ will not draw more than 2.77 watts (it idles at 2.46 watts) and not go over 55 deg C. [...] In Raspbian with 4 cores at 100% it will pull 5.3 watts and reach temps over 65 deg C.

@sedlund what wattage does your board draw when idle on Raspbian with 4.19 kernel? If not possible in Raspbian, if you could try the ArchlinuxARM distro linked above, for rpi-2; that one is at least idling at 0 loads according to top.

sedlund commented 5 years ago

> @sedlund what wattage does your board draw when idle on Raspbian with 4.19 kernel?

About 2.77 watts, but in that configuration it has an active USB WiFi device and the onboard WiFi is running hostapd.

lategoodbye commented 5 years ago

The initial regression should be fixed now upstream in 5.1 and 5.2.

Everything else should be discussed in a separate issue.

westonmyers commented 5 years ago

To be more specific, 5.2.1/5.1.18 has the reversion.

sedlund commented 5 years ago

I've tested 5.2.1; it resolves the abnormally high load average and the vchiq processes in D state.

It did not resolve the issue of the CPU frequency not ramping up to 1.4GHz though. It seems like it is stuck at 700MHz only drawing 2.8 watts under full 4 core 100% load.

vianpl commented 5 years ago

@sedlund CPU frequency scaling support will only be available in kernel version 5.3. So that was to be expected.
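Whether frequency scaling is available at all can be checked through the standard cpufreq sysfs directory. The sketch below assumes the usual path for CPU 0 and uses hypothetical helper names; on a kernel without a cpufreq driver the directory is simply absent, and the function returns None.

```python
from pathlib import Path

# Standard cpufreq sysfs location for CPU 0; absent without a cpufreq driver.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def khz_to_mhz(text):
    """Convert a sysfs frequency value (kHz, as text) to MHz."""
    return int(text.strip()) / 1000


def current_mhz(cpufreq_dir=CPUFREQ_DIR):
    """Return the current CPU 0 frequency in MHz, or None if cpufreq is unavailable."""
    path = cpufreq_dir / "scaling_cur_freq"
    if not path.exists():
        return None
    return khz_to_mhz(path.read_text())


print(current_mhz())
```

A result of None on 5.2 would be consistent with the statement above that scaling support only arrives in 5.3.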

ropil commented 5 years ago

@westonmyers the fix resolves the initial regression on ArchlinuxARM as well.

@lategoodbye if you know the place, can you direct me to the separate issue discussing high idle temperatures?

$ uname -a;echo;uptime;echo;sensors
Linux omega 5.2.1-1-ARCH #1 SMP Sun Jul 14 19:29:00 UTC 2019 aarch64 GNU/Linux

 10:37:15 up 21 min,  3 users,  load average: 0.00, 0.00, 0.03

cpu_thermal-virtual-0
Adapter: Virtual device
temp1:        +60.1°C  (crit = +80.0°C)

rpi_volt-isa-0000
Adapter: ISA adapter
in0:              N/A

EDIT: the temperature is power consumption dependent - removing USB-devices lowered temperature 2-4 C. I'll read up on the matter, post my results in appropriate channels, and link it here for reference to my own comments - for the sake of clarity.

lategoodbye commented 5 years ago

@ropil I'm not aware of a discussion about high idle temperatures. This repository is dedicated to the Raspberry Pi (downstream) kernel. Arch is using the mainline kernel for Aarch64. So there are two options:
1) Report this issue to linux-rpi-kernel@lists.infradead.org (moderated list)
2) Report this issue to https://archlinuxarm.org/forum/

It doesn't make sense to compare the downstream with the mainline kernel.

markind69 commented 5 years ago

Thanks for this info. I am troubleshooting this issue too, thanks for being on it:

~$ uname -a;echo;uptime;echo;sensors
Linux ausvrl8235 5.0.0-1006-raspi2 #6-Ubuntu SMP Thu Apr 11 18:04:26 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux

 15:47:45 up 7 days, 22:01,  1 user,  load average: 4.19, 4.05, 4.02

cpu_thermal-virtual-0
Adapter: Virtual device
temp1: +54.8°C

lategoodbye commented 5 years ago

Please close this issue, because the initial issue has been fixed up- and downstream.

Btw, a temperature together with just a kernel version is pretty pointless.

I need at least the following information for both the good AND the bad case (assuming both use the mainline kernel tree):
Kernel version
Kernel config (try to use the defconfig)
Firmware version
Raspberry Pi model
Connected devices

JamesH65 commented 4 years ago

Please update to the latest kernel which may contain a fix for this issue.