Open MichaIng opened 4 years ago
I can reproduce this. The backtraces in kernel seemed pretty random to me, so probably a clock/voltage issue, rather than a kernel bug.
> I can reproduce this.
You mean you "can" or you "can't" reproduce it? The issue is present with default clocks+voltage in my case, with only the minimum arm clock reduced and never ever any voltage warnings even when overclocked. We probably just found a second case with RPi Zero.
Probably related as well: https://www.raspberrypi.org/forums/viewtopic.php?p=1685668#p1685668
I can see the crash on a Pi3+. I couldn't provoke it on a Pi4. I added logging and last thing that occurred was a switch from ~300MHz to 1400MHz. Everything looked as expected (e.g. the core voltage was raised before the frequency). Possibly there is an issue with clocks/PLLs switching by large amounts (and perhaps overshooting), but that is just guessing currently.
Workaround for now is to disable the arm_freq_min. I'll let you know when it's safe to add back in.
It is only strange that before 5.4 the same large clock jump was never an issue: with the lowest and highest clocks as the only two pstates, the jump was always the largest possible.
One could test this by adjusting `arm_freq` and `arm_freq_min` so that only three pstates, as close together as possible, are available. There seem to be fixed clocks (900, 600, 450, 360, 300, ?, 200, 150), so this should be easy. This would rule out the intermediate pstates themselves as the issue, which at least was my very first guess.
Had the same problem on a model 1b from 2013 that runs off a battery bank and solar. Don't remember why I had arm_freq_min set so low, since it did not make a huge difference in power draw. When I updated yesterday to 5.4 and then rebooted, everything seemed to be fine. Ran top and within about 30 seconds the screen froze. I then used the default config.txt and after a reboot there was no more freezing when CPU load increased. Was going to continue troubleshooting today but checked the GitHub issues and bingo, MichaIng saved me some time, thank you.
May I express the urgency I see in resolving or working around this bug? This has the potential to destroy systems by causing file corruption in unconditionally crashed services, e.g. databases and similar.
E.g. `postinst` could remove/comment `arm_freq_min` at least on affected systems (all but RPi4 it seems). There was a package upgrade yesterday but I don't see a hint that it has been fixed or worked around.
Let me know if there is anything I can test to help getting this resolved quickly.
We have a workaround in latest rpi-update firmware that will disallow arm_freq_min below 600. A fuller fix is being worked on but will need more work/testing so will wait until after the stable release is settled.
Kind of a noob here: is this fix available now (using apt-update/upgrade)?
No, you will need to use rpi-update. No schedule on apt.
Ok thanks
And remember that this is only a workaround for users who are not yet aware of the issue. In your case you can simply set `arm_freq_min` to a value of 600 or higher to get the exact same result (regarding this specific issue).
So if I just comment out my arm_freq_min=300, it will go to the default 600, right? And I can change it back once a full update is out that fixes this? (I'm not using rpi-update, I'm waiting for apt, the warning scared me.)
> So if I just comment out my arm_freq_min=300, it will go to the default 600, right?
Yes exactly. On RPi1+Zero it's 700 MHz but all defaults work fine.
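For anyone who wants to script that workaround, here is a minimal sketch. The function name `comment_arm_freq_min` is made up for illustration, and the path `/boot/config.txt` in the usage line is taken from later in this thread; adjust both as needed and keep a backup of the file.

```shell
# Hypothetical helper (not part of any Raspberry Pi tooling):
# comment out every active arm_freq_min line in a config.txt-style file,
# so the firmware falls back to its default minimum frequency.
comment_arm_freq_min() {
    cfg="$1"
    tmp="$(mktemp)" || return 1
    # Prefix uncommented arm_freq_min lines with '#', leave all other lines untouched.
    sed -E 's/^([[:blank:]]*)(arm_freq_min=)/\1#\2/' "$cfg" > "$tmp" && cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Usage would be something like `comment_arm_freq_min /boot/config.txt` as root, followed by a reboot. Lines that are already commented are left alone.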
Maybe this is the wrong place to ask, but how do I know when a version is out that fixes this?
Subscribe to this issue, I'm sure we'll get a dev notice once a real fix is merged and I'll anyway keep an eye on it as well and search through release commits when I recognise them and will post here in case.
There is a proper fix for this as an internal PR. I'll let you know when it reaches rpi-update.
Latest rpi-update firmware contains a fix for this issue that doesn't involve limiting arm_freq_min. Please update and test.
Yep, seems to work fine. Just tested on RPi2 with `arm_freq_min=150`, which enables a lowest pstate of 200 MHz, and all states are used; no hang or crash until now:
2020-08-25 23:05:25 root@micha:~# cat /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state
200000 36921
225000 152
257142 195
300000 225
360000 282
450000 307
600000 1073
900000 8511
200 MHz seems to be the lowest supported frequency, right?
EDIT: 100 MHz works as well, whether reasonable or not:
2020-09-16 13:27:28 root@micha:~# cat /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state
100000 200567
200000 2108
300000 1295
400000 854
500000 721
600000 1684
700000 799
800000 738
900000 5749
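To make such `time_in_state` dumps easier to compare, the share of time per pstate can be computed. This is only a sketch: `pstate_share` is a made-up helper name, and it assumes the two-column "frequency in kHz, time counter" layout shown above.

```shell
# Hypothetical helper: print each frequency in MHz with its share of the
# total time, reading a file in the time_in_state format shown above.
pstate_share() {
    awk '{ f[NR] = $1; t[NR] = $2; total += $2 }
         END {
             for (i = 1; i <= NR; i++)
                 printf "%d MHz %.1f%%\n", f[i] / 1000, (total ? 100 * t[i] / total : 0)
         }' "$1"
}
```

On a Pi this would be called as `pstate_share /sys/devices/system/cpu/cpufreq/policy0/stats/time_in_state`.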
That's great! But as I understand, rpi-update is for pre-release stuff right? It will come to apt eventually?
Yes, `rpi-update` by default loads the current master branch, compared to the stable branch that matches the apt packages. It would be great if you could test it as well, but only if you have a full SD card backup that you can recover, just in case any issues appear.
It will come to apt but will take a while as we've just finalised a stable apt version and a commit like this (which affects clocks/plls) could do with some time on the testing branch.
Ok cool. Totally get that, I'm just happy it's getting a fix! I'll keep my eye on this thread for info on the apt release.
So I'm seeing another round of firmware updates in apt. Does this contain the fix? Thanks
No, only a single commit has been merged that has nothing to do with this issue: https://github.com/raspberrypi/firmware/commit/2b41f509710d99758a5b8efa88d95dd0e9169c0a Must have been an urgent one as well to justify a full firmware + kernel + bootloader upgrade for this single change.
Okay not sure if it is related, but with newest kernel I get another hang/crash:
Actually I just wanted to switch the governor back to performance, but that task hangs now as well. Service stops hang, every command that hangs cannot even be killed, the whole shell hangs, so I need to spawn a new shell (new SSH session/screen) to be able to investigate. Finally rebooted, which hangs as well.
Applied `performance` governor via reboot (power cycle), no issues since then. Will wait until tomorrow, then try again to remove `arm_freq_min`. I was also playing with `core_freq_min` and `gpu_freq_min`, could those cause similar issues? Although the kernel errors clearly refer to CPU scheduling.
I don't see any connection between that backtrace and this issue. You'll need to find a way of provoking this reliably. Then confirm if the issue still occurs without arm_freq_min. Also confirm if this issue is new to a recent rpi-update kernel/firmware (e.g. report the first version it occurs with and the last version it didn't occur with).
Very latest kernel/firmware, to test the initial_turbo+performance governor solution. Just reverted min frequencies to defaults, booted with performance governor, which gives 900 MHz (RPi2) now reliably with `initial_turbo`, then switched to `schedutil` governor, which works fine until now:
2020-10-01 12:21:26 root@micha:~# cat /sys/devices/system/cpu/cpufreq/policy0/{scaling_governor,/stats/time_in_state}
schedutil
600000 127355
700000 892
800000 831
900000 31905
2020-10-01 12:21:36 root@micha:~# uname -a
Linux micha.gnoedi.org 5.4.68-v7+ #1343 SMP Mon Sep 28 12:38:29 BST 2020 armv7l GNU/Linux
I'll leave it for a while, then re-enable `arm_freq_min=300` to see if that triggers the issues.
Definitely fine was `#1336`, which I used with `arm_freq_min=150` and `schedutil` governor for a long time. To run tests with the other issue, I rpi-update'd to `#1341`. Since it was to test the performance governor, with only very short sessions with other governors, I didn't recognise any issue there, but indeed I had a complete system crash a while later when bringing things back to production, leading to SD card corruption, which is why I flashed a new system from scratch with stable firmware packages. Due to much back and forth and testing I didn't think much about this, nor was I able to see any crash-related system log entries (analysing the SD card on an external system), but now I think it might be related to what I see just now after rpi-updating to `#1343`.
So if I can replicate, `#1336` - `#1341` are the versions to go through, but since the issue often took a while to become apparent, I'll let every version run for a longer time to be sure it's working, so it may take a while until I can narrow it down.
Okay, I was able to successfully replicate a system crash multiple times by running a CPU-intensive task; `mysqlcheck` worked well in this particular case, triggering the crash two out of three times. I had `dmesg -w` running, but sadly this time the crash broke SSH immediately, so I was not able to see any kernel errors. The symptoms are pretty much the same as before. Either tasks start to hang, producing this kind of random kernel errors, like `rcu: INFO: rcu_sched self-detected stall on CPU`, `tick_sched_handle`, `tick_sched_timer`, which are found in the first post's log and the new logs as well, plus a few other matches around CPU scheduling, or the system crashes completely with no chance of leaving any logs; even the persistent journal is empty.
This was `#1343` + `schedutil` + `arm_freq_min=300`, after `arm_freq_min=600` worked very well the whole night and half the day and `arm_freq_min=300` + `performance` worked well the other half of the day.
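When a log does survive, a quick way to check a saved `dmesg` or journal dump for the stall signatures named above could look like this. It is a rough sketch: `count_stall_lines` is a made-up name, and the three patterns are just the strings quoted in this thread, not an exhaustive list.

```shell
# Hypothetical helper: count log lines that match the CPU scheduling stall
# signatures quoted in this thread. Reads the given file(s), or stdin if none.
count_stall_lines() {
    grep -cE 'rcu_sched self-detected stall|tick_sched_handle|tick_sched_timer' "$@"
}
```

For example `dmesg | count_stall_lines`, or `count_stall_lines saved-dmesg.txt` on a log copied off the SD card. Note that `grep -c` prints `0` and returns a non-zero exit status when nothing matches.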
- `#1343` + `schedutil` + `arm_freq_min=300` => crashes easily when inducing some load or after a while
- `#1343` + `performance` + `arm_freq_min=300` => unbreakable
- `#1343` + `schedutil` + `arm_freq_min=600` => unbreakable
- `#1337` + `schedutil` + `arm_freq_min=300` => Wasn't able to break it first, but then it suddenly crashed on its own when I was doing another thing.
- `#1336` + `schedutil` + `arm_freq_min=300` => WIP... Stress test passed, letting it run through the night and tomorrow, followed by a few more `stress` and `mysqlcheck` tests just to be sure.
@popcornmix As of above testing, it is something between `#1336` and `#1337` that reintroduced the instability which was solved with `#1336`. `#1336` runs rock solid through a bunch of stress tests, including and excluding RAM, SD card and USB HDD disk writes, while `#1337` breaks with pretty much identical symptoms as reported originally with this thread. Probably something related to the fix has been reverted accidentally? Based on my limited insights, https://github.com/raspberrypi/linux/pull/3815 and the related commits (not directly part of the PR) around CPUfreq seem to be the only changes that could have affected it.
@MichaIng is all your testing with a Pi2?
https://github.com/raspberrypi/linux/pull/3815 does enable additional arm frequency points on all Pi models.
I suspect prior to 3815 300MHz wasn't being used when idle (check with `vcgencmd measure_clock arm`)
Is the only non-default setting in config.txt arm_freq_min=300 ?
- `RPi 2`, yes.
- `300 MHz` was used before as well: https://github.com/raspberrypi/firmware/issues/1431#issuecomment-680270601
2020-10-02 14:25:06 root@micha:/var/log# echo powersave > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
2020-10-02 14:28:27 root@micha:/var/log# vcgencmd measure_clock arm
frequency(48)=300000000
This cannot be seen with the `schedutil` governor and `vcgencmd measure_clock arm`, since the command itself draws enough CPU time to have the frequency raised to 600 MHz or 900 MHz already. But is there any reason to suspect it's not being used, when `scaling_available_frequencies` and stats show it being available and used, and `powersave` + `vcgencmd` does?
2020-10-02 14:33:29 root@micha:/var/log# grep -Ev '^[[:blank:]]*(#|$)' /boot/config.txt
hdmi_ignore_hotplug=1
framebuffer_width=16
framebuffer_height=16
max_framebuffer_width=16
max_framebuffer_height=16
framebuffer_depth=8
disable_overscan=1
gpu_mem_256=16
gpu_mem_512=16
gpu_mem_1024=16
disable_splash=1
enable_uart=0
temp_soft_limit=50
temp_limit=65
initial_turbo=20
over_voltage=-2
arm_freq=900
gpu_freq=100
core_freq=450
sdram_freq=450
dtparam=sd_overclock=100
arm_freq_min=300
dtoverlay=micha
force_eeprom_read=0
- `over_voltage=-2` and `gpu_freq=100` were commented before without preventing the crash on `#1343` and `#1337`.
- […] `#1336` without causing any issues.
- `dtoverlay=micha` disables `vcsm`, which fails to start and causes kernel errors on boot anyway with `gpu_mem=16` on a completely headless device.
- […] `arm_freq_min=300` once and test it again with `#1337` or `#1343`. Any preference?
You should remove all overclock settings for a fair test. Over-clocking/under-volting comes with no guarantees.
Test with just `arm_freq_min=300` and nothing else. I think you'll probably still see the issue, but the other settings are muddying the water.
I have been poking around a Pi1 and have seen a hang. It looks like rapidly switching powersave/performance governors (300MHz<->700MHz) is safe. I have found switching from 500MHz to 700MHz can fail (and that can happen with the ondemand governor after https://github.com/raspberrypi/linux/pull/3815).
arm_freq_min=500 may prove to be a more reliable setting for provoking failure (at least on Pi1).
Little addition, in case it counts: it's a `Raspberry Pi 2 Model B Rev 1.1`, i.e. `BCM2709`.
Okay, I'll do some cleaner tests. Interesting that it is only on certain frequency jumps. A way to test specific frequency jumps repeatedly without changing `arm_freq_min`:
cd /sys/devices/system/cpu/cpufreq/policy0
echo userspace > scaling_governor
while :; do for i in 500000 700000; do echo $i > scaling_setspeed; done; done
or put a short sleep inside. I'll play around with some methods.
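A bounded variant of the loop above, with the sleep included, might look like the following. It is only a sketch: `toggle_freq` and its parameters (policy directory, round count, pause) are made up for illustration, and the fractional default pause assumes a `sleep` that accepts fractions, as GNU coreutils does.

```shell
# Hypothetical helper: switch to the userspace governor and toggle between
# two frequencies (in kHz) a fixed number of times, pausing after each write.
# Arguments: policy_dir low_khz high_khz [rounds] [pause_seconds]
toggle_freq() {
    policy_dir="$1"; low="$2"; high="$3"
    rounds="${4:-100}"; pause="${5:-0.1}"
    echo userspace > "$policy_dir/scaling_governor" || return 1
    i=0
    while [ "$i" -lt "$rounds" ]; do
        for khz in "$low" "$high"; do
            echo "$khz" > "$policy_dir/scaling_setspeed" || return 1
            sleep "$pause"
        done
        i=$((i + 1))
    done
}
```

For the jump discussed here that would be `toggle_freq /sys/devices/system/cpu/cpufreq/policy0 500000 700000 1000 0.05`, run as root. Unlike the endless loop, it stops on the first failed write and returns on its own.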
Looks like you can increase `scaling_min_freq` and decrease `scaling_max_freq`, but it considers the initial settings as the absolute limits.
root@domnfs:/sys/devices/system/cpu/cpufreq/policy0# cat scaling_min_freq
600000
root@domnfs:/sys/devices/system/cpu/cpufreq/policy0# echo 700000 > scaling_min_freq
root@domnfs:/sys/devices/system/cpu/cpufreq/policy0# cat scaling_min_freq
700000
root@domnfs:/sys/devices/system/cpu/cpufreq/policy0# echo 300000 > scaling_min_freq
root@domnfs:/sys/devices/system/cpu/cpufreq/policy0# cat scaling_min_freq
600000
The numbers are in kHz. You should be using 700000 rather than 700.
How was I able to overlook that? I removed/commented the related parts from my posts so as not to mix them up with the actual issue this is about. `scaling_min_freq` + `scaling_max_freq` are working fine as expected and might serve as another way to avoid a reboot for changing `arm_freq_min`.
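Since a request below the initial limit is silently replaced (300000 came back as 600000 above), it helps to read the value back after writing it. A small sketch, with `set_min_freq_mhz` being a made-up helper that also does the MHz to kHz conversion:

```shell
# Hypothetical helper: request a minimum frequency given in MHz, print the
# value the kernel reports afterwards (kHz), and return non-zero if the
# request was not taken over as-is.
set_min_freq_mhz() {
    policy_dir="$1"; mhz="$2"
    khz=$((mhz * 1000))   # sysfs expects kHz, e.g. 700000 rather than 700
    echo "$khz" > "$policy_dir/scaling_min_freq"
    actual="$(cat "$policy_dir/scaling_min_freq")"
    echo "$actual"
    [ "$actual" = "$khz" ]
}
```

For example `set_min_freq_mhz /sys/devices/system/cpu/cpufreq/policy0 700` as root.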
I found some time for testing, reverted to kernel `#1343` and a `config.txt` containing `arm_freq_min=300` only, nothing else. Applied `userspace` and ran something like this for a while:
while :; do for i in 500000 700000; do echo $i > scaling_setspeed; done; done
With different frequency combinations, with and without adding some `sleep` after each change, and verified on a different terminal via `vcgencmd measure_clock arm` and `scaling_cur_freq` that the frequency is indeed changing (not too fast or such), but so far I couldn't trigger any error. I also tried to run some `stress` in parallel.
I'm going through a few more combinations, but perhaps manual frequency changes simply cannot trigger it, and only CPUFreq-induced changes via a governor can?
Finally I will make sure that I'm able to trigger the crash again via `schedutil`.
I'm still seeing this against 5.4.69-v8 (with associated latest rpi-firmware)
Hashes:
[ 484.329877] Workqueue: events_freezable mmc_rescan
[ 484.331478] Call trace:
[ 484.333067] __switch_to+0x110/0x1d0
[ 484.334669] __schedule+0x2f4/0x750
[ 484.336254] schedule+0x44/0xe0
[ 484.337836] __mmc_claim_host+0xb8/0x210
[ 484.339889] mmc_get_card+0x38/0x50
[ 484.343794] mmc_sd_detect+0x24/0x90
[ 484.347691] mmc_rescan+0xc8/0x390
[ 484.351593] process_one_work+0x1c0/0x470
[ 484.355503] worker_thread+0x50/0x430
[ 484.358914] kthread+0x11c/0x150
[ 484.362033] ret_from_fork+0x10/0x20
This particular issue is so bad, I can't even use the Pi4 currently.
With schedutil + arm_freq_min=600 + no overclocking or overvolting options.
With performance + arm_freq_min=600 + no overclocking.
Which model of Raspberry Pi? Raspberry Pi 4 Model B Rev 1.1
Which OS and version (cat /etc/rpi-issue)? Buildroot
Which firmware version (vcgencmd version)? version 4672d0274057d726f3a327e2b3fe76f831b811bb (clean) (release) (start_x)
Which kernel version (uname -a)? Linux pi4-2 5.4.69-v8
I also have set these sysctl values:
vm.dirty_background_ratio = 5
vm.dirty_ratio = 10
With performance + arm_freq_min=700 + no overclocking: still seeing the issue.
With performance + no custom config.txt options + "stable" rpi-firmware: version 4439d2aaa6c376a2d1ef4402f142e1cf4de37c43 (clean) (release) (start_x): same issue with mmc_rescan errors.
Maybe my SD card is just shot? Will retry a new one.
@paralin does your problem occur with no custom arm_freq_min setting in config.txt? If it doesn't then removing that setting is recommended. If it does then your problem is unrelated to this issue.
@popcornmix even with defaults:
[ 363.486818] INFO: task kworker/3:1:58 blocked for more than 120 seconds.
[ 363.488636] Tainted: G C 5.4.69-v8 #11
[ 363.490429] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 363.492282] kworker/3:1 D 0 58 2 0x00000028
[ 363.494147] Workqueue: events_freezable mmc_rescan
[ 363.496002] Call trace:
[ 363.497846] __switch_to+0x110/0x1d0
[ 363.499697] __schedule+0x2f4/0x750
[ 363.501536] schedule+0x44/0xe0
[ 363.503386] __mmc_claim_host+0xb8/0x210
[ 363.505241] mmc_get_card+0x38/0x50
[ 363.507091] mmc_sd_detect+0x24/0x90
[ 363.508936] mmc_rescan+0xc8/0x390
[ 363.510759] process_one_work+0x1c0/0x470
[ 363.512584] worker_thread+0x50/0x430
[ 363.514388] kthread+0x11c/0x150
[ 363.516185] ret_from_fork+0x10/0x20
testing on a second host w/ different brand SD card right now, to check if I see the same issue.
Edit: OK with a second pi4 and a brand new SD card but identical software I'm not seeing the issue (latest firmware). Will continue to stress it and see if it pops up. Possibly the errors I'm seeing are just a broken SD card.
@paralin Looks like a different issue then. Also I don't see any matches (aside from the blocked task, which is an unspecific symptom of the underlying issue) between my kernel stack traces and yours. I'm not good at reading them, but mine contain a lot of words related to CPU and scheduling and yours seems to be an issue with the SD card.
I now have the "issue" that I cannot replicate the issue so far with the schedutil governor and 300 MHz minimum frequency. After giving up trying to break it with the methods that worked reliably before, I now switched to "production" (full SD card backup created) and still nothing for 1.5 days. Will now re-add the `config.txt` settings I used before one by one (more or less) and see if/which one re-adds the instability. I am pretty sure I ruled out the lowered voltage (`-2`) before, but maybe I made a mistake during all that back and forth. Would be sad if the new scheduling pstates required a higher voltage for some reason.
Okay I think I found the reason for the recent instability. Until crash, I see:
On every error it clocks down MMC until the default 50 MHz, but the errors go on, e.g. on larger file writes, and it occurs even without SD overclocking. Extremely strange was the fact that it even happens with scaling governor `conservative` (or any other non-static one) and `scaling_min_freq` = `arm_freq` = 900 MHz, hence a static frequency, confirmed via `time_in_state` and `vcgencmd`.
As soon as the scaling governor is switched to `powersave` or `performance`, all works stable; as soon as `arm_freq_min` is not set (default 600 MHz), all works stable as well. I found a way to very reliably trigger the above error, followed by hanging until the session crashes.
I'm a bit out of ideas why only the setting itself has an effect and why a non-static governor causes reproducible issues even if the frequency does not change at all.
But I finally remembered that I blacklisted `uio_pdrv_genirq` in an effort to find out what is required for what, and that did not cause any issues over the last year or so. But since it has something to do with IRQ handling and the error was an interrupt timeout, I enabled the module and, voilà, was unable to trigger the error anymore.
No idea what changed and still a bunch of open questions as of the above tests, but I think there is a good reason why the module is enabled by default. I'll continue to test and verify the results but it seems stable now.
Now on 5.4.72-v7+ `#1356` btw.
I tried the lower min frequencies, 100 MHz and 200 MHz, which again led to complete crashes within a few minutes without any special trigger and without any kernel or syslog error that I would have been able to see after reboot.
I raised the min frequency to the previously stable 300 MHz (pre-5.4) and while it was stable for a longer time, during nightly backups I got the following error followed by a hanging `rsync`, which I was not able to kill (even SIGKILL has no effect), similar to the issues at the very beginning of this report, but the error message is quite different now, during a file write:
I'm trying to rule out some other factors and am quite puzzled how all of this belongs together (or not). All I can say is that an untouched `arm_freq_min` or `performance` both lead to a rock-stable system, regardless of kernel modules and under-voltage (I usually reduce to `-2`, min and max).
I saw there was another release. Did this fix this issue?
@vmachiel Yes, if you do not lower your `over_voltage` it seems stable now. As I'm customising my system quite a lot, for testing and lowering load/power usage to an absolute minimum (within the limits of the official kernel), I'm still facing some instabilities as below:
I have a stable system even with `arm_freq_min=100`, for which `over_voltage` must not be lowered. The same is true for any other lowered `arm_freq_min` on my RPi 2, while this worked well with kernel 4.19 and 4.14, with `arm_freq_min=300` and `over_voltage=-2`. I'm currently testing if `over_voltage_min=-2` works, which makes much more sense anyway when lowering the minimum frequency.
EDIT: `over_voltage_min=-2` + `arm_freq_min=100` works stable.
This is basically fine, as lowering voltage is always a matter of trial and error, since each individual SoC behaves slightly differently and I assume other hardware states affect it as well. However, that intermediate frequency scaling requires a higher voltage is a bit sad with regard to power savings and lowering temperature, as I was hoping for an overall lower power consumption with intermediate frequencies compared to the single-step jumps from min to max before. But I'm not sure if it's worth investigating this further.
Additionally the `uio_pdrv_genirq` kernel module needs to be active now (it is loaded by default), else I get the mmc interrupt timeouts, which can be triggered easily by saving/writing some larger text/config files. I also never faced this over the last year on 4.19, where I had this module blacklisted to test if/where it's needed. Using a static CPU governor solves it as well somehow. I'll again run a test with this isolated with default `arm_freq_min`, voltage etc., as I believe it is also only an issue in combination with lowered voltage.
Thanks, let me know!
@popcornmix Do you actually want to further investigate this issue? From my point of view, there is still something not as it should be, but I have limited insights to know what may affect what. I tested a few other combinations, and a few lead to errors or crashes quickly (within an hour, sometimes easy to trigger quickly with some load) while the others run stable for days, including stress tests, and all of them ran stable for months on Linux 4.19.X. Notably I narrowed down everything to the following cases, all with a dynamic CPUFreq governor, of course:
1. `over_voltage=0` + `arm_freq_min=100` => stable
2. `over_voltage=-2` + `arm_freq_min=600` (or commented) => stable
3. `over_voltage=-2` + `arm_freq_min=500` => hangs or crashes quickly, with different errors, either producing kernel errors + hangs, which can be solved by raising `scaling_min_freq` or applying any static CPU governor (`powersave`, `performance`, `userspace`), or crashing the system completely without any kernel error, where I used `dmesg -w` on a dedicated SSH session + persistent `journald` logs.
_I think the `uio_pdrv_genirq` module I had in view before does not really have an effect, but loading it might have reloaded something else around kernel/IRQ, just shifting the symptoms of the underlying issue, or so. As far as I understand, that kernel driver is either used actively by another module or service, or not, and in my case it is simply not used._
Since `over_voltage` has no effect on the voltage when the lowest frequency (e.g. 500 MHz in the 3rd case) is currently applied, it should not have an effect; at least the voltage + frequency combinations in the 2nd and 3rd cases above are exactly the same. So probably it has something to do with the way/order/timing in which voltage and frequency are adjusted, e.g. the frequency is raised quickly while the voltage is raised later, leaving a short time frame with a too low voltage for the current frequency. But in the above cases, since `over_voltage_min` is the same, the 1st case should then have the same issue. So probably it is about a short voltage peak demand when doing a certain larger step: the 1st case allows that (1.35 V), the 2nd does not require it (peak demand lower), and the 3rd case's frequency switch requires a higher voltage for a short time, but that is capped to an insufficient 1.3 V. That is all I can imagine to explain the results.
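The two voltages mentioned here follow from simple arithmetic, sketched below. It assumes 1.35 V at `over_voltage=0` (the figure used in this comment) and 0.025 V per `over_voltage` step; `core_voltage_mv` is a made-up helper name, and the result is the nominal cap, not a measured voltage.

```shell
# Hypothetical helper: nominal core voltage cap in millivolts for a given
# over_voltage value, assuming 1350 mV at over_voltage=0 and 25 mV per step.
core_voltage_mv() {
    echo $((1350 + 25 * $1))
}
```

`core_voltage_mv 0` gives 1350 and `core_voltage_mv -2` gives 1300, i.e. the 1.35 V and 1.3 V caps compared above.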
However, I can understand if this is not followed further, since a stable `over_voltage` is something that needs to be found/tested on every system individually anyway and the default value works stable, at least in all my tests. While I'm not happy with that situation, say a word and I'll stop investing time and keeping this issue up here.
I just recognised https://github.com/raspberrypi/firmware/commit/bff705fffe59ad3eea33999beb29c3f26408de40. Needed to learn what a VCO is. I'll try it ASAP.
Meanwhile my Pi runs stable with `over_voltage=-1` + `arm_freq_min=100` + `over_voltage_min=-2` + `schedutil` governor, let's see if `over_voltage=-2` is possible again with the new kernel.
Hmm, in my case `arm_freq_min` is now ignored, at least with the previously tested minimum:
2020-11-23 21:36:37 root@micha:~# grep arm_freq_min /boot/config.txt
arm_freq_min=100
2020-11-23 21:36:48 root@micha:~# vcgencmd get_config arm_freq_min
arm_freq_min=600
2020-11-23 21:37:01 root@micha:~# uname -a
Linux micha.gnoedi.org 5.4.79-v7+ #1373 SMP Mon Nov 23 13:22:33 GMT 2020 armv7l GNU/Linux
Same with `arm_freq_min=300` and removed `over_voltage` settings.
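The manual check above (grep the file, then ask the firmware) can be wrapped into one comparison. This is a sketch under assumptions: `check_arm_freq_min` is a made-up helper, it treats the last `arm_freq_min=` line in the file as the configured one, and it expects the second argument in the form printed by `vcgencmd get_config arm_freq_min`.

```shell
# Hypothetical helper: compare arm_freq_min from a config file with the value
# the firmware reports. Assumes the last matching line in the file counts.
# Arguments: config_file effective_line (e.g. "arm_freq_min=600")
check_arm_freq_min() {
    cfg="$1"; effective="$2"
    configured="$(grep -E '^[[:blank:]]*arm_freq_min=' "$cfg" | tail -n 1 | tr -d '[:blank:]')"
    if [ "$configured" = "$effective" ]; then
        echo "applied: $effective"
    else
        echo "ignored: configured '$configured', effective '$effective'"
        return 1
    fi
}
```

On the Pi this would be `check_arm_freq_min /boot/config.txt "$(vcgencmd get_config arm_freq_min)"`, which for the situation shown above reports the setting as ignored.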
Is this issue still being worked on? I have recently purchased a Pi Zero (non-wifi), running DietPi 7.0 with kernel Linux DietPi 5.10.17+ #1403 Mon Feb 22 11:26:13 GMT 2021 armv6l GNU/Linux. Setting arm_freq_min seems to be ignored (tried ondemand & conservative governor) and it is always set to 700-1000 MHz, as MichaIng reports above.
We are also still waiting for the resolution of this issue.
We're using rpi2 v1.2 and used to change this in config.txt:
arm_freq_min=350
core_freq_min=150
temp_limit=70
dtparam=i2c_arm=on
dtparam=spi=on
dtoverlay=i2c-rtc,ds3231
dtoverlay=sc16is752_0
enable_uart=1
No over/undervoltage and no other changes in config.txt. Standard governor, never touched these settings. Normally used in headless mode, booted to console, not logged in. No desktop loaded at startup.
Kernel 4.9.35-v7+ # 1014 works as expected, so we know the hardware part is able to manage it.
We started having problems when we upgraded to the kernel 4.19.80-v7+ # 1275, and they were never resolved in kernel 5.x either. The latest kernel simply ignores the arm_freq_min parameter, and this results in increasing the CPU temperature by about 3 to 5 °C during normal operation with no load on the CPU (vcgencmd measure_temp arm). Unfortunately this also leads to a much faster rise in CPU temperature as the load rises.
If it helps: on kernel 4.19.80 there is a temperature range that causes the issue more often, particularly during the reboot phase, which raises the CPU temperature by a few more degrees. To reproduce, we usually stress the CPU to a temperature range from 55 to 65 °C and then launch a reboot command. Most of the time the shutdown process is ok, but it is not able to boot up again; it stops before the RGB splash screen. Sometimes the red and the green LEDs are fixed on, sometimes only the red LED is on. I also checked all the possible outputs on UART by enabling it with the command sed -i -e "s/BOOT_UART=0/BOOT_UART=1/" bootcode.bin, but I have never seen a useful message.
It seems that during the boot phase the kernel reads the frequency and temperature settings from the config.txt file; since it is already hot, it then starts throttling the CPU accordingly, but this breaks the boot process (CPU, SD card reader or something else).
In conclusion: We need to lower the idle frequency down to 350 MHz, if possible even less. Is there an ETA for solving this issue?
Thank you.
`arm_freq_max` is ineffective, as expected; `arm_freq` is the maximum frequency already.
The Raspberry Pi Zero has a default of 1000 MHz, so if that causes issues, e.g. when you simply remove or comment the two lines and it fails to reboot, then there seems to be an issue with the hardware. But that is not related to the `arm_freq_min` topic this issue is about.
Describe the bug
I was upgrading to the newest firmware + kernel packages, which resulted in system hangs and/or crashes. I narrowed down the issue to `arm_freq_min`, which I lowered to `150` or `300` (tested both) to allow the system to clock below 600 MHz. Commenting the setting leads to a stable system, setting/reducing it leads to a quickly hanging or crashing system.
To reproduce
- Upgrade a `Raspberry Pi 2 Model B Rev 1.1` to current package release `5.4.51-v7+`.
- Set `arm_freq_min` to `300` (`gpu_mem=16`, if relevant).
- […] `vcgencmd measure_clock gpu`.
Expected behaviour
Add a clear and concise description of what you expected to happen.
Actual behaviour
Setting `arm_freq_min` to `300` should not lead to system crashes.
System
Copy and paste the results of the raspinfo command in to this section. Alternatively, copy and paste a pastebin link, or add answers to the following questions:
- Which model of Raspberry Pi? `Raspberry Pi 2 Model B Rev 1.1`
- Which OS and version (`cat /etc/rpi-issue`)? `Raspbian GNU/Linux bullseye/sid`
- Which firmware version (`vcgencmd version`)? `version 21a15cb094f41c7506ad65d2cb9b29c550693057 (clean) (release) (start_cd)`
- Which kernel version (`uname -a`)?
Logs
Additional context
This is new and probably the reason for the crashes when lowering minimum frequency. When leaving at 600, there are only two pstates 600 and 900 and with kernel 4.19 there are always only two. I was actually hoping for that feature, so great work, however sadly at least my RPi model does not work fine with it.