Open pb66 opened 3 years ago
Your expectation is correct - if the mini-UART (UART1) is enabled in Device Tree, e.g. with enable_uart=1
then the core clock should be at a fixed frequency in order to guarantee a stable baud rate.
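To illustrate why the baud rate depends on the core clock, here is a minimal sketch using the mini-UART relation from the BCM2835 peripherals documentation, baud = core_clock / (8 * (divisor_register + 1)). The helper names are hypothetical, and the sketch assumes the divisor is programmed once (rounded to nearest) for a 250MHz core clock and not updated when the clock later changes. The 38400 baud and 400MHz figures are the ones reported elsewhere in this thread.

```shell
#!/bin/sh
# Mini-UART baud arithmetic: baud = core_clock / (8 * (reg + 1)).
# Helper names are hypothetical; rounding to nearest is an assumption.

# Divisor register value for a given core clock (Hz) and target baud.
miniuart_divisor() {
    core_hz=$1; baud=$2
    echo $(( (core_hz + 4 * baud) / (8 * baud) - 1 ))
}

# Baud rate actually produced by a register value at a given core clock (Hz).
miniuart_baud() {
    core_hz=$1; reg=$2
    echo $(( core_hz / (8 * (reg + 1)) ))
}

reg=$(miniuart_divisor 250000000 38400)
echo "divisor register: $reg"
echo "baud at 250MHz:   $(miniuart_baud 250000000 "$reg")"
echo "baud at 400MHz:   $(miniuart_baud 400000000 "$reg")"
```

Under these assumptions the register value is 813, giving about 38390 baud at 250MHz but about 61425 baud at 400MHz, which is far enough out for a 38400 baud receiver to see only garbage.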
Aside from a regression in clock management (not demonstrated, but not ruled out), another possibility is that the CPU frequency is being capped due to over-temperature or under-voltage. Can you try the following?:
1. Report the output of vcgencmd get_throttled after the corruption has been seen.
2. Run vcgencmd measure_clock core repeatedly (with sleeps if in a loop) when the Pi is busy (while true; do true; done & should be enough) and when idle, and report your observations.

Looking back at the email notification which included your kernel log I see:
[ 47.514784] Under-voltage detected! (0x00050005)
Perhaps this is a factor in the problem.
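For reference, the value in that log line can be decoded with a small sketch like the following. The function name is hypothetical; the bit meanings are those documented for vcgencmd get_throttled, where the low bits describe the current state and bits 16 and up record that the condition has occurred since boot.

```shell
#!/bin/sh
# Decode a throttled bitmask such as 0x00050005.
# decode_throttled is a hypothetical helper, not part of vcgencmd.
decode_throttled() {
    value=$(( $1 ))
    while read -r bit meaning; do
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            echo "bit $bit: $meaning"
        fi
    done <<'EOF'
0 under-voltage detected
1 ARM frequency capped
2 currently throttled
3 soft temperature limit active
16 under-voltage has occurred
17 ARM frequency capping has occurred
18 throttling has occurred
19 soft temperature limit has occurred
EOF
}

decode_throttled 0x00050005
# On a Pi, assuming the usual throttled=0x... output format:
#   decode_throttled "$(vcgencmd get_throttled | cut -d= -f2)"
```

For 0x00050005 this lists bits 0, 2, 16 and 18: under-voltage and throttling, both current and previously seen.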
Thanks for the speedy reply. I will try these tests later and report back.
A little more info on this application: this Pi is monitoring a few things and never breaks a sweat. It just ticks over 24/7 with a pretty steady load; the only real exception is when I ssh in for a bit of tinkering, to update the OS or to pull in some changes from GitHub. It doesn't even have a server or database, it just forwards data elsewhere.
I won't entirely rule out the PSU (under-voltage), but I do prefer higher quality, oversized PSUs, although I haven't tested this particular PSU recently. However, the "rfm2pi" board I have on the GPIO has an rfm module on it, and these rfm2pi boards are known to have issues with the rfm module browning out at the slightest wisp of a low voltage. It is powered from the 3v GPIO pin and only receives, not transmits, so the load it presents to the RPi is pretty negligible and also constant. I have to say I see the under-voltage warnings occur on even the most stable of setups; that doesn't mean they can be ignored of course, I just wasn't surprised or alarmed by their presence. With the rfm module happily working throughout I am pretty confident the supply is fine, but will of course investigate and confirm.
Report the output of vcgencmd get_throttled after the corruption has been seen.
This is a permanent state, it doesn't come and go, the serial is garbage until I set the core_freq and reboot, likewise it works perfectly until I remove that setting again.
when the Pi is busy
It's never busy. Are you asking me to artificially load it up for a test, or for the test to be done at its busiest (i.e. anytime)?
Ok so I've done some remote tests, I will have to wait til I'm there to do the PSU and alternative sd image tests, maybe this evening or tomorrow.
Firstly, I have to declare I was wrong about the loading, one of the things monitored was offline causing the script to retry and increase the load significantly. I was made aware of this by the fact the clock core is actually increasing to 400MHz rather than being throttled when I remove core_freq=250
and reboot.
Just a few seconds without core_freq=250
pi@RedPi:~ $ while true; do echo $(vcgencmd measure_clock core); sleep 1; done
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=400000000
frequency(1)=400000000
frequency(1)=400000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=400000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=400000000
frequency(1)=400000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=400000000
frequency(1)=400000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=400000000
frequency(1)=400000000
frequency(1)=400000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=399999000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=399999000
That led me to check top to see why it was increasing, and I found a service hogging all the CPU because its endpoint was offline. So whilst this may change the info I've given, I would still expect core_freq to be fixed at 250MHz due to enable_uart=1, and I suspect the under-voltage to be less of a concern if it is not throttling. I will still do some checks, as I have confirmed that vcgencmd get_throttled reports 0x5000 immediately after booting, so I will recheck once I have checked the voltages manually, swapped out the PSU and sorted/stopped the overworking service. I also notice the wifi is a little sketchy today, which may be adding some load.
With core_freq=250
the same test output is static at frequency(1)=250000000
.
It would be good to confirm which firmware is running - vcgencmd version. One of the not-so-recent changes had the effect of changing the "locked" frequency to the upper limit (core_freq) rather than the lower limit (core_freq_min). This is good for performance, but does mean that throttling will change the frequency.
pi@RedPi:~ $ vcgencmd version
Jan 27 2021 22:26:53
Copyright (c) 2012 Broadcom
version 99d9a48302e4553cff3688692bb7e9ac760a03fa (clean) (release) (start)
but does mean that throttling will change the frequency.
Surely throttling would mean a downward change? If "fixed" at 250MHz and then gets overruled by throttling that would be lower than 250MHz rather than 400MHz, no? Or are you saying the incorrect "locked freq" from enable_uart=1
when NOT using core_freq=250
is 400MHz and that is then being throttled TO 250MHz?
Just a brief note to say I stopped the service hogging the cpu and rechecked the core freq and with just enable_uart=1
(not core_freq=250
) it's a steady(ish) 250MHz
pi@RedPi:~ $ while true; do echo $(vcgencmd measure_clock core); sleep 1; done
frequency(1)=250000000
frequency(1)=249999000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
frequency(1)=250000000
and the serial comms via the Mini UART seem stable.
So I can no longer be sure this is definitely a new bug as it would appear to have only surfaced due to the abnormally high load put on the cpu by the demanding service.
I have noticed that even in this low load test there are a fair few frequency(1)=249999000 appearances (typical! I just spotted one in the output above) that never seem to appear when core_freq=250 is set. In practice I know this will not be an issue, but it does suggest that, even under light loading, the core freq isn't being fixed by enable_uart=1.
I will hopefully get to do the physical checks later today.
Having had a chance to test this I find it to be behaving as I would expect, which is:
1. With enable_uart=1 the core frequency is locked. It used to be locked to the minimum (core_freq_min), but that is a bit limiting on some systems so for the last year or so it has locked to the maximum (core_freq) instead.
2. The variation in the output of vcgencmd measure_clock core (e.g. between 250000000 and 249999000) is just a sampling artifact. measure_clock does exactly that, attaching a counter to the specified clock, running it for 1 millisecond and multiplying the result by 1000; changes in the answer of +/-1000 depending on the phase of the chosen clock and the precise time of the count are to be expected.

The change from min to max frequency locking has performance advantages, but with a few potential drawbacks:
Although I consider throttling to be an error condition - the Pi should have more power or better cooling - and think that some data loss is acceptable in that state, it would be better if the UART clock divisor could be adjusted to allow some control.
Having said that, the question in your case is why the frequency is hovering at 250MHz rather than 400MHz. What do the following commands return:
$ grep . /sys/devices/system/cpu/cpufreq/policy0/*
$ grep -v -E "^(#.*)?$" /boot/config.txt
$ vcgencmd measure_temp
$ vcgencmd measure_volts
$ dmesg | grep throttle
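The measure_clock sampling artifact described above can be illustrated with a toy model. This is only a sketch under stated assumptions: clock edges fall at (k + phase) / f seconds, the counter counts the edges inside a 1 millisecond window, and the true frequency (249999600 Hz here, an invented value) is not an exact multiple of 1 kHz. The function name is hypothetical and nothing here reflects the firmware's actual implementation.

```shell
#!/bin/sh
# Toy model of a 1 ms edge count multiplied by 1000.
# $1: true frequency in Hz (a multiple of 100, to keep the arithmetic integral)
# $2: clock phase at the start of the window, in tenths of a cycle (0-9)
simulated_measure_clock() {
    true_hz=$1; phase_tenths=$2
    cycles_tenths=$(( true_hz / 100 ))   # cycles elapsed in 1 ms, in tenths
    # Whole edges inside the window: ceil((cycles - phase) / 1 cycle)
    edges=$(( (cycles_tenths - phase_tenths + 9) / 10 ))
    echo "frequency(1)=$(( edges * 1000 ))"
}

for phase in 0 2 4 6 8; do
    simulated_measure_clock 249999600 "$phase"
done
```

In this model the reading flips between frequency(1)=250000000 and frequency(1)=249999000 purely as a function of phase, while the underlying clock never changes.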
Is this the right place for my bug report? I hope so, apologies if that's not the case. I've also posted on the RPi forum (See https://www.raspberrypi.org/forums/viewtopic.php?f=28&t=302771)
Describe the bug
Mini UART baud is unstable or incorrect when enable_uart=1 is set without also setting core_freq=250, so I suspect setting enable_uart=1 does NOT currently fix the clock speed.

To reproduce
1. With vanilla raspiOS, set enable_uart=1 in config.txt, edit cmdline.txt to remove console=serial0,115200 and reboot.
2. Use a serial terminal (e.g. minicom) to attempt to use the Mini UART as primary UART, with address /dev/serial0, at the correct baud for a test signal device connected to the GPIO (the signal needs to be easily readable or at least easily distinguished from serial garbage).
3. If garbage output is all you get, set core_freq=250 in config.txt and reboot.
4. Try the test with minicom again; this time it should be successful.

Expected behaviour
core_freq should be fixed to 250MHz when enable_uart=1 is set, and therefore the Mini UART baud should be constant/correct. There shouldn't be a need to explicitly set core_freq according to documentation and previous experience. (see https://www.raspberrypi.org/documentation/configuration/uart.md)

Actual behaviour
Mini UART baud unstable unless core_freq=250 explicitly set.

System
raspinfo output
pi@RedPi:~ $ raspinfo
System Information
------------------

Raspberry Pi 3 Model B Rev 1.2
PRETTY_NAME="Raspbian GNU/Linux 10 (buster)"
NAME="Raspbian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"

Raspberry Pi reference 2020-12-02
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, cce27bd6f44a3b2e83855645986b3e21f771e852, stage4

Linux RedPi 5.10.11-v7+ #1399 SMP Thu Jan 28 12:06:05 GMT 2021 armv7l GNU/Linux
Revision : a02082
Serial : 00000000aa5fd460
Model : Raspberry Pi 3 Model B Rev 1.2
Throttled flag : throttled=0x50000
Camera : supported=0 detected=0

Videocore information
---------------------

Jan 27 2021 22:26:53
Copyright (c) 2012 Broadcom
version 99d9a48302e4553cff3688692bb7e9ac760a03fa (clean) (release) (start)

alloc failures: 0
compactions: 0
legacy block fails: 0

Filesystem information
----------------------
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/root        7312680 3432020   3530624  50% /
devtmpfs          439916       0    439916   0% /dev
tmpfs             473196       0    473196   0% /dev/shm
tmpfs             473196    6404    466792   2% /run
tmpfs               5120       4      5116   1% /run/lock
tmpfs             473196       0    473196   0% /sys/fs/cgroup
/dev/mmcblk0p1    258095   48682    209413  19% /boot
log2ram           102400   51480     50920  51% /var/log
tmpfs              94636       4     94632   1% /run/user/1000

Filename   Type  Size    Used  Priority
/var/swap  file  102396  0     -2

Package version information
---------------------------
raspberrypi-ui-mods:
  Installed: 1.20201210+nmu1
raspberrypi-sys-mods:
  Installed: 20210125
openbox:
  Installed: 3.6.1-8+rpt5
lxpanel:
  Installed: 0.10.0-2+rpt14
pcmanfm:
  Installed: 1.3.1-1+rpt25
rpd-plym-splash:
  Installed: 0.26

Networking Information
----------------------
eth0: flags=4099

Additional context
I have a simple ("rfm2pi") device mounted on the first 2x5 GPIO pins that periodically transmits a data frame out to the primary UART at a baud of 38400.
Apologies if this is a planned change and just a (delayed) documentation issue.
Having now seen issue https://github.com/raspberrypi/documentation/issues/1614 I will (when I get a chance) try this again from scratch as, perhaps coincidentally, this particular sd card image was initially booted on a RPi 4 before being moved to a RPi 3. That could well be of no consequence, but I will try to eliminate/confirm it if anyone thinks it may have impacted some config. I understood raspiOS images to be interchangeable across models, but with the new (to me) revelation of an alternative fixed 500MHz core_freq on a Pi 4, I wonder...
Thanks in advance.