agherzan / meta-raspberrypi

Yocto/OE BSP layer for the Raspberry Pi boards
https://www.yoctoproject.org/
MIT License
539 stars 415 forks source link

[dunfell] Sporadic freeze due to MMC issues #1088

Open sahib opened 2 years ago

sahib commented 2 years ago

Hello,

I'm currently debugging a nasty freeze on a CM3+ with a 4.19 kernel and the dunfell branch of this repo and I hope to find someone pushing me in the right direction. In my case the freeze symptoms look like this:

System information

System info: ```bash $ vcgencmd version # also newer/older firmware version was tried without any change. May 27 2021 14:04:13 Copyright (c) 2012 Broadcom version 7d9a298cda813f747b51fe17e1e417e7bf5ca94d (clean) (release) (start) $ uname -a Linux 4.19.127-v7 #1 SMP Thu Jul 28 07:59:25 UTC 2022 armv7l GNU/Linux $ cat /proc/cmdline 8250.nr_uarts=1 bcm2708_fb.fbwidth=480 bcm2708_fb.fbheight=800 bcm2708_fb.fbswap=1 dwc_otg.lpm_enable=0 usbhid.mousepoll=0 vc_mem.mem_base=0x3ec00000 vc_mem.mem_size=0x40000000 cma=512M@128M coherent_pool=6M fbcon=vc:2-4 logo.nologo quiet video=HDMI-A-1:480x800MR-24@60 ostree=/ostree/boot.1/poky/2fb5f4d11ab0a08ad01ad2b44fee499ee3d129c276dd0f41184fdc79850e413e/0 ostree_root=/dev/mmcblk0p2 root=/dev/ram0 rw rootwait rootdelay=2 ramdisk_size=8192 panic=1 $ cat config.txt disable_overscan=1 gpu_mem=128 boot_delay=0 boot_delay_ms=0 disable_splash=1 dispmanx_offline=1 dtparam=i2c1=on dtparam=i2c_arm=on enable_uart=1 dtoverlay=vc4-kms-v3d hdmi_edid_file=1 avoid_warnings=2 mask_gpu_interrupt0=0x400 dtparam=audio=on ```

Anything similar on the internet?

Yes. There are plenty of reports of this behavior on the internet (google: "mmc_rescan freeze"). Here's a small selection:

Summary: What all of those have sadly in common: There is no clear fix. Some had luck with up- or downgrading their kernel, some with playing around with sysctl vm.{swappiness,dirty_ratio,min_free_kbytes}. It also happens from Pi2 up to Pi4. In my case it's a CM3+, so likely a common (software?) issue and none of the proposed fixes seem to work.

Debugging attempts:

When disabling swap, the system seems a little more "stable" during the freeze. SSH connections started before the freeze seem to still run if they do not use any I/O. For example this command in a while loop reported that the memory usage of the system is quite okay:

cat /proc/meminfo | grep '\(MemAvailable\|MemFree\)' | tr -d '\n' | awk '{ print $2 / 1024 " " $4 / 1024 }'

However, if a program that is executed that needs to be read from disk or needs to write everything blocks. Listing files partly works, likely due to the page cache. This situation also happens for a short time when writing with high throughput (as with dd if=/dev/zero of=/var/file bs=1M count=100000), but the system normally recovers after. Therefore it seems that the attached mmc is not just slow, but not working at all. Judging from dmesg even journald seems to be stuck and refuses to be killed (which is, afaik, a clear sign that the process is stuck in kernel code):

dmesg output: The exact stack traces vary from time to time, but the one with ``mmc_rescan`` is common to all of them. The other ones seem to depend on what kind of I/O syscalls were done during the freeze. Also note, this is one freeze with swap enabled (therefore systemd itself was stuck). ``` [ 17.054909] ax88179_178a 1-1.1:1.0 eth0: ax88179 - Link status is: 1 [ 17.068055] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready [ 20.441818] systemd-journald[191]: Successfully sent stream file descriptor to service manager. [ 40.861556] systemd-journald[191]: Successfully sent stream file descriptor to service manager. [ 186.769128] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=215 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 186.794955] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=216 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 192.018281] systemd[1]: systemd-journald.service: State 'stop-watchdog' timed out. Terminating. [ 192.027459] systemd[1]: systemd-journald.service: Changed stop-watchdog -> stop-sigterm [ 192.035886] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=217 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 192.061699] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=218 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 192.087320] systemd[1]: systemd-logind.service: Got notification message from PID 337 (WATCHDOG=1) [ 197.268262] systemd[1]: systemd-journald.service: State 'stop-sigterm' timed out. Killing. [ 197.276714] systemd[1]: systemd-journald.service: Killing process 191 (systemd-journal) with signal SIGKILL. [ 197.286855] systemd[1]: systemd-journald.service: Changed stop-sigterm -> stop-sigkill [ 197.295132] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=219 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 197.320632] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=220 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 202.518263] systemd[1]: systemd-journald.service: Processes still around after SIGKILL. Ignoring. [ 202.527427] systemd[1]: systemd-journald.service: Changed stop-sigkill -> final-sigterm [ 202.535491] systemd[1]: systemd-journald.socket: Changed running -> listening [ 202.542684] systemd[1]: systemd-journald-dev-log.socket: Changed running -> listening [ 202.550874] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2ddev_2dlog_2esocket interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=221 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 202.577322] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2ddev_2dlog_2esocket interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=222 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 202.603830] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2esocket interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=223 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 202.629174] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2esocket interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=224 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 202.654596] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=225 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 202.680057] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=226 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 202.705366] systemd[1]: systemd-journald.socket: Incoming traffic [ 202.711490] systemd[1]: systemd-journald.service: Trying to enqueue job systemd-journald.service/start/replace [ 202.721529] systemd[1]: Added job systemd-journald.service/start to transaction. [ 202.728939] systemd[1]: Pulling in system.slice/start from systemd-journald.service/start [ 202.737129] systemd[1]: Added job system.slice/start to transaction. [ 202.743486] systemd[1]: Pulling in -.slice/start from system.slice/start [ 202.750225] systemd[1]: Added job -.slice/start to transaction. [ 202.756149] systemd[1]: Pulling in systemd-journald.socket/start from systemd-journald.service/start [ 202.765304] systemd[1]: Added job systemd-journald.socket/start to transaction. [ 202.772626] systemd[1]: Pulling in system.slice/start from systemd-journald.socket/start [ 202.780732] systemd[1]: Pulling in -.mount/start from systemd-journald.socket/start [ 202.788400] systemd[1]: Added job -.mount/start to transaction. [ 202.794322] systemd[1]: Pulling in system.slice/start from -.mount/start [ 202.801039] systemd[1]: Pulling in systemd-journald.socket/start from systemd-journald.service/start [ 202.810184] systemd[1]: Pulling in systemd-journald-dev-log.socket/start from systemd-journald.service/start [ 202.820026] systemd[1]: Added job systemd-journald-dev-log.socket/start to transaction. [ 202.828043] systemd[1]: Pulling in system.slice/start from systemd-journald-dev-log.socket/start [ 202.836840] systemd[1]: Pulling in -.mount/start from systemd-journald-dev-log.socket/start [ 202.845194] systemd[1]: Pulling in systemd-journald-audit.socket/start from systemd-journald.service/start [ 202.854861] systemd[1]: Added job systemd-journald-audit.socket/start to transaction. [ 202.862705] systemd[1]: Pulling in system.slice/start from systemd-journald-audit.socket/start [ 202.871349] systemd[1]: Found redundant job -.slice/start, dropping from transaction. [ 202.879199] systemd[1]: Found redundant job system.slice/start, dropping from transaction. [ 202.887484] systemd[1]: Found redundant job -.mount/start, dropping from transaction. [ 202.895318] systemd[1]: Found redundant job systemd-journald-dev-log.socket/start, dropping from transaction. [ 202.905246] systemd[1]: Found redundant job systemd-journald.socket/start, dropping from transaction. [ 202.914482] systemd[1]: Keeping job systemd-journald-audit.socket/start because of systemd-journald.service/start [ 202.924777] systemd[1]: systemd-journald.service: Installed new job systemd-journald.service/start as 478 [ 202.934368] systemd[1]: systemd-journald-audit.socket: Installed new job systemd-journald-audit.socket/start as 484 [ 202.944867] systemd[1]: systemd-journald.service: Enqueued job systemd-journald.service/start as 478 [ 202.954030] systemd[1]: systemd-journald.socket: Changed listening -> running [ 202.961468] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2esocket interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=227 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 202.986824] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2esocket interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=228 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 203.012219] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2daudit_2esocket interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=229 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 203.038264] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2daudit_2esocket interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=230 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 203.064388] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=231 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 203.089829] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=232 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 203.115131] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobNew cookie=233 reply_cookie=0 signature=uos error-name=n/a error-message=n/a [ 203.136152] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobNew cookie=234 reply_cookie=0 signature=uos error-name=n/a error-message=n/a [ 203.157165] systemd[1]: systemd-journald-dev-log.socket: Incoming traffic [ 203.163972] systemd[1]: systemd-journald-dev-log.socket: Changed listening -> running [ 203.172116] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2ddev_2dlog_2esocket interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=235 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 203.198519] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2ddev_2dlog_2esocket interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=236 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 203.225078] systemd[1]: systemd-journald-audit.socket: ConditionCapability=CAP_AUDIT_READ succeeded. [ 203.234276] systemd[1]: systemd-journald-audit.socket: ConditionSecurity=audit failed. [ 203.242210] systemd[1]: systemd-journald-audit.socket: Starting requested but condition failed. Not starting unit. [ 203.252576] systemd[1]: systemd-journald-audit.socket: Job 484 systemd-journald-audit.socket/start finished, result=done [ 203.263459] systemd[1]: Condition check resulted in Journal Audit Socket being skipped. [ 203.271745] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2daudit_2esocket interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=237 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 203.297830] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2daudit_2esocket interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=238 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 203.323733] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=239 reply_cookie=0 signature=uoss error-name=n/a error-message=n/a [ 203.345497] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=240 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 203.371016] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=241 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 203.396514] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2daudit_2esocket interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=242 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 203.422556] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2daudit_2esocket interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=243 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 203.448556] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/job/478 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=244 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 207.768266] systemd[1]: systemd-journald.service: State 'stop-final-sigterm' timed out. Killing. [ 207.777203] systemd[1]: systemd-journald.service: Killing process 191 (systemd-journal) with signal SIGKILL. [ 207.787289] systemd[1]: systemd-journald.service: Changed final-sigterm -> final-sigkill [ 207.795732] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=245 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 207.821137] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=246 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 207.846699] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=247 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 207.872100] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=248 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 207.897390] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/job/478 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=249 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 208.477176] systemd[1]: systemd-resolved.service: Got notification message from PID 241 (WATCHDOG=1) [ 213.018260] systemd[1]: systemd-journald.service: Processes still around after final SIGKILL. Entering failed mode. [ 213.028754] systemd[1]: systemd-journald.service: Failed with result 'watchdog'. [ 213.036153] systemd[1]: systemd-journald.service: Service will restart (restart setting) [ 213.044281] systemd[1]: systemd-journald.service: Changed final-sigkill -> failed [ 213.051897] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=250 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 213.074440] systemd[1]: systemd-journald.service: Failed to destroy cgroup /system.slice/systemd-journald.service, ignoring: Device or resource busy [ 213.087902] systemd[1]: systemd-journald.service: Unit entered failed state. [ 213.095299] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=251 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 213.120751] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=252 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 213.145983] systemd[1]: systemd-journald.service: Changed failed -> auto-restart [ 213.153460] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=253 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 213.176026] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=254 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 213.201420] systemd[1]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/systemd_2djournald_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=255 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a [ 213.226727] systemd[1]: systemd-journald.service: Service has no hold-off time (RestartSec=0), scheduling restart. [ 213.237094] systemd[1]: systemd-journald.service: Trying to enqueue job systemd-journald.service/restart/replace [ 213.247294] systemd[1]: Added job systemd-journald.service/restart to transaction. [ 213.254867] systemd[1]: Pulling in system.slice/start from systemd-journald.service/restart [ 213.263233] systemd[1]: Added job system.slice/start to transaction. [ 213.269604] systemd[1]: Pulling in -.slice/start from system.slice/start [ 213.276307] systemd[1]: Added job -.slice/start to transaction. [ 213.282249] systemd[1]: Pulling in systemd-journald.socket/start from systemd-journald.service/restart [ 213.291576] systemd[1]: Added job systemd-journald.socket/start to transaction. [ 213.298898] systemd[1]: Pulling in system.slice/start from systemd-journald.socket/start [ 213.307002] systemd[1]: Pulling in -.mount/start from systemd-journald.socket/start [ 213.314660] systemd[1]: Added job -.mount/start to transaction. [ 213.320592] systemd[1]: Pulling in system.slice/start from -.mount/start [ 213.327308] systemd[1]: Pulling in systemd-journald.socket/start from systemd-journald.service/restart [ 213.336628] systemd[1]: Pulling in systemd-journald-dev-log.socket/start from systemd-journald.service/restart [ 213.346641] systemd[1]: Added job systemd-journald-dev-log.socket/start to transaction. [ 213.354646] systemd[1]: Pulling in system.slice/start from systemd-journald-dev-log.socket/start [ 213.363444] systemd[1]: Pulling in -.mount/start from systemd-journald-dev-log.socket/start [ 213.371808] systemd[1]: Pulling in systemd-journald-audit.socket/start from systemd-journald.service/restart [ 213.381649] systemd[1]: Added job systemd-journald-audit.socket/start to transaction. [ 213.389491] systemd[1]: Pulling in system.slice/start from systemd-journald-audit.socket/start [ 213.398130] systemd[1]: Pulling in systemd-journal-flush.service/restart from systemd-journald.service/restart [ 213.408150] systemd[1]: Added job systemd-journal-flush.service/restart to transaction. [ 213.416163] systemd[1]: Pulling in var.mount/start from systemd-journal-flush.service/restart [ 213.424702] systemd[1]: Added job var.mount/start to transaction. [ 213.430816] systemd[1]: Pulling in system.slice/start from var.mount/start [ 213.437705] systemd[1]: Pulling in -.mount/start from var.mount/start [ 213.444155] systemd[1]: Pulling in dev-mmcblk0p2.device/start from var.mount/start [ 213.451738] systemd[1]: Added job dev-mmcblk0p2.device/start to transaction. [ 213.458818] systemd[1]: Pulling in dev-disk-by\x2duuid-41f0671b\x2d4c41\x2d4cfb\x2d96c0\x2d1fa0fb62a74d.device/start from dev-mmcblk0p2.device/start [ 213.472133] systemd[1]: Added job dev-disk-by\x2duuid-41f0671b\x2d4c41\x2d4cfb\x2d96c0\x2d1fa0fb62a74d.device/start to transaction. [ 213.483977] systemd[1]: Pulling in dev-mmcblk0p2.device/start from dev-disk-by\x2duuid-41f0671b\x2d4c41\x2d4cfb\x2d96c0\x2d1fa0fb62a74d.device/start [ 213.497296] systemd[1]: Pulling in sys-devices-platform-soc-3f202000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device/start from dev-disk-by\x2duuid-41f0671b\x2d4c41\x2d4cfb\x2d96c0\x2d1fa0fb62a74d.device/start [ 213.516884] systemd[1]: Added job sys-devices-platform-soc-3f202000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device/start to transaction. [ 213.530208] systemd[1]: Pulling in dev-mmcblk0p2.device/start from sys-devices-platform-soc-3f202000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device/start [ 213.545027] systemd[1]: Pulling in dev-disk-by\x2duuid-41f0671b\x2d4c41\x2d4cfb\x2d96c0\x2d1fa0fb62a74d.device/start from sys-devices-platform-soc-3f202000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device/start [ 213.564599] systemd[1]: Pulling in dev-disk-by\x2did-mmc\x2dBJTD4R_0x925fd89b\x2dpart2.device/start from sys-devices-platform-soc-3f202000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device/start [ 213.582691] systemd[1]: Added job dev-disk-by\x2did-mmc\x2dBJTD4R_0x925fd89b\x2dpart2.device/start to transaction. [ 213.593060] systemd[1]: Pulling in dev-mmcblk0p2.device/start from dev-disk-by\x2did-mmc\x2dBJTD4R_0x925fd89b\x2dpart2.device/start [ 213.604900] systemd[1]: Pulling in dev-disk-by\x2duuid-41f0671b\x2d4c41\x2d4cfb\x2d96c0\x2d1fa0fb62a74d.device/start from dev-disk-by\x2did-mmc\x2dBJTD4R_0x925fd89b\x2dpart2.device/start [ 213.621518] systemd[1]: Pulling in sys-devices-platform-soc-3f202000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device/start from dev-disk-by\x2did-mmc\x2dBJTD4R_0x925fd89b\x2dpart2.device/start [ 213.639609] systemd[1]: Pulling in dev-disk-by\x2dpartuuid-d1a0857f\x2d02.device/start from dev-disk-by\x2did-mmc\x2dBJTD4R_0x925fd89b\x2dpart2.device/start [ 213.653617] systemd[1]: Added job dev-disk-by\x2dpartuuid-d1a0857f\x2d02.device/start to transaction. [ 213.662857] systemd[1]: Pulling in dev-mmcblk0p2.device/start from dev-disk-by\x2dpartuuid-d1a0857f\x2d02.device/start [ 213.673568] systemd[1]: Pulling in dev-disk-by\x2duuid-41f0671b\x2d4c41\x2d4cfb\x2d96c0\x2d1fa0fb62a74d.device/start from dev-disk-by\x2dpartuuid-d1a0857f\x2d02.device/start [ 213.689065] systemd[1]: Pulling in sys-devices-platform-soc-3f202000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device/start from dev-disk-by\x2dpartuuid-d1a0857f\x2d02.device/start [ 213.706028] systemd[1]: Pulling in dev-disk-by\x2did-mmc\x2dBJTD4R_0x925fd89b\x2dpart2.device/start from dev-disk-by\x2dpartuuid-d1a0857f\x2d02.device/start [ 213.720038] systemd[1]: Pulling in dev-disk-by\x2dlabel-otaroot.device/start from dev-disk-by\x2dpartuuid-d1a0857f\x2d02.device/start [ 213.732049] systemd[1]: Added job dev-disk-by\x2dlabel-otaroot.device/start to transaction. [ 213.740419] systemd[1]: Pulling in dev-mmcblk0p2.device/start from dev-disk-by\x2dlabel-otaroot.device/start [ 213.750261] systemd[1]: Pulling in dev-disk-by\x2duuid-41f0671b\x2d4c41\x2d4cfb\x2d96c0\x2d1fa0fb62a74d.device/start from dev-disk-by\x2dlabel-otaroot.device/start [ 213.764934] systemd[1]: Pulling in dev-disk-by\x2did-mmc\x2dBJTD4R_0x925fd89b\x2dpart2.device/start from dev-disk-by\x2dlabel-otaroot.device/start [ 213.778078] systemd[1]: Pulling in sys-devices-platform-soc-3f202000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device/start from dev-disk-by\x2dlabel-otaroot.device/start [ 213.794173] systemd[1]: Pulling in dev-disk-by\x2dpartuuid-d1a0857f\x2d02.device/start from dev-disk-by\x2dlabel-otaroot.device/start [ 213.806187] systemd[1]: Pulling in dev-disk-by\x2dpath-platform\x2d3f202000.mmc\x2dpart2.device/start from dev-disk-by\x2dlabel-otaroot.device/start [ 213.819502] systemd[1]: Added job dev-disk-by\x2dpath-platform\x2d3f202000.mmc\x2dpart2.device/start to transaction. [ 213.830046] systemd[1]: Pulling in dev-mmcblk0p2.device/start from dev-disk-by\x2dpath-platform\x2d3f202000.mmc\x2dpart2.device/start [ 213.842060] systemd[1]: Pulling in dev-disk-by\x2duuid-41f0671b\x2d4c41\x2d4cfb\x2d96c0\x2d1fa0fb62a74d.device/start from dev-disk-by\x2dpath-platform\x2d3f202000.mmc\x2dpart2.device/start [ 213.858849] systemd[1]: Pulling in dev-disk-by\x2did-mmc\x2dBJTD4R_0x925fd89b\x2dpart2.device/start from dev-disk-by\x2dpath-platform\x2d3f202000.mmc\x2dpart2.device/start [ 213.874162] systemd[1]: Pulling in sys-devices-platform-soc-3f202000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device/start from dev-disk-by\x2dpath-platform\x2d3f202000.mmc\x2dpart2.device/start [ 213.892430] systemd[1]: Pulling in dev-disk-by\x2dlabel-otaroot.device/start from dev-disk-by\x2dpath-platform\x2d3f202000.mmc\x2dpart2.device/start [ 213.905745] systemd[1]: Pulling in dev-disk-by\x2dpartuuid-d1a0857f\x2d02.device/start from dev-disk-by\x2dpath-platform\x2d3f202000.mmc\x2dpart2.device/start [ 213.919935] systemd[1]: Pulling in dev-disk-by\x2dpath-platform\x2d3f202000.mmc\x2dpart2.device/start from dev-disk-by\x2dpartuuid-d1a0857f\x2d02.device/start [ 213.934122] systemd[1]: Pulling in dev-disk-by\x2dlabel-otaroot.device/start from dev-disk-by\x2did-mmc\x2dBJTD4R_0x925fd89b\x2dpart2.device/start [ 213.947271] systemd[1]: Pulling in dev-disk-by\x2dpath-platform\x2d3f202000.mmc\x2dpart2.device/start from dev-disk-by\x2did-mmc\x2dBJTD4R_0x925fd89b\x2dpart2.device/start [ 213.962587] systemd[1]: Pulling in dev-disk-by\x2dpartuuid-d1a0857f\x2d02.device/start from sys-devices-platform-soc-3f202000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device/start [ 213.979549] systemd[1]: Pulling in dev-disk-by\x2dlabel-otaroot.device/start from sys-devices-platform-soc-3f202000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device/start [ 213.995645] systemd[1]: Pulling in dev-disk-by\x2dpath-platform\x2d3f202000.mmc\x2dpart2.device/start from sys-devices-platform-soc-3f202000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device/start [ 214.013914] systemd[1]: Pulling in dev-disk-by\x2did-mmc\x2dBJTD4R_0x925fd89b\x2dpart2.device/start from dev-disk-by\x2duuid-41f0671b\x2d4c41\x2d4cfb\x2d96c0\x2d1fa0fb62a74d.device/start [ 214.030531] systemd[1]: Pulling in dev-disk-by\x2dpartuuid-d1a0857f\x2d02.device/start from dev-disk-by\x2duuid-41f0671b\x2d4c41\x2d4cfb\x2d96c0\x2d1fa0fb62a74d.device/start [ 214.046018] systemd[1]: Pulling in dev-disk-by\x2dlabel-otaroot.device/start from dev-disk-by\x2duuid-41f0671b\x2d4c41\x2d4cfb\x2d96c0\x2d1fa0fb62a74d.device/start [ 214.060637] systemd[1]: Pulling in dev-disk-by\x2dpath-platform\x2d3f202000.mmc\x2dpart2.device/start from dev-disk-by\x2duuid-41f0671b\x2d4c41\x2d4cfb\x2d96c0\x2d1fa0fb62a74d.device/start [ 214.077429] systemd[1]: Pulling in sys-devices-platform-soc-3f202000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device/start from dev-mmcblk0p2.device/start [ 214.092222] systemd[1]: Pulling in dev-disk-by\x2did-mmc\x2dBJTD4R_0x925fd89b\x2dpart2.device/start from dev-mmcblk0p2.device/start [ 214.104060] systemd[1]: Pulling in dev-disk-by\x2dpartuuid-d1a0857f\x2d02.device/start from dev-mmcblk0p2.device/start [ 214.114769] systemd[1]: Pulling in dev-disk-by\x2dlabel-otaroot.device/start from dev-mmcblk0p2.device/start [ 214.124611] systemd[1]: Pulling in dev-disk-by\x2dpath-platform\x2d3f202000.mmc\x2dpart2.device/start from dev-mmcblk0p2.device/start [ 214.136626] systemd[1]: Pulling in umount.target/stop from var.mount/start [ 214.143506] systemd[1]: Added job umount.target/stop to transaction. [ 214.149875] systemd[1]: Pulling in -.mount/start from systemd-journal-flush.service/restart [ 214.158240] systemd[1]: Pulling in system.slice/start from systemd-journal-flush.service/restart [ 214.167038] systemd[1]: Pulling in systemd-journald.service/start from systemd-journal-flush.service/restart [ 214.176880] systemd[1]: Added job systemd-journald.service/start to transaction. [ 214.184278] systemd[1]: Pulling in system.slice/start from systemd-journald.service/start [ 214.192468] systemd[1]: Pulling in systemd-journald.socket/start from systemd-journald.service/start [ 214.201615] systemd[1]: Pulling in systemd-journald.socket/start from systemd-journald.service/start [ 214.210769] systemd[1]: Pulling in systemd-journald-dev-log.socket/start from systemd-journald.service/start [ 214.220610] systemd[1]: Pulling in systemd-journald-audit.socket/start from systemd-journald.service/start [ 214.230297] systemd[1]: Found redundant job dev-disk-by\x2dpartuuid-d1a0857f\x2d02.device/start, dropping from transaction. [ 214.241451] systemd[1]: Found redundant job dev-disk-by\x2dlabel-otaroot.device/start, dropping from transaction. [ 214.251730] systemd[1]: Found redundant job -.slice/start, dropping from transaction. [ 214.259575] systemd[1]: Found redundant job umount.target/stop, dropping from transaction. [ 214.267853] systemd[1]: Found redundant job var.mount/start, dropping from transaction. [ 214.275862] systemd[1]: Found redundant job dev-disk-by\x2did-mmc\x2dBJTD4R_0x925fd89b\x2dpart2.device/start, dropping from transaction. [ 214.288137] systemd[1]: Found redundant job -.mount/start, dropping from transaction. [ 214.295975] systemd[1]: Found redundant job systemd-journald-dev-log.socket/start, dropping from transaction. [ 214.305907] systemd[1]: Found redundant job sys-devices-platform-soc-3f202000.mmc-mmc_host-mmc0-mmc0:0001-block-mmcblk0-mmcblk0p2.device/start, dropping from transaction. [ 214.321136] systemd[1]: Found redundant job dev-mmcblk0p2.device/start, dropping from transaction. [ 214.330110] systemd[1]: Found redundant job dev-disk-by\x2dpath-platform\x2d3f202000.mmc\x2dpart2.device/start, dropping from transaction. [ 214.342557] systemd[1]: Found redundant job dev-disk-by\x2duuid-41f0671b\x2d4c41\x2d4cfb\x2d96c0\x2d1fa0fb62a74d.device/start, dropping from transaction. [ 214.356307] systemd[1]: Found redundant job system.slice/start, dropping from transaction. [ 214.364588] systemd[1]: Found redundant job systemd-journald.socket/start, dropping from transaction. [ 214.373823] systemd[1]: Keeping job systemd-journal-flush.service/restart because of systemd-journald.service/restart [ 214.384467] systemd[1]: Keeping job systemd-journald.service/start because of systemd-journal-flush.service/restart [ 214.394915] systemd[1]: Keeping job systemd-journald-audit.socket/start because of systemd-journald.service/start [ 214.405235] systemd[1]: systemd-journal-flush.service: Installed new job systemd-journal-flush.service/restart as 492 [ 214.415887] systemd[1]: systemd-journald.service: Merged systemd-journald.service/restart into installed job systemd-journald.service/restart as 478 [ 214.429206] systemd[1]: systemd-journald-audit.socket: Installed new job systemd-journald-audit.socket/start as 491 [ 214.439659] systemd[1]: systemd-journald.service: Enqueued job systemd-journald.service/restart as 478 [ 214.448986] systemd[1]: systemd-journald.service: Scheduled restart job, restart counter is at 1. [ 243.676593] INFO: task kworker/2:0:22 blocked for more than 120 seconds. [ 243.683307] Tainted: G C 4.19.126-v7 #1 [ 243.688982] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 243.696815] kworker/2:0 D 0 22 2 0x00000000 [ 243.702307] Workqueue: events_freezable mmc_rescan [ 243.707115] [<8085cf3c>] (__schedule) from [<8085d6b4>] (schedule+0x50/0xa8) [ 243.714160] [<8085d6b4>] (schedule) from [<806b18d8>] (__mmc_claim_host+0xe4/0x238) [ 243.721827] [<806b18d8>] (__mmc_claim_host) from [<806b1a64>] (mmc_get_card+0x38/0x3c) [ 243.729751] [<806b1a64>] (mmc_get_card) from [<806b859c>] (mmc_detect+0x24/0x7c) [ 243.737158] [<806b859c>] (mmc_detect) from [<806b3f14>] (mmc_rescan+0xe8/0x3f8) [ 243.744466] [<806b3f14>] (mmc_rescan) from [<8013c3a4>] (process_one_work+0x200/0x4e0) [ 243.752390] [<8013c3a4>] (process_one_work) from [<8013d470>] (worker_thread+0x4c/0x568) [ 243.760487] [<8013d470>] (worker_thread) from [<80142f64>] (kthread+0x140/0x170) [ 243.767889] [<80142f64>] (kthread) from [<801010ac>] (ret_from_fork+0x14/0x28) [ 243.775104] Exception stack(0xb7159fb0 to 0xb7159ff8) [ 243.780158] 9fa0: 00000000 00000000 00000000 00000000 [ 243.788339] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 243.796510] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 243.803157] INFO: task kswapd0:42 blocked for more than 120 seconds. [ 243.809514] Tainted: G C 4.19.126-v7 #1 [ 243.815165] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 243.822993] kswapd0 D 0 42 2 0x00000000 [ 243.828498] [<8085cf3c>] (__schedule) from [<8085d6b4>] (schedule+0x50/0xa8) [ 243.835543] [<8085d6b4>] (schedule) from [<8085dbe4>] (io_schedule+0x20/0x40) [ 243.842687] [<8085dbe4>] (io_schedule) from [<804a455c>] (blk_mq_get_tag+0x180/0x2e0) [ 243.850525] [<804a455c>] (blk_mq_get_tag) from [<8049e87c>] (blk_mq_get_request+0xf8/0x3b0) [ 243.858887] [<8049e87c>] (blk_mq_get_request) from [<804a1c10>] (blk_mq_make_request+0xf8/0x520) [ 243.867681] [<804a1c10>] (blk_mq_make_request) from [<804938f8>] (generic_make_request+0x154/0x38c) [ 243.876743] [<804938f8>] (generic_make_request) from [<80493b78>] (submit_bio+0x48/0x1b8) [ 243.884919] [<80493b78>] (submit_bio) from [<802af1a4>] (__swap_writepage+0x168/0x3c0) [ 243.892843] [<802af1a4>] (__swap_writepage) from [<802af43c>] (swap_writepage+0x40/0x90) [ 243.900945] [<802af43c>] (swap_writepage) from [<80279e00>] (shrink_page_list+0x828/0xfb4) [ 243.909215] [<80279e00>] (shrink_page_list) from [<8027ad6c>] (shrink_inactive_list+0x244/0x714) [ 243.918007] [<8027ad6c>] (shrink_inactive_list) from [<8027bab4>] (shrink_node_memcg+0x394/0x770) [ 243.926886] [<8027bab4>] (shrink_node_memcg) from [<8027bf7c>] (shrink_node+0xec/0x4f0) [ 243.934885] [<8027bf7c>] (shrink_node) from [<8027d474>] (kswapd+0x41c/0x808) [ 243.942025] [<8027d474>] (kswapd) from [<80142f64>] (kthread+0x140/0x170) [ 243.948818] [<80142f64>] (kthread) from [<801010ac>] (ret_from_fork+0x14/0x28) [ 243.956033] Exception stack(0xb73d7fb0 to 0xb73d7ff8) [ 243.961087] 7fa0: 00000000 00000000 00000000 00000000 [ 243.969267] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 243.977446] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 243.984085] INFO: task kworker/u8:1:76 blocked for more than 120 seconds. [ 243.984090] Tainted: G C 4.19.126-v7 #1 [ 243.996530] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 244.004361] kworker/u8:1 D 0 76 2 0x00000000 [ 244.009857] Workqueue: writeback wb_workfn (flush-179:0) [ 244.015174] [<8085cf3c>] (__schedule) from [<8085d6b4>] (schedule+0x50/0xa8) [ 244.022227] [<8085d6b4>] (schedule) from [<8085dbe4>] (io_schedule+0x20/0x40) [ 244.029370] [<8085dbe4>] (io_schedule) from [<804a455c>] (blk_mq_get_tag+0x180/0x2e0) [ 244.037207] [<804a455c>] (blk_mq_get_tag) from [<8049e87c>] (blk_mq_get_request+0xf8/0x3b0) [ 244.045557] [<8049e87c>] (blk_mq_get_request) from [<804a1c10>] (blk_mq_make_request+0xf8/0x520) [ 244.054350] [<804a1c10>] (blk_mq_make_request) from [<804938f8>] (generic_make_request+0x154/0x38c) [ 244.063408] [<804938f8>] (generic_make_request) from [<80493b78>] (submit_bio+0x48/0x1b8) [ 244.071592] [<80493b78>] (submit_bio) from [<803b4d78>] (ext4_io_submit+0x58/0x64) [ 244.079170] [<803b4d78>] (ext4_io_submit) from [<803b4fbc>] (ext4_bio_write_page+0x214/0x600) [ 244.087708] [<803b4fbc>] (ext4_bio_write_page) from [<8038fc54>] (mpage_submit_page+0xa0/0xd0) [ 244.096317] [<8038fc54>] (mpage_submit_page) from [<8038fe0c>] (mpage_process_page_bufs+0x188/0x198) [ 244.105457] [<8038fe0c>] (mpage_process_page_bufs) from [<8038fff4>] (mpage_prepare_extent_to_map+0x1d8/0x310) [ 244.115472] [<8038fff4>] (mpage_prepare_extent_to_map) from [<80397214>] (ext4_writepages+0x438/0x900) [ 244.124787] [<80397214>] (ext4_writepages) from [<8026fc4c>] (do_writepages+0x58/0xfc) [ 244.132711] [<8026fc4c>] (do_writepages) from [<80305674>] (__writeback_single_inode+0x44/0x458) [ 244.141507] [<80305674>] (__writeback_single_inode) from [<8030600c>] (writeback_sb_inodes+0x22c/0x4e4) [ 244.150906] [<8030600c>] (writeback_sb_inodes) from [<80306314>] (__writeback_inodes_wb+0x50/0xe4) [ 244.159871] [<80306314>] (__writeback_inodes_wb) from [<8030664c>] (wb_writeback+0x2a4/0x364) [ 244.168401] [<8030664c>] (wb_writeback) from [<8030716c>] (wb_workfn+0x3f8/0x558) [ 244.175882] [<8030716c>] (wb_workfn) from [<8013c3a4>] (process_one_work+0x200/0x4e0) [ 244.183720] [<8013c3a4>] (process_one_work) from [<8013d470>] (worker_thread+0x4c/0x568) [ 244.191829] [<8013d470>] (worker_thread) from [<80142f64>] (kthread+0x140/0x170) [ 244.199237] [<80142f64>] (kthread) from [<801010ac>] (ret_from_fork+0x14/0x28) [ 244.206453] Exception stack(0x82d55fb0 to 0x82d55ff8) [ 244.211508] 5fa0: 00000000 00000000 00000000 00000000 [ 244.219690] 5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 244.227869] 5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 244.234504] INFO: task kworker/3:1H:160 blocked for more than 120 seconds. [ 244.241392] Tainted: G C 4.19.126-v7 #1 [ 244.247051] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 244.254874] kworker/3:1H D 0 160 2 0x00000000 [ 244.260371] Workqueue: kblockd blk_mq_run_work_fn [ 244.265080] [<8085cf3c>] (__schedule) from [<8085d6b4>] (schedule+0x50/0xa8) [ 244.272135] [<8085d6b4>] (schedule) from [<8085dbe4>] (io_schedule+0x20/0x40) [ 244.279278] [<8085dbe4>] (io_schedule) from [<8025c96c>] (__lock_page+0x120/0x180) [ 244.286862] [<8025c96c>] (__lock_page) from [<802c18e0>] (migrate_pages+0x504/0x95c) [ 244.294608] [<802c18e0>] (migrate_pages) from [<8026b328>] (alloc_contig_range+0x1c0/0x3b4) [ 244.302971] [<8026b328>] (alloc_contig_range) from [<802cca74>] (cma_alloc+0x128/0x2c8) [ 244.310984] [<802cca74>] (cma_alloc) from [<801950b8>] (dma_alloc_from_contiguous+0x48/0x4c) [ 244.319466] [<801950b8>] (dma_alloc_from_contiguous) from [<801181e8>] (__alloc_from_contiguous.constprop.0+0x48/0xc4) [ 244.330171] [<801181e8>] (__alloc_from_contiguous.constprop.0) from [<801182a0>] (cma_allocator_alloc+0x3c/0x44) [ 244.340352] [<801182a0>] (cma_allocator_alloc) from [<80117648>] (__dma_alloc+0x158/0x320) [ 244.348629] [<80117648>] (__dma_alloc) from [<801178a8>] (arm_dma_alloc+0x50/0x58) [ 244.356197] [<801178a8>] (arm_dma_alloc) from [<802b7798>] (dma_pool_alloc+0x180/0x2b0) [ 244.364211] [<802b7798>] (dma_pool_alloc) from [<8052c2f4>] (bcm2835_dma_create_cb_chain+0x248/0x314) [ 244.373438] [<8052c2f4>] (bcm2835_dma_create_cb_chain) from [<8052c8a0>] (bcm2835_dma_prep_slave_sg+0x140/0x324) [ 244.383624] [<8052c8a0>] (bcm2835_dma_prep_slave_sg) from [<806d2280>] (bcm2835_sdhost_request+0x244/0x728) [ 244.393372] [<806d2280>] (bcm2835_sdhost_request) from [<806b12c4>] (__mmc_start_request+0x88/0x190) [ 244.402511] [<806b12c4>] (__mmc_start_request) from [<806b21a4>] (mmc_start_request+0x94/0xb8) [ 244.411129] [<806b21a4>] (mmc_start_request) from [<806c5c5c>] (mmc_blk_mq_issue_rq+0x2f8/0x8dc) [ 244.419921] [<806c5c5c>] (mmc_blk_mq_issue_rq) from [<806c66ec>] (mmc_mq_queue_rq+0x150/0x284) [ 244.428541] [<806c66ec>] (mmc_mq_queue_rq) from [<804a1220>] (blk_mq_dispatch_rq_list+0xd4/0x5c4) [ 244.437421] [<804a1220>] (blk_mq_dispatch_rq_list) from [<804a5e44>] (blk_mq_do_dispatch_sched+0x70/0x114) [ 244.447081] [<804a5e44>] (blk_mq_do_dispatch_sched) from [<804a6610>] (blk_mq_sched_dispatch_requests+0x14c/0x1b4) [ 244.457436] [<804a6610>] (blk_mq_sched_dispatch_requests) from [<8049f7f8>] (__blk_mq_run_hw_queue+0xb4/0x15c) [ 244.467444] [<8049f7f8>] (__blk_mq_run_hw_queue) from [<8049fa80>] (blk_mq_run_work_fn+0x28/0x2c) [ 244.476314] [<8049fa80>] (blk_mq_run_work_fn) from [<8013c3a4>] (process_one_work+0x200/0x4e0) [ 244.484938] [<8013c3a4>] (process_one_work) from [<8013d470>] (worker_thread+0x4c/0x568) [ 244.493037] [<8013d470>] (worker_thread) from [<80142f64>] (kthread+0x140/0x170) [ 244.500439] [<80142f64>] (kthread) from [<801010ac>] (ret_from_fork+0x14/0x28) [ 244.507664] Exception stack(0xb617bfb0 to 0xb617bff8) [ 244.512709] bfa0: 00000000 00000000 00000000 00000000 [ 244.520889] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 244.529071] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 244.535702] INFO: task jbd2/mmcblk0p2-:162 blocked for more than 120 seconds. [ 244.535707] Tainted: G C 4.19.126-v7 #1 [ 244.548565] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 244.556388] jbd2/mmcblk0p2- D 0 162 2 0x00000000 [ 244.561893] [<8085cf3c>] (__schedule) from [<8085d6b4>] (schedule+0x50/0xa8) [ 244.568947] [<8085d6b4>] (schedule) from [<8085dbe4>] (io_schedule+0x20/0x40) [ 244.576082] [<8085dbe4>] (io_schedule) from [<8025c418>] (wait_on_page_bit+0x12c/0x168) [ 244.584098] [<8025c418>] (wait_on_page_bit) from [<8025c53c>] (__filemap_fdatawait_range+0xe8/0x138) [ 244.593238] [<8025c53c>] (__filemap_fdatawait_range) from [<8025c5ec>] (filemap_fdatawait_range_keep_errors+0x28/0x54) [ 244.603941] [<8025c5ec>] (filemap_fdatawait_range_keep_errors) from [<803def54>] (jbd2_journal_commit_transaction+0xa34/0x197c) [ 244.615427] [<803def54>] (jbd2_journal_commit_transaction) from [<803e57e0>] (kjournald2+0xfc/0x2f0) [ 244.624567] [<803e57e0>] (kjournald2) from [<80142f64>] (kthread+0x140/0x170) [ 244.631710] [<80142f64>] (kthread) from [<801010ac>] (ret_from_fork+0x14/0x28) [ 244.638934] Exception stack(0x82fe1fb0 to 0x82fe1ff8) [ 244.643979] 1fa0: 00000000 00000000 00000000 00000000 [ 244.652177] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 244.660358] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 244.666985] INFO: task systemd-journal:191 blocked for more than 120 seconds. [ 244.674114] Tainted: G C 4.19.126-v7 #1 [ 244.679776] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 244.687608] systemd-journal D 0 191 1 0x00000001 [ 244.693101] [<8085cf3c>] (__schedule) from [<8085d6b4>] (schedule+0x50/0xa8) [ 244.700157] [<8085d6b4>] (schedule) from [<8085dbe4>] (io_schedule+0x20/0x40) [ 244.707298] [<8085dbe4>] (io_schedule) from [<8085e090>] (bit_wait_io+0x1c/0x68) [ 244.714691] [<8085e090>] (bit_wait_io) from [<8085dc74>] (__wait_on_bit+0x70/0xcc) [ 244.722265] [<8085dc74>] (__wait_on_bit) from [<8085dd68>] (out_of_line_wait_on_bit+0x98/0xbc) [ 244.730883] [<8085dd68>] (out_of_line_wait_on_bit) from [<803dc640>] (do_get_write_access+0x394/0x548) [ 244.740195] [<803dc640>] (do_get_write_access) from [<803dc874>] (jbd2_journal_get_write_access+0x80/0xa0) [ 244.749856] [<803dc874>] (jbd2_journal_get_write_access) from [<80376474>] (__ext4_journal_get_write_access+0x44/0x80) [ 244.760559] [<80376474>] (__ext4_journal_get_write_access) from [<803961dc>] (ext4_reserve_inode_write+0x84/0xb8) [ 244.770828] [<803961dc>] (ext4_reserve_inode_write) from [<80396430>] (ext4_mark_inode_dirty+0x5c/0x200) [ 244.780314] [<80396430>] (ext4_mark_inode_dirty) from [<8039be54>] (ext4_dirty_inode+0x54/0x70) [ 244.789020] [<8039be54>] (ext4_dirty_inode) from [<80305268>] (__mark_inode_dirty+0x50/0x418) [ 244.797553] [<80305268>] (__mark_inode_dirty) from [<802efe50>] (generic_update_time+0xf4/0x11c) [ 244.806337] [<802efe50>] (generic_update_time) from [<802f033c>] (file_update_time+0x130/0x160) [ 244.815041] [<802f033c>] (file_update_time) from [<8039c09c>] (ext4_page_mkwrite+0x7c/0x578) [ 244.823487] [<8039c09c>] (ext4_page_mkwrite) from [<802982f8>] (do_page_mkwrite+0x4c/0xf8) [ 244.831757] [<802982f8>] (do_page_mkwrite) from [<8029b7f8>] (do_wp_page+0x2d0/0x610) [ 244.839592] [<8029b7f8>] (do_wp_page) from [<8029ce90>] (handle_mm_fault+0x6f4/0xd30) [ 244.847430] [<8029ce90>] (handle_mm_fault) from [<808632ac>] (do_page_fault+0x154/0x3a0) [ 244.855533] [<808632ac>] (do_page_fault) from [<801168ac>] (do_DataAbort+0x60/0x108) [ 244.863285] [<801168ac>] (do_DataAbort) from [<80101ca4>] (__dabt_usr+0x44/0x60) [ 244.870685] Exception stack(0xb6145fb0 to 0xb6145ff8) [ 244.875730] 5fa0: 752545e0 755cbcd0 00000000 018e0d48 [ 244.883910] 5fc0: 0133baf8 003776f0 0131cbff 00000000 00000001 7e9bc200 0000002f 00000000 [ 244.892089] 5fe0: 00000000 7e9bc1a8 76e71415 76e71428 40000030 ffffffff [ 366.556631] INFO: task systemd:1 blocked for more than 120 seconds. [ 366.562913] Tainted: G C 4.19.126-v7 #1 [ 366.568601] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 366.576429] systemd D 0 1 0 0x00000000 [ 366.581940] [<8085cf3c>] (__schedule) from [<8085d6b4>] (schedule+0x50/0xa8) [ 366.588995] [<8085d6b4>] (schedule) from [<8085dbe4>] (io_schedule+0x20/0x40) [ 366.596129] [<8085dbe4>] (io_schedule) from [<804a455c>] (blk_mq_get_tag+0x180/0x2e0) [ 366.603967] [<804a455c>] (blk_mq_get_tag) from [<8049e87c>] (blk_mq_get_request+0xf8/0x3b0) [ 366.612325] [<8049e87c>] (blk_mq_get_request) from [<804a1c10>] (blk_mq_make_request+0xf8/0x520) [ 366.621118] [<804a1c10>] (blk_mq_make_request) from [<804938f8>] (generic_make_request+0x154/0x38c) [ 366.630171] [<804938f8>] (generic_make_request) from [<80493b78>] (submit_bio+0x48/0x1b8) [ 366.638355] [<80493b78>] (submit_bio) from [<802af1a4>] (__swap_writepage+0x168/0x3c0) [ 366.646269] [<802af1a4>] (__swap_writepage) from [<802af43c>] (swap_writepage+0x40/0x90) [ 366.654365] [<802af43c>] (swap_writepage) from [<80279e00>] (shrink_page_list+0x828/0xfb4) [ 366.662637] [<80279e00>] (shrink_page_list) from [<8027ad6c>] (shrink_inactive_list+0x244/0x714) [ 366.671429] [<8027ad6c>] (shrink_inactive_list) from [<8027bab4>] (shrink_node_memcg+0x394/0x770) [ 366.680309] [<8027bab4>] (shrink_node_memcg) from [<8027bf7c>] (shrink_node+0xec/0x4f0) [ 366.688317] [<8027bf7c>] (shrink_node) from [<8027c4a8>] (do_try_to_free_pages+0x128/0x3f4) [ 366.696673] [<8027c4a8>] (do_try_to_free_pages) from [<8027c954>] (try_to_free_pages+0x1e0/0x4a4) [ 366.705542] [<8027c954>] (try_to_free_pages) from [<8026961c>] (__alloc_pages_nodemask+0x5f8/0x111c) [ 366.714683] [<8026961c>] (__alloc_pages_nodemask) from [<802b99a8>] (new_slab+0x488/0x638) [ 366.722954] [<802b99a8>] (new_slab) from [<802bb41c>] (___slab_alloc.constprop.0+0x358/0x54c) [ 366.731486] [<802bb41c>] (___slab_alloc.constprop.0) from [<802bb654>] (__slab_alloc.constprop.0+0x44/0x90) [ 366.741231] [<802bb654>] (__slab_alloc.constprop.0) from [<802bf160>] (__kmalloc_track_caller+0x180/0x2c8) [ 366.750892] [<802bf160>] (__kmalloc_track_caller) from [<8074a010>] (__kmalloc_reserve.constprop.0+0x34/0x78) [ 366.760817] [<8074a010>] (__kmalloc_reserve.constprop.0) from [<8074a0d0>] (__alloc_skb+0x7c/0x16c) [ 366.769870] [<8074a0d0>] (__alloc_skb) from [<8074adf4>] (alloc_skb_with_frags+0x50/0x1a0) [ 366.778139] [<8074adf4>] (alloc_skb_with_frags) from [<80743380>] (sock_alloc_send_pskb+0x21c/0x23c) [ 366.787282] [<80743380>] (sock_alloc_send_pskb) from [<80833218>] (unix_stream_sendmsg+0x228/0x3a8) [ 366.796330] [<80833218>] (unix_stream_sendmsg) from [<8073e830>] (sock_sendmsg+0x24/0x34) [ 366.804513] [<8073e830>] (sock_sendmsg) from [<8073f130>] (___sys_sendmsg+0x248/0x25c) [ 366.812434] [<8073f130>] (___sys_sendmsg) from [<807402e8>] (__sys_sendmsg+0x64/0xa8) [ 366.820275] [<807402e8>] (__sys_sendmsg) from [<80740348>] (sys_sendmsg+0x1c/0x20) [ 366.827851] [<80740348>] (sys_sendmsg) from [<80101000>] (ret_fast_syscall+0x0/0x28) [ 366.835586] Exception stack(0xb70edfa8 to 0xb70edff0) [ 366.840641] dfa0: 00004040 00000035 00000035 7ed72840 00004040 00000000 [ 366.848821] dfc0: 00004040 00000035 76f694d0 00000128 00000002 7ed727c0 7ed72840 00000000 [ 366.856999] dfe0: 00000128 7ed72798 76d3c929 76d3db26 [ 366.862059] INFO: task kworker/2:0:22 blocked for more than 120 seconds. [ 366.868761] Tainted: G C 4.19.126-v7 #1 [ 366.874412] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 366.882242] kworker/2:0 D 0 22 2 0x00000000 [ 366.887737] Workqueue: events_freezable mmc_rescan [ 366.892531] [<8085cf3c>] (__schedule) from [<8085d6b4>] (schedule+0x50/0xa8) [ 366.899586] [<8085d6b4>] (schedule) from [<806b18d8>] (__mmc_claim_host+0xe4/0x238) [ 366.907247] [<806b18d8>] (__mmc_claim_host) from [<806b1a64>] (mmc_get_card+0x38/0x3c) [ 366.915161] [<806b1a64>] (mmc_get_card) from [<806b859c>] (mmc_detect+0x24/0x7c) [ 366.922562] [<806b859c>] (mmc_detect) from [<806b3f14>] (mmc_rescan+0xe8/0x3f8) [ 366.929878] [<806b3f14>] (mmc_rescan) from [<8013c3a4>] (process_one_work+0x200/0x4e0) [ 366.937801] [<8013c3a4>] (process_one_work) from [<8013d470>] (worker_thread+0x4c/0x568) [ 366.945889] [<8013d470>] (worker_thread) from [<80142f64>] (kthread+0x140/0x170) [ 366.953290] [<80142f64>] (kthread) from [<801010ac>] (ret_from_fork+0x14/0x28) [ 366.960513] Exception stack(0xb7159fb0 to 0xb7159ff8) [ 366.965558] 9fa0: 00000000 00000000 00000000 00000000 [ 366.973738] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 366.981919] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 366.988542] INFO: task kworker/0:1:31 blocked for more than 120 seconds. [ 366.995236] Tainted: G C 4.19.126-v7 #1 [ 367.000893] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 367.008722] kworker/0:1 D 0 31 2 0x00000000 [ 367.014208] Workqueue: events usbnet_deferred_kevent [ 367.019181] [<8085cf3c>] (__schedule) from [<8085d6b4>] (schedule+0x50/0xa8) [ 367.026226] [<8085d6b4>] (schedule) from [<8085dbe4>] (io_schedule+0x20/0x40) [ 367.033367] [<8085dbe4>] (io_schedule) from [<804a455c>] (blk_mq_get_tag+0x180/0x2e0) [ 367.041203] [<804a455c>] (blk_mq_get_tag) from [<8049e87c>] (blk_mq_get_request+0xf8/0x3b0) [ 367.049562] [<8049e87c>] (blk_mq_get_request) from [<804a1c10>] (blk_mq_make_request+0xf8/0x520) [ 367.058355] [<804a1c10>] (blk_mq_make_request) from [<804938f8>] (generic_make_request+0x154/0x38c) [ 367.067408] [<804938f8>] (generic_make_request) from [<80493b78>] (submit_bio+0x48/0x1b8) [ 367.075583] [<80493b78>] (submit_bio) from [<802af1a4>] (__swap_writepage+0x168/0x3c0) [ 367.083510] [<802af1a4>] (__swap_writepage) from [<802af43c>] (swap_writepage+0x40/0x90) [ 367.091609] [<802af43c>] (swap_writepage) from [<80279e00>] (shrink_page_list+0x828/0xfb4) [ 367.099878] [<80279e00>] (shrink_page_list) from [<8027ad6c>] (shrink_inactive_list+0x244/0x714) [ 367.108668] [<8027ad6c>] (shrink_inactive_list) from [<8027bab4>] (shrink_node_memcg+0x394/0x770) [ 367.117546] [<8027bab4>] (shrink_node_memcg) from [<8027bf7c>] (shrink_node+0xec/0x4f0) [ 367.125546] [<8027bf7c>] (shrink_node) from [<8027c4a8>] (do_try_to_free_pages+0x128/0x3f4) [ 367.133901] [<8027c4a8>] (do_try_to_free_pages) from [<8027c954>] (try_to_free_pages+0x1e0/0x4a4) [ 367.142778] [<8027c954>] (try_to_free_pages) from [<8026961c>] (__alloc_pages_nodemask+0x5f8/0x111c) [ 367.151918] [<8026961c>] (__alloc_pages_nodemask) from [<8028e988>] (kmalloc_order+0x30/0x70) [ 367.160447] [<8028e988>] (kmalloc_order) from [<8028e9f4>] (kmalloc_order_trace+0x2c/0xd0) [ 367.168716] [<8028e9f4>] (kmalloc_order_trace) from [<802bf23c>] (__kmalloc_track_caller+0x25c/0x2c8) [ 367.177948] [<802bf23c>] (__kmalloc_track_caller) from [<8074a010>] (__kmalloc_reserve.constprop.0+0x34/0x78) [ 367.187874] [<8074a010>] (__kmalloc_reserve.constprop.0) from [<8074a0d0>] (__alloc_skb+0x7c/0x16c) [ 367.196927] [<8074a0d0>] (__alloc_skb) from [<8074a4b4>] (__netdev_alloc_skb+0x4c/0x154) [ 367.205016] [<8074a4b4>] (__netdev_alloc_skb) from [<8063d0e4>] (rx_submit+0xdc/0x238) [ 367.212939] [<8063d0e4>] (rx_submit) from [<8063e6a0>] (usbnet_deferred_kevent+0x9c/0x330) [ 367.221211] [<8063e6a0>] (usbnet_deferred_kevent) from [<8013c3a4>] (process_one_work+0x200/0x4e0) [ 367.230179] [<8013c3a4>] (process_one_work) from [<8013d470>] (worker_thread+0x4c/0x568) [ 367.238274] [<8013d470>] (worker_thread) from [<80142f64>] (kthread+0x140/0x170) [ 367.245668] [<80142f64>] (kthread) from [<801010ac>] (ret_from_fork+0x14/0x28) [ 367.252890] Exception stack(0xb7267fb0 to 0xb7267ff8) [ 367.257945] 7fa0: 00000000 00000000 00000000 00000000 [ 367.266117] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 367.274296] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 367.280924] INFO: task kswapd0:42 blocked for more than 120 seconds. [ 367.287283] Tainted: G C 4.19.126-v7 #1 [ 367.292934] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 367.300765] kswapd0 D 0 42 2 0x00000000 [ 367.306256] [<8085cf3c>] (__schedule) from [<8085d6b4>] (schedule+0x50/0xa8) [ 367.313309] [<8085d6b4>] (schedule) from [<8085dbe4>] (io_schedule+0x20/0x40) [ 367.320449] [<8085dbe4>] (io_schedule) from [<804a455c>] (blk_mq_get_tag+0x180/0x2e0) [ 367.328286] [<804a455c>] (blk_mq_get_tag) from [<8049e87c>] (blk_mq_get_request+0xf8/0x3b0) [ 367.336653] [<8049e87c>] (blk_mq_get_request) from [<804a1c10>] (blk_mq_make_request+0xf8/0x520) [ 367.345439] [<804a1c10>] (blk_mq_make_request) from [<804938f8>] (generic_make_request+0x154/0x38c) [ 367.354491] [<804938f8>] (generic_make_request) from [<80493b78>] (submit_bio+0x48/0x1b8) [ 367.362675] [<80493b78>] (submit_bio) from [<802af1a4>] (__swap_writepage+0x168/0x3c0) [ 367.370597] [<802af1a4>] (__swap_writepage) from [<802af43c>] (swap_writepage+0x40/0x90) [ 367.378692] [<802af43c>] (swap_writepage) from [<80279e00>] (shrink_page_list+0x828/0xfb4) [ 367.386961] [<80279e00>] (shrink_page_list) from [<8027ad6c>] (shrink_inactive_list+0x244/0x714) [ 367.395742] [<8027ad6c>] (shrink_inactive_list) from [<8027bab4>] (shrink_node_memcg+0x394/0x770) [ 367.404617] [<8027bab4>] (shrink_node_memcg) from [<8027bf7c>] (shrink_node+0xec/0x4f0) [ 367.412625] [<8027bf7c>] (shrink_node) from [<8027d474>] (kswapd+0x41c/0x808) [ 367.419764] [<8027d474>] (kswapd) from [<80142f64>] (kthread+0x140/0x170) [ 367.426549] [<80142f64>] (kthread) from [<801010ac>] (ret_from_fork+0x14/0x28) [ 367.433771] Exception stack(0xb73d7fb0 to 0xb73d7ff8) [ 367.438824] 7fa0: 00000000 00000000 00000000 00000000 [ 367.447006] 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 367.455177] 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ```

One could suspect undervoltage here, but it happens also in the (Compute Module IO Board)[https://www.raspberrypi.com/products /compute-module-io-board-v3] with the official power supply (as well as in our own custom hardware).

Questions

  1. What can make the mmc get stuck in mmc_rescan?
  2. Will upgrading the kernel likely help? Some people report this issue with 5.x too, and upgrading will be hard for us for various reasons.
  3. And most importantly: How to continue debugging this?

Also thank you all for the work on this layer. Hope this is the (a?) right place to ask.

agherzan commented 2 years ago

I would firstly try to reproduce this with an io stress program. Once that is easy to trigger, attach a serial and reproduce over serial and see if that freezes too. Try to see if that gives you a bit more insight. Nevertheless, this is most probably a kernel bug so I'd encourage you to raise it there as well. Have you tried to reproduce it on later kernel versions? If it works, we could try to bisect too.

sahib commented 2 years ago

Hi @agherzan,

thanks for the input. I tried to run stress-ng --hdd 16 --io 16 on the system. This caused quite a bit of lockup, but nothing fatal yet. I still could connect and (very slowly) execute file system related commands. Maybe it's more caused by specific IO patterns? I will do more testing here.

What had more effect though was adding this dtoverlay (I guess you can do it also with dtparam in config.txt [1]):

/dts-v1/;
/plugin/;

/ {
    compatible = "brcm,bcm2708";
    fragment@0 {
        target = <&sdhost>;
        __overlay__ {
            non-removable;
                        brcm,force-pio;
        };
    };
};

This disables DMA in the sdhost driver and also disables the periodic mmc_rescan check that seemed to be part of the freeze. This had the effect of having ~3-4 freezes in a day on a few test systems to only having 1 in 2 days. Could be due to the slower I/O in general though due to disabled DMA. I will check what half of the change actually triggered the benefits.

Have you tried to reproduce it on later kernel versions?

Not yet, we had a bit of trouble updating to 5.x due to some u-boot issues last time which were hard to debug too. Maybe we need to go that route though. Others reported same issues in 5.15 though so I had low hopes. But likely one of the next steps...

[1]: See overlay docs here - the sd_force_pio and sd_poll_once params specifically.

agherzan commented 2 years ago

@sahib Is there anyway you can try to reproduce on RaspberryPi OS?

sahib commented 2 years ago

Hmm, would be kinda hard. We have quite a bit of changes in our OS that is required for the apps we run on top of it. Rebuilding this on-top RaspberryPi OS would require considerable work and time. Or put differently: Maybe as desperate action if there's no progress, but not as next step. But yes, I consider it likely that we (or some other layer) did something that differs from RPi OS. My plan was to build kernel 5.4 (+ fixing some stuff) and see if the situation improves.

Edit: Also we noticed that shutting down a few apps also made the freeze less likely, although we could not identify a specific apps causing problems. Data only showed a "the more the merrier" in regards to freezes.

agherzan commented 2 years ago

The only side I'd suspect are firmware, firmware config and kernel. Are you using uboot? That might also have an impact.

sahib commented 2 years ago

We tried different firmware versions already, but to no avail. The firmware config is rather basic, as you can see in the first post. No overclocking and no special settings really. Yes, we do use u-boot from meta-updater-raspberrypi. Where do you see a connection there?

sahib commented 2 years ago

Hmm, it really seems that the device tree overlay I posted above fixed the issue for us. We have only one device doing strange stuff after some days, but we suspect some other (maybe hardware) issue on that one. The symptoms differ a bit (full lockup, not even reacting to sysrq keys, no activity LED) and 5 other devices run fine through the whole time. I'll post again if that changes - really hard to confidently claim the freeze does not happen again.

From some testing we also know that the non-removable part does not seem strictly necessary (although testing time was relatively short), but the brcm,force-pio part is definitely necessary since we had a freeze again after removing this. It is yet unclear if this is a real fix or just makes the freeze less likely due to the slower IO. It probably makes sense to share this info over on the issue tracker for the rpi-kernel and test again when we eventually updated. I'll do this when I have some spare minutes.

While I'm happy this seems fixed, I would still be grateful for any pointers on why forcing PIO seems to fix this.

agherzan commented 2 years ago

This is an interesting find and I remember seeing this set for RPi4 in the main dt. We would need to ask the rpi kernel team for some pointers here. I wouldn't know without debugging this a bit.

agherzan commented 2 years ago

Also, have you seen https://github.com/raspberrypi/linux/issues/1536? Looks similar.

sahib commented 2 years ago

This is an interesting find and I remember seeing this set for RPi4 in the main dt. We would need to ask the rpi kernel team for some pointers here. I wouldn't know without debugging this a bit.

Alright, I still need to create a ticket over there. Got sidetracked a little & hope to get time for that soon.

Also, have you seen https://github.com/raspberrypi/linux/issues/1536? Looks similar.

Thanks for the link. Not exactly sure if related, since their issue is rather binary: either it works or it doesn't.

In the meantime I captured another log of blocked tasks with only non-removable set. They differ quite a lot from the one without non-removable (which had mmc_rescan and only a few other tasks in them). There are a lot of tasks and the memory info at the top seems to indicate that there's enough free pages. A bit of swap was used and applications/tasks using it seem to be stuck completely too. Another bunch of tasks are stuck in rpi_firmware_transaction which is interesting since they should not be blocked by MMC.

freeze-with-non-removable-but-without-force-pio.log

EDIT: In other news: We had some rare freezes still on other devices. Those did not react to SysRq keys and seem to be a different issue altogether.

sahib commented 2 years ago

Posted the issue now on the kernel issue tracker: https://github.com/raspberrypi/linux/issues/5190