[CML] rtcwake: aplay task refusing to freeze

marc-hb commented 2 years ago

Observed twice in a row: first on jf-cml-hel-rt5682-01 then on jf-cml-hel-rt5682-02

https://sof-ci.01.org/sofpr/PR5207/build11616/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5 https://sof-ci.01.org/sofpr/PR5207/build11618/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5 https://sof-ci.01.org/sofpr/PR5227/build11700/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5 https://sof-ci.01.org/sofpr/PR5202/build11725/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5

check-suspend-resume-with-capture-5.sh passed both times.

Similar issues: #2433 #2561

Start Time: 2022-01-14 02:20:28 UTC End Time: 2022-01-14 02:56:57 UTC Kernel Commit: ba28f9bb SOF Commit: 6b8b1cee16bc

[ 1319.485729] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1319.485779] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1321.647110] kernel: ax88179_178a 2-2.3:1.0 enx000ec66a7f60: ax88179 - Link status is: 1
[ 1323.971977] kernel: PM: suspend entry (s2idle)
[ 1323.972443] kernel: Filesystems sync: 0.000 seconds
[ 1323.976168] kernel: Freezing user space processes ... 
[ 1343.985707] kernel: Freezing of tasks failed after 20.009 seconds (1 tasks refusing to freeze, wq_busy=0):
[ 1343.986049] kernel: task:aplay           state:D stack:12928 pid:23088 ppid: 23012 flags:0x00000004
[ 1343.986069] kernel: Call Trace:
[ 1343.986074] kernel:  
[ 1343.986084] kernel:  __schedule+0x2ac/0x850
[ 1343.986109] kernel:  schedule+0x49/0xc0
[ 1343.986120] kernel:  schedule_timeout+0x1d5/0x340
[ 1343.986130] kernel:  ? del_timer_sync+0x80/0x80
[ 1343.986155] kernel:  snd_power_ref_and_wait+0xba/0x110 [snd]
[ 1343.986182] kernel:  ? wake_up_q+0x80/0x80
[ 1343.986199] kernel:  snd_power_wait+0xd/0x50 [snd]
[ 1343.986222] kernel:  snd_pcm_common_ioctl+0x71/0x11c0 [snd_pcm]
[ 1343.986254] kernel:  ? selinux_file_ioctl+0x16f/0x210
[ 1343.986269] kernel:  ? lock_release+0x14b/0x3f0
[ 1343.986287] kernel:  snd_pcm_ioctl+0x1e/0x30 [snd_pcm]
[ 1343.986310] kernel:  __x64_sys_ioctl+0x7b/0xb0
[ 1343.986326] kernel:  do_syscall_64+0x38/0x90
[ 1343.986335] kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1343.986345] kernel: RIP: 0033:0x7fb3b1f1350b
[ 1343.986352] kernel: RSP: 002b:00007ffcd8781b38 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1343.986362] kernel: RAX: ffffffffffffffda RBX: 000055aa695caef0 RCX: 00007fb3b1f1350b
[ 1343.986367] kernel: RDX: 00007ffcd8781b40 RSI: 0000000040184150 RDI: 0000000000000004
[ 1343.986372] kernel: RBP: 000055aa695cae70 R08: 0000000000000000 R09: 000055aa695cc120
[ 1343.986378] kernel: R10: 000055aa695b2010 R11: 0000000000000246 R12: 000055aa695cc100
[ 1343.986382] kernel: R13: 0000000000000010 R14: 0000000000000000 R15: 0000000000001000
[ 1343.986402] kernel:  
[ 1343.986411] kernel: 
[ 1343.986414] kernel: OOM killer enabled.
[ 1343.986417] kernel: Restarting tasks ... done.
[ 1343.994689] kernel: PM: suspend exit
[ 1343.999179] kernel: rfkill: input handler enabled
[ 1344.053935] kernel: rfkill: input handler disabled
[ 1345.375290] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x90040000: GLB_TRACE_MSG: FILTER_UPDATE
[ 1345.375536] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x90040000: GLB_TRACE_MSG: FILTER_UPDATE

plbossart commented 2 years ago

also seen in https://sof-ci.01.org/linuxpr/PR3338/build6982/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5

this could be the result of an upstream merge. The call trace is new but in the past when we had rtcwake errors they seemed to be 100% correlated with this 'task refusing to freeze' issue.

plbossart commented 2 years ago

yeah, this is not a new error, it's just that we can see it Commit 8412dbd642585107e29b20920876de98e7df5819 ('PM / sleep: Show freezing tasks that caused a suspend abort') added more information.

from the commit message it seems we have a task that is still freezing while we get a wake-up, a possible indication of a race condition.

This is excellent that we can trap this instead of seeing rtcwake fail miserably.

plbossart commented 2 years ago

If I look at the call trace above, it seems the 'aplay' userspace task initiates an ioctl while we are trying to freeze all tasks and suspend, and that ioctl will try to bring the card back in D0 state.

that looks like a race condition to me, and I am not sure why a userspace task is permitted to do this.

if I look at snd_power_ref_and_wait() it seems there's a possibility of an infinite loop where we never return anything. Going back to D0 state could take some time, typically hundreds of ms, so maybe this happens because we report an error freezing a task because a power-related task is in progress.

@tiwai @perexg I could use your guidance on this. Not sure what is happening here.

tiwai commented 2 years ago

Hrm, weird. It's the place where suspend_prepare() calling the freezer, right? If so, the card power state must not be other than D0 at that point, as it's before the actual device suspend procedure takes place. (For ASoC, soc_suspend() calls snd_power_change_state() with D3hot, and this should be called after the freezer.)

The task flags (0x04) shows PF_EXITING, so this is a process being shut down, as it seems. But it still doesn't clarify why the loop for power D0 matters... We need to check whether it's really in that loop waiting for the power state D0 or not.

tiwai commented 2 years ago

Does it have any influence if we replace wake_up(&card->power_sleep) with wake_up_all(&card->power_sleep) in snd_power_change_state() defined in include/sound/core.h?

tiwai commented 2 years ago

Or a patch like below might help? 0001-ALSA-core-Simplify-sndpower-with-the-standard-macr.patch.txt

plbossart commented 2 years ago

@tiwai I don't fully follow your explanations - suspend_prepare doesn't seem to be part of sound/ and not clear either on what ASoC does with D3hot. I clearly don't have any understanding of the freeze part.

I added a simple check to trace when we start the loop to go back to D0, we'll see if this confirms the analysis. https://github.com/thesofproject/linux/pull/3374

tiwai commented 2 years ago

Yes, that'll be helpful for understanding the situation. Also, it'd be good to have debug prints at snd_power_change_state(), too.

plbossart commented 2 years ago

it seems we have a 100% reproducible case on a Chromebook, see https://sof-ci.01.org/linuxpr/PR3374/build6987/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5

dmesg.log

[ 1080.546679] kernel: Restarting tasks ... 
[ 1080.549004] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_ref_and_wait: starting loop, power state 512
[ 1080.555947] kernel: done.
[ 1080.558166] kernel: PM: suspend exit
[ 1080.560144] kernel: rfkill: input handler enabled
[ 1080.576412] kernel: rfkill: input handler disabled
[ 1080.593997] kernel: rt5682 i2c-10EC5682:00: div[0]=1
[ 1080.594003] kernel: rt5682 i2c-10EC5682:00: div[1]=2
[ 1081.040086] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1081.040123] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1081.540251] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1081.540293] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1085.770336] kernel: PM: suspend entry (s2idle)
[ 1085.771523] kernel: Filesystems sync: 0.001 seconds
[ 1085.773382] kernel: Freezing user space processes ... 
[ 1105.774200] kernel: Freezing of tasks failed after 20.005 seconds (1 tasks refusing to freeze, wq_busy=0):
[ 1105.774508] kernel: task:aplay           state:D stack:12928 pid:22896 ppid: 22820 flags:0x00000004
[ 1105.774526] kernel: Call Trace:
[ 1105.774530] kernel:  
[ 1105.774540] kernel:  __schedule+0x2ac/0x850
[ 1105.774562] kernel:  schedule+0x49/0xc0
[ 1105.774572] kernel:  schedule_timeout+0x1d5/0x340
[ 1105.774581] kernel:  ? del_timer_sync+0x80/0x80
[ 1105.774604] kernel:  snd_power_ref_and_wait+0xdd/0x130 [snd]
[ 1105.774626] kernel:  ? wake_up_q+0x80/0x80
[ 1105.774641] kernel:  snd_power_wait+0xd/0x50 [snd]
[ 1105.774663] kernel:  snd_pcm_common_ioctl+0x71/0x11c0 [snd_pcm]
[ 1105.774691] kernel:  ? selinux_file_ioctl+0x16f/0x210
[ 1105.774704] kernel:  ? lock_release+0x14b/0x3f0
[ 1105.774721] kernel:  snd_pcm_ioctl+0x1e/0x30 [snd_pcm]
[ 1105.774742] kernel:  __x64_sys_ioctl+0x7b/0xb0
[ 1105.774756] kernel:  do_syscall_64+0x38/0x90
[ 1105.774764] kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 1105.774773] kernel: RIP: 0033:0x7f873608b50b
[ 1105.774780] kernel: RSP: 002b:00007fff07897bb8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1105.774788] kernel: RAX: ffffffffffffffda RBX: 0000559a3302bef0 RCX: 00007f873608b50b
[ 1105.774793] kernel: RDX: 00007fff07897bc0 RSI: 0000000040184150 RDI: 0000000000000004
[ 1105.774797] kernel: RBP: 0000559a3302be70 R08: 0000000000000000 R09: 0000559a3302d120
[ 1105.774802] kernel: R10: 0000559a33013010 R11: 0000000000000246 R12: 0000559a3302d100
[ 1105.774806] kernel: R13: 0000000000000010 R14: 0000000000000000 R15: 0000000000001000

This looks like a sequence of suspend-resumes where we try to restore the D0 state when resuming but somehow never reach that state, and we suspend with an active thread?

tiwai commented 2 years ago

ASoC does the resume in a deferred work, and that might be the reason? The work is synced at snd_soc_suspend(), but it's called after user process freezing, hence it's too late.

plbossart commented 2 years ago

Thanks @tiwai. Would it change anything if we did the workqueue sync in a .prepare callback?

tiwai commented 2 years ago

I guess this won't fit,the PM prepare callback is handled at a later stage, IIRC. But I might be wrong, better to test actually, of course.

BTW, could you check whether the patch I attached in the above makes any difference?

plbossart commented 2 years ago

I added the traces in snd_power_change_state, I don't really understand the results though

https://sof-ci.01.org/linuxpr/PR3374/build6988/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5

[ 1076.188123] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: requesting state 512 in snd_power_change_state
[ 1076.188130] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: waking up power_sleep in snd_power_change_state
[ 1076.188133] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: done in snd_power_change_state
[ 1076.189413] kernel: OOM killer enabled.
[ 1076.189416] kernel: Restarting tasks ... 
[ 1076.191592] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_ref_and_wait: starting loop, power state 512
[ 1076.196926] kernel: done.
[ 1076.207334] kernel: PM: suspend exit
[ 1076.211122] kernel: rfkill: input handler enabled
[ 1076.224852] kernel: rt5682 i2c-10EC5682:00: div[0]=1
[ 1076.224859] kernel: rt5682 i2c-10EC5682:00: div[1]=2
[ 1076.231123] kernel: rfkill: input handler disabled
[ 1076.681421] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1076.681462] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1077.181638] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1077.181680] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1081.422331] kernel: PM: suspend entry (s2idle)
[ 1081.422860] kernel: Filesystems sync: 0.000 seconds
[ 1081.423863] kernel: Freezing user space processes ... 
[ 1101.437333] kernel: Freezing of tasks failed after 20.007 seconds (1 tasks refusing to freeze, wq_busy=0):
[ 1101.437674] kernel: task:aplay           state:D stack:12968 pid:22894 ppid: 22818 flags:0x00000004

I will give it a try with the .prepare step. prepare is supposed to be before any suspend/freeze. I will test your patch later today.

plbossart commented 2 years ago

@tiwai I added a .prepare and that's not helping. https://sof-ci.01.org/linuxpr/PR3374/build6991/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5 shows we have exactly the same error before the .prepare is invoked.

I'll try your patch now.

plbossart commented 2 years ago

@tiwai unfortunately your patch doesn't seem to improve the results. It's the same issue that we wait for the device to go back to D0 and start a suspend before the resume is complete. You can see in https://github.com/thesofproject/linux/pull/3374/commits/5001794c5eff06fa5ad56b490676f6c68eb7bee5 that there are two traces, and only the first is found below. the log to indicate the end of the wait is not coming out.

[ 1083.637373] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: requesting state 512 in snd_power_change_state
[ 1083.637380] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: waking up power_sleep in snd_power_change_state
[ 1083.637384] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: done in snd_power_change_state
[ 1083.638706] kernel: OOM killer enabled.
[ 1083.638709] kernel: Restarting tasks ... 
[ 1083.645772] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_ref_and_wait: starting wait, power state 512
[ 1083.649936] kernel: done.
[ 1083.657878] kernel: rfkill: input handler enabled
[ 1083.662982] kernel: PM: suspend exit
[ 1083.677169] kernel: rfkill: input handler disabled
[ 1083.685855] kernel: rt5682 i2c-10EC5682:00: div[0]=1
[ 1083.685860] kernel: rt5682 i2c-10EC5682:00: div[1]=2
[ 1084.129376] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1084.129428] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1084.629345] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1084.629396] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1088.871217] kernel: PM: suspend entry (s2idle)
[ 1088.873518] kernel: Filesystems sync: 0.002 seconds
[ 1088.875950] kernel: Freezing user space processes ... 
[ 1108.879902] kernel: Freezing of tasks failed after 20.008 seconds (1 tasks refusing to freeze, wq_busy=0):
[ 1108.880179] kernel: task:aplay           state:D stack:12968 pid:22895 ppid: 22819 flags:0x00000004
[ 1108.880195] kernel: Call Trace:
[ 1108.880199] kernel:  
[ 1108.880208] kernel:  __schedule+0x2ac/0x850
[ 1108.880229] kernel:  schedule+0x49/0xc0
[ 1108.880242] kernel:  snd_power_ref_and_wait.cold+0xdb/0xe1 [snd]
[ 1108.880266] kernel:  ? wait_woken+0x70/0x70
[ 1108.880277] kernel:  snd_power_wait+0xd/0x50 [snd]
[ 1108.880298] kernel:  snd_pcm_common_ioctl+0x71/0x11c0 [snd_pcm]
[ 1108.880325] kernel:  ? selinux_file_ioctl+0x16f/0x210
[ 1108.880338] kernel:  ? xfd_validate_state+0x18/0x70
[ 1108.880354] kernel:  snd_pcm_ioctl+0x1e/0x30 [snd_pcm]

plbossart commented 2 years ago

https://github.com/thesofproject/linux/pull/3376 removes the workqueue entirely. Let's see what happens.

plbossart commented 2 years ago

https://sof-ci.01.org/linuxpr/PR3376/build6994/devicetest/ shows no issues when ASoC resumes synchronously. Humm @lgirdwood will need your help on this one :-)

tiwai commented 2 years ago

@plbossart Thanks for checking. My patch would be still good as a cleanup, so I'm going to submit in anyway with a slight rephrasing.

For syncing the deferred resume work, we might need to use the pm notifier. Could you try the patch below? soc-deferred-resume-notify.diff.txt

ujfalusi commented 2 years ago

It fails right at the second iteration. Does the dmesg contains the first iteration's log? I was looking at the passing platforms and the logs are kind of short for the 5 test (more like one round).

In the log I can see the suspend from the previous step along with the wakeup. What makes it interesting is that we don't have the pcm: trigger stream 0 dir 0 cmd 1 after the wakeup (from the first suspend). I have tested locally and it is always coming and I don't have similar issue.

The other interesting thing is that in a working cases when we wake up we have the following ipc sequence: ipc rx: 0x70000000: FW_READY ipc tx: 0x90030000: GLB_TRACE_MSG: DMA_PARAMS_EXT ipc tx: 0x40020000: GLB_PM_MSG: CTX_RESTORE ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION ... GLB_TPLG_MSG messages

In the failing logs: ipc rx: 0x70000000: FW_READY ipc tx: 0x90030000: GLB_TRACE_MSG: DMA_PARAMS_EXT ipc tx: 0x30100000: GLB_TPLG_MSG: PIPE_NEW ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION ... GLB_TPLG_MSG messages ... ipc tx: 0x40020000: GLB_PM_MSG: CTX_RESTORE

I don't know why we are missing the trigger start on the failing case, but can that be part of the puzzle? The resume sequence also of concern, but I can not figure out why it is like that.

tiwai commented 2 years ago

@ujfalusi The issue is likely because the suspend is invoked before the deferred resume work is completed. The suspend procedure is to perform freezer at first, then process driver's suspend callbacks. And the problem is at the freezer stage, where the task seems recognized as unfreezable. So my test patch above tries to finish the deferred resume work before the freezer stage.

On the second thought, the primary problem is that the aplay task is recognized as unfreezable, although it's sticking in schedule(). The likely cause is that it's an UNINTERRUPTIBLE sleep, and there is a good reason for that; we don't want to allow user-space proceeding before the card's resume finishes.

And I looked through the code and found a possible (but hackish) workaround using lock_system_sleep(). Another test patch is below. It's less-intrusive, so this might be taken more easily, once after we confirm that it works. snd-power-lock-ignore-freezer.diff.txt

tiwai commented 2 years ago

Scratch my previous patch. On the third thought, it may cause another problem; namely, when a deferred resume finishes at snd_soc_suspend(), it wakes up the sleeping task in snd_power_wait*() unexpectedly after the freezer phase. Although the wake-up might be not processed yet at that point, it's a flaky operation...

So syncing with PM notifier might be the way to go in the end.

plbossart commented 2 years ago

I tried you patch @tiwai but something doesn't seem right, the flush doesn't complete

[   49.167567] cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: requesting state 512 in snd_power_change_state
[   49.167578] cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: waking up power_sleep in snd_power_change_state
[   49.167584] cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: done in snd_power_change_state
[   49.220919] rt5682 i2c-10EC5682:00: div[0]=1
[   49.220931] rt5682 i2c-10EC5682:00: div[1]=2
[   49.295851] OOM killer enabled.
[   49.295858] Restarting tasks ... done.
[   49.301875] cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_ref_and_wait: starting wait, power state 512
[   49.303735] plb: soc_resume_notify: start, action 4
[   49.303739] plb: soc_resume_notify: done
[   49.304002] PM: suspend exit
[   49.308326] rfkill: input handler enabled
[   49.379127] rfkill: input handler disabled
[   49.664720] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[   49.664777] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[   50.164387] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[   50.164437] sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[   52.317621] ax88179_178a 2-2.3:1.0 enx000ec66a7f60: ax88179 - Link status is: 1
[   54.853000] PM: suspend entry (s2idle)
[   54.854956] Filesystems sync: 0.001 seconds
[   54.865312] plb: soc_resume_notify: start, action 3
[   54.865328] plb: soc_resume_notify: before flush <<< STUCK PAST THIS POINT

I try adding more traces but it's just awful to debug this remotely, I have to get a local device to debug this further.

tiwai commented 2 years ago

OK, then could you try to replace flush_work(&card->deferred_resume_work) with snd_power_wait(card->snd_card) in the patch? soc-deferred-resume-notify-v2.diff.txt

plbossart commented 2 years ago

Not much better, the traces I added show the resume does not complete. it seems like we have a DAPM operation that is stuck.

    pr_warn("plb: %s: 7\n", __func__); << last trace seen

    /* Recheck all endpoints too, their state is affected by suspend */
    dapm_mark_endpoints_dirty(card);

    pr_warn("plb: %s: 7.5\n", __func__);

    snd_soc_dapm_sync(&card->dapm);

    pr_warn("plb: %s: 8\n", __func__); <<< not reached.

I added one more trace to see what happens

https://sof-ci.01.org/linuxpr/PR3374/build7002/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5

plbossart commented 2 years ago

https://sof-ci.01.org/linuxpr/PR3374/build7007/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5 shows that snd_soc_dapm_sync() never completes.

[ 1307.727716] kernel: plb: soc_resume_deferred: start
[ 1307.727721] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: requesting state 512 in snd_power_change_state
[ 1307.727725] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: waking up power_sleep in snd_power_change_state
[ 1307.727729] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: done in snd_power_change_state
[ 1307.727731] kernel: plb: soc_resume_deferred: 1
[ 1307.727733] kernel: plb: soc_resume_deferred: 2
[ 1307.768488] kernel: plb: soc_resume_deferred: 3
[ 1307.768848] kernel: plb: soc_resume_deferred: 4
[ 1307.768853] kernel: plb: soc_resume_deferred: 5
[ 1307.768855] kernel: plb: soc_resume_deferred: 6
[ 1307.768856] kernel: plb: soc_resume_deferred: 7
[ 1307.768878] kernel: plb: soc_resume_deferred: 7.5 <<< not completing.
[ 1307.770925] kernel: rt5682 i2c-10EC5682:00: div[0]=1
[ 1307.770933] kernel: rt5682 i2c-10EC5682:00: div[1]=2
[ 1308.011582] kernel: OOM killer enabled.
[ 1308.011586] kernel: Restarting tasks ... done.
[ 1308.018086] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_ref_and_wait: starting wait, power state 512
[ 1308.018864] kernel: plb: soc_resume_notify: start, action 4
[ 1308.018868] kernel: plb: soc_resume_notify: done
[ 1308.018928] kernel: PM: suspend exit
[ 1308.032658] kernel: rfkill: input handler enabled
[ 1308.090662] kernel: rfkill: input handler disabled
[ 1308.225192] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1308.225231] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1308.725171] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1308.725212] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1311.001966] kernel: ax88179_178a 2-2.3:1.0 enx000ec66a7cea: ax88179 - Link status is: 1
[ 1313.252726] kernel: PM: suspend entry (s2idle)
[ 1313.255093] kernel: Filesystems sync: 0.002 seconds
[ 1313.266913] kernel: plb: soc_resume_notify: start, action 3
[ 1313.266923] kernel: plb: soc_resume_notify: before snd_power_wait
[ 1313.266930] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_ref_and_wait: starting wait, power state 512
[ 1313.281005] kernel: rfkill: input handler enabled
[ 2801.789901] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_ref_and_wait: starting wait, power state 512

I have no idea why this might be the case.

tiwai commented 2 years ago

Hrm, that's the place using async execution, and it might be the cause.

Then, let's try go back to another approach I scratched. Below is a revised patch, adding a workaround for the resume-in-suspend problem. snd-power-lock-ignore-freezer-v2.diff.txt

I guess this will hit the similar stall, though. That is, the main cause might be rather the DAPM sync stall in the deferred resume.

ujfalusi commented 2 years ago

@tiwai, @plbossart, this might or might not help to see the DAPM sequence?

diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index 434e61b46983..099234a26fd4 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -2312,6 +2312,7 @@ int snd_soc_register_card(struct snd_soc_card *card)
    INIT_LIST_HEAD(&card->dobj_list);

    card->instantiated = 0;
+   card->pop_time = 1;
    mutex_init(&card->mutex);
    mutex_init(&card->dapm_mutex);
    mutex_init(&card->pcm_mutex);

plbossart commented 2 years ago

Hrm, that's the place using async execution, and it might be the cause.

Then, let's try go back to another approach I scratched. Below is a revised patch, adding a workaround for the resume-in-suspend problem. snd-power-lock-ignore-freezer-v2.diff.txt

I guess this will hit the similar stall, though. That is, the main cause might be rather the DAPM sync stall in the deferred resume.

I'll give it a try, but I wonder if the patch fixes the wrong problem? the issue is try to suspend while resume is not complete. it's not a 'resume-in-suspend' case, or maybe I misunderstood your wording.

tiwai commented 2 years ago

At least we can see whether it's in the core code or not. This patch will make the freezer ignoring those aplay process in the power-check loop, and proceed further. If the things work with that, indeed that's an issue about the process handling. OTOH, if the problem persists (it's more likely, I guess), it implies that the primary cause is rather the device itself that doesn't finish the resume properly at the second time.

Would the device resume properly at the second time if you don't go to suspend?

plbossart commented 2 years ago

it looks like the DAPM issue remains, it looks like no matter what we try we end-up not finishing the resume before we suspend.

https://sof-ci.01.org/linuxpr/PR3374/build7012/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5

[ 1076.123368] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x40020000: GLB_PM_MSG: CTX_RESTORE
[ 1076.123424] kernel: plb: soc_resume_deferred: start
[ 1076.123429] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: requesting state 512 in snd_power_change_state
[ 1076.123433] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: waking up power_sleep in snd_power_change_state
[ 1076.123436] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_change_state: done in snd_power_change_state
[ 1076.123439] kernel: plb: soc_resume_deferred: 1
[ 1076.123440] kernel: plb: soc_resume_deferred: 2
[ 1076.124692] kernel: OOM killer enabled.
[ 1076.124694] kernel: Restarting tasks ... 
[ 1076.127252] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_ref_and_wait: starting wait, power state 512
[ 1076.129039] kernel: done.
[ 1076.131905] kernel: PM: suspend exit
[ 1076.142481] kernel: rfkill: input handler enabled
[ 1076.156187] kernel: rfkill: input handler disabled
[ 1076.165710] kernel: plb: soc_resume_deferred: 3
[ 1076.165932] kernel: plb: soc_resume_deferred: 4
[ 1076.165936] kernel: plb: soc_resume_deferred: 5
[ 1076.165938] kernel: plb: soc_resume_deferred: 6
[ 1076.165939] kernel: plb: soc_resume_deferred: 7
[ 1076.165956] kernel: plb: soc_resume_deferred: 7.5
[ 1076.166977] kernel: rt5682 i2c-10EC5682:00: div[0]=1
[ 1076.166982] kernel: rt5682 i2c-10EC5682:00: div[1]=2
[ 1076.616650] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1076.616694] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1077.116984] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1077.117027] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc rx done: 0x90020000: GLB_TRACE_MSG: DMA_POSITION
[ 1081.363503] kernel: PM: suspend entry (s2idle)
[ 1081.363509] kernel: PM: suspend exit

plbossart commented 2 years ago

With more traces, the resume can be shown as getting stuck in dapm_seq_run()

    pr_warn("plb: %s: 10\n", __func__);  <<< reached

    /* Now power up. */
    dapm_seq_run(card, &up_list, event, true);

    pr_warn("plb: %s: 11\n", __func__); <<< not reached

https://sof-ci.01.org/linuxpr/PR3374/build7014/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5

[ 1301.115225] kernel: plb: dapm_power_widgets: 9
[ 1301.115232] kernel: plb: dapm_power_widgets: 10
[ 1301.115648] kernel: rt5682 i2c-10EC5682:00: div[0]=1
[ 1301.115660] kernel: rt5682 i2c-10EC5682:00: div[1]=2
[ 1301.326013] kernel: OOM killer enabled.
[ 1301.326019] kernel: Restarting tasks ... 
[ 1301.329894] kernel: cml_rt1011_rt5682 cml_rt1011_rt5682: plb: snd_power_ref_and_wait: starting wait, power state 512
[ 1301.331519] kernel: done.
[ 1301.333441] kernel: PM: suspend exit

ujfalusi commented 2 years ago

Curiously with DAPM pop debugging (pop_time = 1) the CI passes, but we have different fails (which might be unrelated FS corruption?)

https://sof-ci.01.org/linuxpr/PR3384/build7028/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5

PR: https://github.com/thesofproject/linux/pull/3384

plbossart commented 2 years ago

@ujfalusi we've seen those 'hung kernel tasks' errors before, I think they appeared in the last kernel cycle - I filed a new bug since it's not tracked for some reason by our CI folks https://github.com/thesofproject/linux/issues/3387

plbossart commented 2 years ago

@ujfalusi for my education, what does pop_time do? How would it change the DAPM behavior?

ujfalusi commented 2 years ago

@ujfalusi for my education, what does pop_time do? How would it change the DAPM behavior?

@plbossart, it does two things, unfortunately,

we can inject delays between DAPM steps. This is really useful for figuring out pops and clicks by slowing down DAPM.
it prints the DAPM event flow which is really useful for figuring out sequencing issues.

With pop_time = 1 we add 1 jiffies long uninterruptible sleep essentially, which can interfere for sure. We could add a hack patch to skip the sleep if pop_sleep is 1.

Let me try that...

ujfalusi commented 2 years ago

@plbossart, now it looks more promising (information wise): https://sof-ci.01.org/linuxpr/PR3384/build7031/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5

The last dapm event on resume before the suspend is:

[ 1304.376201] kernel: rt5682 i2c-10EC5682:00: pop test : HP Amp PRE_PMU

end we do not get to the POST_PMU side of the widget...

plbossart commented 2 years ago

thanks @ujfalusi, could this be a deadlock in the rt5682 driver:

static int rt5682_hp_event(struct snd_soc_dapm_widget *w,
        struct snd_kcontrol *kcontrol, int event)
{
    struct snd_soc_component *component =
        snd_soc_dapm_to_component(w->dapm);
    struct rt5682_priv *rt5682 = snd_soc_component_get_drvdata(component);

    switch (event) {
    case SND_SOC_DAPM_PRE_PMU:
        snd_soc_component_update_bits(component, RT5682_HP_CTRL_2,
            RT5682_HP_C2_DAC_AMP_MUTE, 0);
        snd_soc_component_update_bits(component, RT5682_HP_LOGIC_CTRL_2,
            RT5682_HP_LC2_SIG_SOUR2_MASK, RT5682_HP_LC2_SIG_SOUR2_REG);
        snd_soc_component_update_bits(component,
            RT5682_DEPOP_1, 0x60, 0x60);
        snd_soc_component_update_bits(component,
            RT5682_DAC_ADC_DIG_VOL1, 0x00c0, 0x0080);

        mutex_lock(&rt5682->jdet_mutex); <<< STUCK HERE?

        snd_soc_component_update_bits(component, RT5682_HP_CTRL_2,
            RT5682_HP_C2_DAC_L_EN | RT5682_HP_C2_DAC_R_EN,
            RT5682_HP_C2_DAC_L_EN | RT5682_HP_C2_DAC_R_EN);
        usleep_range(5000, 10000);
        snd_soc_component_update_bits(component, RT5682_CHARGE_PUMP_1,
            RT5682_CP_SW_SIZE_MASK, RT5682_CP_SW_SIZE_L);

        mutex_unlock(&rt5682->jdet_mutex);
        break;

tiwai commented 2 years ago

That's my suspect, too. You can try like this one. rt5682-jack-mutex-test.diff.txt

ujfalusi commented 2 years ago

The work does not have mutex used, but it is certainly there. I have pushed a patch to track the mutex uses in the rt5682 driver.

ujfalusi commented 2 years ago

From the logs:

[   35.396923] kernel: LDP: rt5682_jack_detect_handler (sound/soc/codecs/rt5682.c, 1105): ENTER (jdet_mutex)
[   35.396928] kernel: LDP: rt5682_jack_detect_handler (sound/soc/codecs/rt5682.c, 1107): ENTER (calibrate_mutex)
...
[   35.423677] kernel: rt5682 i2c-10EC5682:00: pop test : Queue HP Amp: reg=0xffffffff, 0x1/0x1
[   35.423681] kernel: rt5682 i2c-10EC5682:00: pop test : HP Amp PRE_PMU
[   35.424480] kernel: LDP: rt5682_hp_event (sound/soc/codecs/rt5682.c, 1546): ENTER (jdet_mutex)

https://sof-ci.01.org/linuxpr/PR3384/build7033/devicetest/?model=CML_HEL_RT5682&testcase=check-suspend-resume-with-playback-5

Trying the patch from @tiwai, next, keeping the prints, but this time they are pr_warn()

ujfalusi commented 2 years ago

The lock on resume is introduced by: 8deb34a90f06374fd26f722c2a79e15160f66be7 ("ASoC: rt5682: fix the wrong jack type detected")

The lock happens this way: A rt5682_resume() -> mod_delayed_work() for jack_detect_work B DAPM sequence starts ( DAPM is locked now) A1. rt5682_jack_detect_handler() scheduled Takes both jdet_mutex and calibrate_mutex Calls in to rt5682_headset_detect() which tries to take DAPM lock, it starts to wait for it as B path took it already. B1. DAPM sequence reaches teh "HP Amp", rt5682_hp_event() tries to take the jdet_mutex, but it is locked in A1, so it waits.

Deadlock.

marc-hb commented 2 years ago

Watching this investigation has been more interesting than some detective shows. However I wonder if there is with hindsight some tool and/or approach that could have found this deadlock faster? lockdep or something? It would be great to save all the experts involved a lot of time the next time.

ujfalusi commented 2 years ago

@marc-hb, CONFIG_DEBUG_LOCKDEP for sure would have been pointing to this earlier. Don't we have it enabled for CI?

ujfalusi commented 2 years ago

@tiwai, @plbossart, reverting said commit (8deb34a90f06374fd26f722c2a79e15160f66be7) is solving the issue: https://github.com/thesofproject/linux/pull/3390

Looking at the code, it is not that trivial to fix it with the three locks around, I have tried to drop the jdet_work and mutex and use the rt5682_jack_detect_handler() only, but it caused some race on module removal (the work rescheduled after cancel_delayed_work_sync(), another mutex + a flag might help, but it starts to look ugly.

tiwai commented 2 years ago

@ujfalusi Thanks for tracking this. Yes, that looks like a typical ABBA deadlock. I guess we may simply drop jdet_mutex, and replace with dapm mutex?

Also, as seen my fix attempt, cancel_delayed_work_sync() should be outside the mutex that is taken by the worker itself. Otherwise the call would be another cause of deadlock.

plbossart commented 2 years ago

@marc-hb, CONFIG_DEBUG_LOCKDEP for sure would have been pointing to this earlier. Don't we have it enabled for CI?

we have this enabled:

CONFIG_LOCKDEP_SUPPORT=y
CONFIG_LOCKDEP=y
CONFIG_LOCKDEP_BITS=15
CONFIG_LOCKDEP_CHAINS_BITS=16
CONFIG_LOCKDEP_STACK_TRACE_BITS=19
CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS=14
CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS=12
# CONFIG_DEBUG_LOCKDEP is not set

The latter says 'Lock dependency engine debugging', I am not sure if this would help. We want to detect the lock, not detect issues in the locking detection. It's straightforward to add if needed.

plbossart commented 2 years ago

Adding Realtek team; @oder-chiou @shumingfan @jack-cy-yu and @bardliao FYI.

ujfalusi commented 2 years ago

@tiwai, @plbossart, it looks like I have foujd a way to fix it w/o revert: #3392

Would you take a look if you can spot if I'm doing it wrong?

tiwai commented 2 years ago

@ujfalusi LGTM

marc-hb commented 2 years ago

I see that #3392 was merged on Jan 28 but with kernel 5.17.0-rc1-daily-default-20220202-0-13669-gbb45f689fa62 I found this DEBUG_LOCKS_WARN_ON(lock->magic != lock) by chance in the logs while looking for something else:

Feb 03 05:37:33 jf-tglu-volt-sdw-01 /usr/lib/gdm3/gdm-x-session[624]: (II) xfree86: Adding drm device (/dev/dri/card0)
Feb 03 05:37:33 jf-tglu-volt-sdw-01 /usr/lib/gdm3/gdm-x-session[624]: (II) systemd-logind: got fd for /dev/dri/card0 226:0 fd 12 paused 0
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: warn: topology ABI is more recent than kernel
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: sof_sdw sof_sdw: ASoC: Parent card not yet available, widget card binding deferred
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: sof_sdw sof_sdw: hda_dsp_hdmi_build_controls: no PCM in topology for HDMI converter 4
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: sof_sdw sof_sdw: hda_dsp_hdmi_build_controls: no PCM in topology for HDMI converter 5
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: sof_sdw sof_sdw: hda_dsp_hdmi_build_controls: no PCM in topology for HDMI converter 6
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: sof_sdw sof_sdw: hda_dsp_hdmi_build_controls: no PCM in topology for HDMI converter 7
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: sof_sdw sof_sdw: hda_dsp_hdmi_build_controls: no PCM in topology for HDMI converter 8
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: ------------[ cut here ]------------
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: DEBUG_LOCKS_WARN_ON(lock->magic != lock)
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: WARNING: CPU: 3 PID: 186 at kernel/locking/mutex.c:579 __mutex_lock+0x8ee/0x910
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: Modules linked in: fuse snd_soc_sof_sdw snd_soc_intel_hda_dsp_common snd_soc_intel_sof_maxim_common snd_usb_audio snd_usbmidi_lib snd_h>
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel:  int340x_thermal_zone int3400_thermal acpi_thermal_rel intel_pmc_core squashfs drm drm_panel_orientation_quirks efivarfs spi_pxa2xx_pla>
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: CPU: 3 PID: 186 Comm: kworker/3:2 Not tainted 5.17.0-rc1-daily-default-20220202-0-13669-gbb45f689fa62 #2
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: Hardware name: Google Volteer/Volteer, BIOS  06/30/2020
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: Workqueue: events_power_efficient rt5682_jack_detect_handler [snd_soc_rt5682]
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: RIP: 0010:__mutex_lock+0x8ee/0x910
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: Code: c0 0f 84 a0 f7 ff ff 44 8b 1d d6 fe e4 00 45 85 db 0f 85 90 f7 ff ff 48 c7 c6 77 ee fa a4 48 c7 c7 d2 b4 f9 a4 e8 d5 71 f9 ff <0f>
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: RSP: 0018:ffffa0db80cffd70 EFLAGS: 00010282
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: RDX: 0000000000000001 RSI: ffffffffa4fbb461 RDI: 00000000ffffffff
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: RBP: ffffa0db80cffe10 R08: 0000000000000000 R09: ffffa0db80cffb98
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: R13: ffff953f51f61c40 R14: 0000000000000000 R15: ffff9542af9efd05
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: FS:  0000000000000000(0000) GS:ffff9542af800000(0000) knlGS:0000000000000000
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: CR2: 00007fd326f2a220 CR3: 0000000105162005 CR4: 0000000000770ee0
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: PKRU: 55555554
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: Call Trace:
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel:  <TASK>
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel:  ? sched_clock_cpu+0xd/0xd0
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel:  ? rt5682_jack_detect_handler+0x84/0x2e0 [snd_soc_rt5682]
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel:  ? rt5682_jack_detect_handler+0x84/0x2e0 [snd_soc_rt5682]
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel:  rt5682_jack_detect_handler+0x84/0x2e0 [snd_soc_rt5682]
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel:  process_one_work+0x297/0x530
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel:  worker_thread+0x48/0x3c0
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel:  ? process_one_work+0x530/0x530
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel:  kthread+0xeb/0x120
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel:  ? kthread_complete_and_exit+0x20/0x20
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel:  ret_from_fork+0x1f/0x30
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel:  </TASK>
Feb 03 05:37:33 jf-tglu-volt-sdw-01 kernel: ---[ end trace 0000000000000000 ]---
Feb 03 05:37:33 jf-tglu-volt-sdw-01 /usr/lib/gdm3/gdm-x-session[624]: (--) PCI:*(0@0:2:0) 8086:9a49:8086:9a49 rev 0, Mem @ 0x90000000/16777216, 0x80000000/268435456, I/O @ 0x00001>
Feb 03 05:37:33 jf-tglu-volt-sdw-01 /usr/lib/gdm3/gdm-x-session[624]: (II) LoadModule: "glx"
Feb 03 05:37:33 jf-tglu-volt-sdw-01 /usr/lib/gdm3/gdm-x-session[624]: (II) Loading /usr/lib/xorg/modules/extensions/libglx.so

thesofproject / linux

[CML] rtcwake: aplay task refusing to freeze #3370