Closed EvanCarroll closed 4 years ago
@EvanCarroll does this happen when you are continuously playing sound? Or just when a sound gets played after some period of idleness? Just trying to correlate with other cases, we've seen this panic in other bug reports but haven't triangulated what might cause this. Can you also point us to the results of alsa-info so that we know what platform this happens on? @lgirdwood FYI
Possible duplicate of https://github.com/thesofproject/sof/issues/2828 ?
If SOF is built as a kernel module, removing and reinserting the module should also reset the DSP and restore sound, as a workaround. No guarantees that it works (for some reason there are issues with this flow on i.MX?)
@slawblauciak @mengdonglin fyi.
I'm also getting hit by this. I usually notice it when trying to use audio (playback or mic) after several hours of not using audio, but today this happened right in the middle of playback. The sound suddenly just went silent and this was in the kernel log:
[263553.357824] sof-audio-pci 0000:00:1f.3: error : DSP panic!
[263553.357837] sof-audio-pci 0000:00:1f.3: status: fw entered - code 00000005
[263553.358092] sof-audio-pci 0000:00:1f.3: error: can't enter idle
[263553.358096] sof-audio-pci 0000:00:1f.3: error: trace point 00004000
[263553.358100] sof-audio-pci 0000:00:1f.3: error: panic at src/lib/agent.c:62
[263553.358104] sof-audio-pci 0000:00:1f.3: error: DSP Firmware Oops
[263553.358109] sof-audio-pci 0000:00:1f.3: EXCCAUSE 0x0000003f EXCVADDR 0x00000000 PS 0x00060725 SAR 0x00000000
[263553.358113] sof-audio-pci 0000:00:1f.3: EPC1 0x00000000 EPC2 0xbe00d1fe EPC3 0x00000000 EPC4 0x00000000
[263553.358117] sof-audio-pci 0000:00:1f.3: EPC5 0x00000000 EPC6 0x00000000 EPC7 0x00000000 DEPC 0x00000000
[263553.358121] sof-audio-pci 0000:00:1f.3: EPS2 0x00060d20 EPS3 0x00000000 EPS4 0x00000000 EPS5 0x00000000
[263553.358124] sof-audio-pci 0000:00:1f.3: EPS6 0x00000000 EPS7 0x00000000 INTENABL 0x00000000 INTERRU 0x00000222
[263553.358127] sof-audio-pci 0000:00:1f.3: stack dump from 0xbe05a110
[263553.358135] sof-audio-pci 0000:00:1f.3: 0xbe05a110: be05a140 00000001 be013760 00000001
[263553.358140] sof-audio-pci 0000:00:1f.3: 0xbe05a114: 00000000 00000000 0000003e be064400
[263553.358145] sof-audio-pci 0000:00:1f.3: 0xbe05a118: b1712600 c11fd8dd 000c0800 00000000
[263553.358149] sof-audio-pci 0000:00:1f.3: 0xbe05a11c: 0dead000 00000000 e14c6018 ffff9e17
[263553.358154] sof-audio-pci 0000:00:1f.3: 0xbe05a120: c1383044 ffffffff e41a3180 ffff9e17
[263553.358158] sof-audio-pci 0000:00:1f.3: 0xbe05a124: b1712600 c11fd8dd 00000000 00000000
[263553.358163] sof-audio-pci 0000:00:1f.3: 0xbe05a128: 004f7c48 ffffaac0 e41a3180 ffff9e17
[263553.358167] sof-audio-pci 0000:00:1f.3: 0xbe05a12c: 00000000 00000000 00000000 00000000
This is on a ThinkPad X1 Carbon 7th Gen using SOF firmware 1.5.1 on Arch Linux. Here is alsa-info output:
@ilia-kats I recommend that you move those very large logs into separate files and attach them to your comment (yeah, somehow it is possible to do that). Also, I recommend you open another issue with your specific panic details, so it can be checked. I have edited your comment to keep it in check.
Yikes, another agent panic (DSP load got too high and the agent, a component that keeps the load in check, saw that the DSP was overloaded and stopped everything).
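For anyone trying to spot this kind of panic in a long kernel log, a filter along the following lines may help. It is only a sketch: the function name sof_panic_summary is made up for illustration, and the patterns are taken from the log lines quoted earlier in this thread, so other firmware or kernel versions may word these messages differently.

```shell
# Hypothetical helper: keep only the SOF panic summary lines from kernel
# log text read on stdin. Patterns are copied from the log quoted in this
# thread and are an assumption for other versions.
sof_panic_summary() {
    grep -E 'sof-audio-pci .*(DSP panic|fw entered|panic at|DSP Firmware Oops)'
}

# Usage (reading the kernel log may require root):
#   sudo dmesg | sof_panic_summary
```

The "panic at" line is the most useful one to quote in a report, since it names the firmware source file and line (src/lib/agent.c:62 in the log above).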
I've updated with the result of alsa-info. However, I'm more concerned about knowing whether recovery is possible than about fixing any single issue. This issue, however, also happens on the X1 Carbon 7th Gen.
@EvanCarroll In theory if you do a suspend operation the DSP context is lost and the firmware re-downloaded. We do not have a recovery in place at the moment, it's been an ask for ages but we never got to it. https://github.com/thesofproject/linux/issues/452
@tlauda Are you aware of any DSP clock speed woes on KBL? Maybe that's why the agent is crying for these two and causing panics...
@EvanCarroll @ilia-kats I'd like some kernel boot logs (dmesg) from both of you to identify the exact system, topology, firmware version etc. It doesn't need to be the very log from the crashes (although it's always better), but they must have the same version (no updates or anything) as with the crash so we can identify it. There are several already known issues with the agent in older versions of the firmware (as far as I know, they've been patched in the current development version) and I'd like to know that it isn't one of those that you're hitting.
I'll provide that next time it crashes. I can't guarantee my firmware hasn't already updated; I use fwupdagent.
Sure thing. Maybe the update already fixed the issue though. But if it didn't, you're welcome to post all the logs so that I (or, more importantly, the devs that know your platform specifically; I don't know more than some generalities about the Intel platforms) can look into it and identify the reason for the crash. Again, with exact information we can provide a proper solution or workaround. Without it, all we can say is "try doing as root rmmod snd-soc-sof; modprobe snd-soc-sof, and if that fails restart the machine".
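The reload workaround quoted above could be wrapped roughly as follows. This is a sketch, not a tested recovery procedure: the module name snd-soc-sof is copied from the comment and may not match the modules on your kernel (check with lsmod | grep sof), rmmod can fail if other modules still depend on it, and the DRY_RUN switch and function names are hypothetical additions for illustration.

```shell
# Module name copied from the comment above; this is an assumption and may
# differ on your system (check `lsmod | grep sof`).
SOF_MODULE="${SOF_MODULE:-snd-soc-sof}"

# Hypothetical helper: with DRY_RUN=1, print the command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Remove and reinsert the module; modprobe only runs if rmmod succeeded.
reload_sof() {
    run rmmod "$SOF_MODULE" && run modprobe "$SOF_MODULE"
}

# Usage (as root):
#   reload_sof
# Preview the commands without touching the system:
#   DRY_RUN=1 reload_sof
```

If the reload fails or sound does not come back, the fallback from the comment still applies: restart the machine.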
It just did it again. I was watching a movie and xscreensaver kicked in.
I will update the topic with the logs.
The only thing that is half-suspect is the pair of warnings "FW ABI is more recent than kernel" and "topology ABI is more recent than kernel" (kernel is 3:13:0 while FW/topology are 3:16:0). See if you can somehow update the SOF kernel module (if you have a module) or the kernel itself (if it's built in). Not sure if it will actually help but I'd say it's worth a shot.
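To illustrate the mismatch being described, here is a small sketch that compares two ABI strings field by field. It assumes the major:minor:patch layout suggested by the numbers quoted above; abi_is_newer is a made-up helper and not the check the kernel driver actually performs.

```shell
# Hypothetical helper: succeed (exit 0) if ABI version $1 is strictly newer
# than $2. Assumes the "major:minor:patch" form, e.g. 3:16:0.
abi_is_newer() {
    # Split each version on ':' into three numeric fields.
    IFS=: read -r a1 a2 a3 <<EOF
$1
EOF
    IFS=: read -r b1 b2 b3 <<EOF
$2
EOF
    # Compare major, then minor, then patch.
    [ "$a1" -gt "$b1" ] && return 0
    [ "$a1" -lt "$b1" ] && return 1
    [ "$a2" -gt "$b2" ] && return 0
    [ "$a2" -lt "$b2" ] && return 1
    [ "$a3" -gt "$b3" ]
}

# Usage with the versions from this thread (firmware vs. kernel):
#   abi_is_newer 3:16:0 3:13:0 && echo "FW ABI is newer than the kernel ABI"
```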
The FW should no longer panic like that in the upcoming 1.6 release.
Closing the bug now. A recovery solution for DSP panics is a big topic and will not be tracked in this bug.
Describe the bug User gets a DSP panic; it is not clear how to recover.
To Reproduce Happens sporadically
Reproduction Rate About once in an hour of use.
Expected behavior A clear and concise description of what you expected to happen.
Impact What impact does this issue have on your progress (e.g., annoyance, showstopper)
I am getting a DSP panic, then sound stops working and I can't get it back without restarting. Is there any way to recover from this?
ALSA-INFO output: https://alsa-project.org/db/?f=d6ec2e739dcaa79603877df308a5912269d14995
Output of
sudo dmesg | egrep 'sof|audio'