thesofproject / linux

Linux kernel source tree

Merge/sound upstream 20241009 #5204

Closed bardliao closed 1 month ago

bardliao commented 1 month ago

@vijendarmukunda Could you double check the AMD part? I fixed some conflicts. Not sure if I did it right.

vijendarmukunda commented 1 month ago

@bardliao : could you please point me to the commits in which you resolved the merge conflicts?

bardliao commented 1 month ago

> @bardliao : could you please point me to the commits in which you resolved the merge conflicts?

It is https://github.com/bardliao/linux/commit/2ffd47d69b9c425b7781f8d7139fc3a4bbd8a841. The conflict is in sound/soc/amd/acp/acp-sdw-sof-mach.c.

bardliao commented 1 month ago

SOFCI TEST

ujfalusi commented 1 month ago

stable-2.2 has a regression:

platform cml_rt5682_def: deferred probe pending: (reason unknown)
platform glk_da7219_def: deferred probe pending: (reason unknown)

and a new (to me) failure on LNL:

[  853.302082] kernel: snd_sof:sof_pcm_trigger: sof-audio-pci-intel-lnl 0000:00:1f.3: pcm: trigger stream 7 dir 0 cmd 1
[  853.302087] kernel: snd_sof:sof_ipc4_trigger_pipelines: sof-audio-pci-intel-lnl 0000:00:1f.3: trigger cmd: 1 state: 4
[  853.302092] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx      : 0xe070001|0x180: GLB_CHAIN_DMA
[  853.303958] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc rx      : 0x1b0a0000|0x0: GLB_NOTIFICATION|EXCEPTION_CAUGHT
[  853.303967] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ------------[ DSP dump start ]------------
[  853.304053] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: DSP panic!
[  853.304092] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (7)
[  853.304144] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: 0x50000005: module: ROM_EXT, state: FW_ENTERED, running
[  853.304208] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: Firmware state: 0x0, status/error code: 0x0
[  853.304272] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: Core dump is not available due to invalid separator 0xc0de
[  853.304331] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ------------[ DSP dump end ]------------
[  853.304384] kernel: snd_sof:sof_set_fw_state: sof-audio-pci-intel-lnl 0000:00:1f.3: fw_state change: 7 -> 8
[  853.304405] kernel: snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc rx done : 0x1b0a0000|0x0: GLB_NOTIFICATION|EXCEPTION_CAUGHT
[  853.806657] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc timed out for 0xe070001|0x180
[  853.806749] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: Attempting to prevent DSP from entering D3 state to preserve context
[  853.806759] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ------------[ IPC dump start ]------------
[  853.806819] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: Host IPC initiator: 0x8e070001|0x180|0x0, target: 0x1b0a0000|0x0|0x0, ctl: 0x3
[  853.806890] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ------------[ IPC dump end ]------------
[  853.806939] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: IPC timeout
[  853.806997] kernel: sof-audio-pci-intel-lnl 0000:00:1f.3: ASoC: error at soc_component_trigger on 0000:00:1f.3: -110
[  853.807065] kernel:  HDMI3: ASoC: trigger FE cmd: 1 failed: -110
[  853.302056]  dma: dma_get: dma_get() ID 0 sref = 2 busy channels 0
[  853.302070]  dma: dma_get: dma_get() ID 0 sref = 2 busy channels 0
[  853.302090]  chain_dma: chain_init: comp:0 0x0 chain_init(): dma_request_channel() failed
[  853.302098]  chain_dma: chain_task_start: comp:129 0x81 chain_task_start(), host_dma_id = 0x00000001
[  853.302103]  os: print_fatal_exception:  ** FATAL EXCEPTION
[  853.302110]  os: print_fatal_exception:  ** CPU 0 EXCCAUSE 13 (load/store PIF data error)
[  853.302115]  os: print_fatal_exception:  **  PC 0xa0079eb5 VADDR (nil)
[  853.302118]  os: print_fatal_exception:  **  PS 0x60720
[  853.302121]  os: print_fatal_exception:  **    (INTLEVEL:0 EXCM: 0 UM:1 RING:0 WOE:1 OWB:7 CALLINC:2)
[  853.302125]  os: xtensa_dump_stack:  **  A0 0xa0052325  SP 0xa010b820  A2 (nil)  A3 0x4011cc80
[  853.302130]  os: xtensa_dump_stack:  **  A4 0xa011cd40  A5 0x18  A6 0x401111a0  A7 0xa010b820
[  853.302133]  os: xtensa_dump_stack:  **  A8 0xa0062ab5  A9 0xa010b7e0 A10 0x401111a0 A11 0xa007bfd0
[  853.302136]  os: xtensa_dump_stack:  ** A12 0xa0062cd8 A13 0x1 A14 0xa A15 0xa010b760
[  853.302140]  os: xtensa_dump_stack:  ** LBEG 0xa0037405 LEND 0xa0037414 LCOUNT 0xa00626cb
[  853.302143]  os: xtensa_dump_stack:  ** SAR 0x1d
[  853.302146]  os: xtensa_dump_stack:  **  THREADPTR (nil)

ujfalusi commented 1 month ago

@bardliao, what happens is: after the last iteration on PCM 6, the test executes kill -9 "$pid" (and does not wait for the termination) and moves on to the next PCM (7). For some reason the kill does not take effect right away, so PCM7 is started (with host/link DMA id 1) before the stop for PCM6 arrives. That stop places the ChainDMA for host/link 0 in PAUSED. Then we stop PCM7, and its ChainDMA (host/link 1) goes to PAUSED and then RESET, but PCM6 (host/link 0) is not moved to RESET ???? When we then start PCM7 (host/link 1), the firmware crashes.

I don't see anything like this happening with other PRs...
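
The race described above starts with sending SIGKILL and moving on without waiting for the process to exit. A minimal sketch of the kill-then-wait pattern follows, in Python rather than the test's shell. The `stop_and_reap` helper and the sleeping stand-in process are hypothetical and are not taken from the test suite.

```python
import subprocess
import sys


def stop_and_reap(proc: subprocess.Popen, timeout: float = 5.0) -> int:
    """Kill the process and block until it has really exited."""
    proc.kill()  # on POSIX this sends SIGKILL, like `kill -9 "$pid"`
    # Do not return (and do not start the next PCM) until the process is gone.
    return proc.wait(timeout=timeout)


# Hypothetical stand-in for the playback process on PCM 6: it only sleeps.
player = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"]
)

exit_code = stop_and_reap(player)

# Only at this point would the test move on to the next PCM.
print("player exited with", exit_code)
```

In a shell script the same effect would come from a `wait "$pid"` after the `kill -9 "$pid"`, provided the player is a child of that shell. This only removes the overlap on the test side; it does not explain why the firmware leaves host/link 0 in PAUSED.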

ujfalusi commented 1 month ago

Logging the test result before re-triggering the test: 46626

ujfalusi commented 1 month ago

SOFCI TEST

bardliao commented 1 month ago

@ujfalusi Is it possible to bisect it? Can the issue be reproduced with linux-next kernel?

ujfalusi commented 1 month ago

I think the i2c bus is not probing, and thus the two Chromebooks are without an audio card because the codec is not probed.

ujfalusi commented 1 month ago

> @ujfalusi Is it possible to bisect it? Can the issue be reproduced with linux-next kernel?

I would go with mainline first, it should work fine on LNL...

bardliao commented 1 month ago

I checked the stable-v2.2 issue on ubuntu@jf-cml-hel-rt5682-05, the same device as the CI test, and see the error below.

[    5.350869] sof-audio-pci-intel-cnl 0000:00:1f.3: ------------[ DSP dump start ]------------
[    5.350913] sof-audio-pci-intel-cnl 0000:00:1f.3: Firmware boot failure due to timeout
[    5.350933] sof-audio-pci-intel-cnl 0000:00:1f.3: fw_state: SOF_FW_BOOT_IN_PROGRESS (3)
[    5.350957] sof-audio-pci-intel-cnl 0000:00:1f.3: 0x80000005: module: ROM, state: FW_ENTERED, not running
[    5.350981] sof-audio-pci-intel-cnl 0000:00:1f.3: status code: 0xbeef0000 (error: user exception)
[    5.351062] sof-audio-pci-intel-cnl 0000:00:1f.3: invalid header size 0x1010e0e. FW oops is bogus
[    5.351098] sof-audio-pci-intel-cnl 0000:00:1f.3: unexpected fault 0xbeef0000 trace 0x00000220
[    5.351119] sof-audio-pci-intel-cnl 0000:00:1f.3: ------------[ DSP dump end ]------------
[    5.351139] sof-audio-pci-intel-cnl 0000:00:1f.3: error: failed to boot DSP firmware -5

That is because an incorrect sof-cml.ri is used. The md5sum of the incorrect sof-cml.ri is dee17a3c329e560c08f104f0e35a59f6, and the file was updated on Oct 12th. Not sure what happened. After using the sof-cml.ri from jf-cml-hel-rt5682-01, the issue is gone.
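
A check like the one above can be scripted so that a device with the bad firmware file is spotted before a test run. This is only a sketch: the checksum is the one quoted in the comment, while `md5_of`, `is_known_bad` and the throwaway demo file are hypothetical names and data. The real location of sof-cml.ri on the device is not given here, so no path is assumed.

```python
import hashlib
import os
import tempfile

# Checksum of the incorrect sof-cml.ri, as quoted in the comment above.
KNOWN_BAD_MD5 = "dee17a3c329e560c08f104f0e35a59f6"


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_known_bad(path: str) -> bool:
    """True if the file matches the checksum of the incorrect firmware."""
    return md5_of(path) == KNOWN_BAD_MD5


# Demo on a throwaway file, since the real firmware path is not known here.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"not a real firmware image")
    demo_path = tmp.name

demo_md5 = md5_of(demo_path)
demo_is_bad = is_known_bad(demo_path)
os.unlink(demo_path)

print(demo_md5, demo_is_bad)
```

On a test device, `is_known_bad()` would be called with the path of the installed sof-cml.ri instead of the demo file.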

bardliao commented 1 month ago

There is another issue on the stable-v2.2 test. snd_sof_load_topology is not called by the 6.12-rc2 kernel. The logs below are seen with the 6.11-rc6 kernel but not with the 6.12-rc2 kernel.

snd_sof:snd_sof_load_topology: sof-audio-pci-intel-apl 0000:00:0e.0: loading topology:intel/sof-tplg/sof-glk-da7219.tplg
snd_sof:snd_sof_load_topology: sof-audio-pci-intel-cnl 0000:00:1f.3: loading topology:intel/sof-tplg/sof-cml-rt1011-rt5682.tplg

I will look into it.

bardliao commented 1 month ago

> I think the i2c bus is not probing, and thus the two Chromebooks are without an audio card because the codec is not probed.

You are right. There is no i2c-10EC5682:00 when I check ls /sys/bus/i2c/devices/ with the 6.12-rc2 kernel.

ls /sys/bus/i2c/devices/ on 6.11-rc6 kernel:
i2c-0 i2c-1 i2c-10 i2c-10EC1011:00 i2c-10EC1011:01 i2c-10EC1011:02 i2c-10EC1011:03 i2c-10EC5682:00 i2c-2 i2c-3 i2c-4 i2c-5 i2c-6 i2c-7 i2c-8 i2c-9 i2c-ELAN0000:00 i2c-GDIX0000:00

ls /sys/bus/i2c/devices/ on 6.12-rc2 kernel:
i2c-0 i2c-1 i2c-2 i2c-3 i2c-4 i2c-5 i2c-6 i2c-7
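
The two listings can also be compared mechanically. The sketch below uses the device names exactly as pasted in this comment and takes a set difference; the variable names are only illustrative.

```python
# Output of `ls /sys/bus/i2c/devices/` as pasted in the comment above.
devices_6_11_rc6 = (
    "i2c-0 i2c-1 i2c-10 i2c-10EC1011:00 i2c-10EC1011:01 i2c-10EC1011:02 "
    "i2c-10EC1011:03 i2c-10EC5682:00 i2c-2 i2c-3 i2c-4 i2c-5 i2c-6 i2c-7 "
    "i2c-8 i2c-9 i2c-ELAN0000:00 i2c-GDIX0000:00"
).split()

devices_6_12_rc2 = "i2c-0 i2c-1 i2c-2 i2c-3 i2c-4 i2c-5 i2c-6 i2c-7".split()

# Entries present with 6.11-rc6 but gone with 6.12-rc2.
missing = sorted(set(devices_6_11_rc6) - set(devices_6_12_rc2))

for name in missing:
    print(name)
```

Besides the rt5682 and rt1011 codec entries, the difference also contains the adapters i2c-8, i2c-9 and i2c-10, so whole buses are missing with 6.12-rc2 and not just the codec devices.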

bardliao commented 1 month ago

If I test with https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git, the issue happens with the next-20240919 tag and not with the next-20240918 tag. So the bad commit should be between next-20240918 and next-20240919.

ujfalusi commented 1 month ago

It is i2c_designware that is not probing.

ujfalusi commented 1 month ago

@bardliao, the i2c issue will be fixed by: https://github.com/thesofproject/kconfig/pull/101

bardliao commented 1 month ago

> @bardliao, the i2c issue will be fixed by: thesofproject/kconfig#101

Thanks @ujfalusi I just found the same. haha.

ujfalusi commented 1 month ago

SOFCI TEST

ujfalusi commented 1 month ago

Let's see with the updated sof-kconfig...

bardliao commented 1 month ago

Test result looks good to me. Let's merge.