thesofproject / linux

Linux kernel source tree
Other
91 stars 131 forks source link

[stable-v2.2] ADLP_RVP_NOCODEC has ipc error, reply size (44) exceeds the buffer size #4069

Open fredoh9 opened 1 year ago

fredoh9 commented 1 year ago

Found one error during testing simultaneous-playback-capture-25 with stable-v2.2

Internal link: 18319?model=ADLP_RVP_NOCODEC&testcase=simultaneous-playback-capture-25

[ 4389.260800] kernel: snd_sof:sof_widget_setup: sof-audio-pci-intel-tgl 0000:00:1f.3: widget SSP1.IN setup complete
[ 4389.260806] kernel: snd_sof:ipc3_log_header: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x30200000: GLB_TPLG_MSG: BUFFER_NEW
[ 4389.260840] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: reply size (44) exceeds the buffer size (20)
[ 4389.261017] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx error for 0x30200000 (msg/reply size: 44/20): -22
[ 4389.261033] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: Failed to setup widget BUF4.1
[ 4389.261046] kernel: snd_sof:ipc3_log_header: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x30020000: GLB_TPLG_MSG: COMP_FREE
[ 4389.763370] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx timed out for 0x30020000 (msg/reply size: 12/12)
[ 4389.763453] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: preventing DSP entering D3 state to preserve context
[ 4389.763463] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ IPC dump start ]------------
[ 4389.763535] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: hda irq intsts 0x00000000 intlctl 0xc0000081 rirb 00
[ 4389.763588] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: dsp irq ppsts 0x00000000 adspis 0x00000000
[ 4389.763655] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: error: host status 0x00000000 dsp status 0x00000000 mask 0x00000003
[ 4389.763711] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ IPC dump end ]------------
[ 4389.763755] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump start ]------------
[ 4389.763799] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: IPC timeout
[ 4389.763832] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: fw_state: SOF_FW_BOOT_COMPLETE (6)
[ 4389.763901] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: 0x00000005: module: ROM, state: FW_ENTERED, running
[ 4389.764426] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: unexpected fault 0x00000000 trace 0x00004000
[ 4389.764481] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ------------[ DSP dump end ]------------
[ 4389.764535] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: failed to free widget SSP1.IN
[ 4389.764597] kernel: snd_sof:ipc3_log_header: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x30110000: GLB_TPLG_MSG: PIPE_FREE
[ 4389.764651] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: trace IO error

etrace has poll timeout and xrun.

[   525588711.510814] (    10078011.000000) c0 dma-trace                    src/lib/wait.c:53   ERROR poll timeout reg 488520 mask 63 val 0 us 937
[   536355319.520488] (    10766608.000000) c2 dma-trace                    src/lib/wait.c:53   ERROR poll timeout reg 487496 mask 63 val 0 us 937
[   547446828.975584] (    11091509.000000) c3 dma-trace              src/drivers/dw/dma.c:1128 ERROR dw_dma_get_data_size(): xrun detected
[   547446851.788083] (          22.812500) c3 dma-trace                   src/audio/dai.c:881  ERROR dai_report_xrun(): underrun due to no data available
[   547446874.496416] (          22.708332) c3 dma-trace                   src/audio/dai.c:775  ERROR comp_underrun(): dev->comp.id = 14, source->avail = 0, copy_bytes = 0
[   547446941.527663] (          67.031250) c3 dma-trace          ....../pipeline-stream.c:143  ERROR pipeline_copy(): ret = -61, start->comp.id = 14, dir = 1
[   547446964.913079] (          23.385416) c3 dma-trace          ..../pipeline-schedule.c:187  ERROR pipeline_task(): xrun recovery failed! pipeline is stopped.

Attached logs:

etrace-adl-nocodec.txt dmesg-adl-nocodec.txt slogger-adl-nocodec.txt

plbossart commented 1 year ago

@ujfalusi any idea of what might have happened here? If I understand correctly for regular IPC we always allocate a sizeof(reply) so not sure what causes the firmware to report more than is provided.

It could also be an error at the firmware level leading to a corrupted reply size.

ujfalusi commented 1 year ago

And the size of struct sof_ipc_reply is 12 (2x u32 + 1x i32) ? And to add: the firmware should reply with:

    struct sof_ipc_comp_reply reply = {
        .rhdr.hdr = {
            .cmd = header,
            .size = sizeof(reply),
        },
    };
...
    /* write component values to the outbox */
    mailbox_hostbox_write(0, &reply, sizeof(reply));

to the buffer_new. Something is not adding up for sure, this can not be. The only explanation is that the kernel is corrupting a prepared IPC message template and that is sent out, but still, we log it as BUFFER_NEW, so the reply must be 12...