thesofproject / linux

Linux kernel source tree
Other
91 stars 131 forks source link

[SoundWire] intel_resume: MCP_CONTROL_HW_RST not cleared at iteration (dev_err logs) #3012

Open XiaoyunWu6666 opened 3 years ago

XiaoyunWu6666 commented 3 years ago

description sdw_cdns_clock_stop failed: MCP_CONTROL_HW_RST is not cleared when check-suspend-resume-50 on CML_SKU0983_SDW happen in inner daily 4778 model=CML_SKU0983_SDW testcase=check-suspend-resume-50 [console log]

[ 4876.595463] kernel: soundwire_intel soundwire_intel.link.0: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 0
……
[ 4876.608536] kernel: soundwire_intel soundwire_intel.link.0: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 9
[ 4876.610117] kernel: soundwire_intel soundwire_intel.link.0: intel_resume failed: MCP_CONTROL_HW_RST is not cleared
[ 4880.767350] kernel: soundwire_intel soundwire_intel.link.0: sdw_cdns_clock_stop failed: MCP_CONTROL_HW_RST is not cleared

This is not a new issue, the dev_err logs above were added on purpose to help root-cause https://github.com/thesofproject/linux/issues/2606 which has occurred randomly for the last 6 months, so far only on CML_RVP.

EDIT: later seen on TGL too.

fredoh9 commented 3 years ago

Weekly stress test(test result 4894) found same problem with CML_RVP_SDW.

===========================>>
[ 9449.385740] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 0
[ 9449.387130] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 1
[ 9449.388635] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 2
[ 9449.390185] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 3
[ 9449.391714] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 4
[ 9449.393157] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 5
[ 9449.394707] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 6
[ 9449.396156] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 7
[ 9449.397253] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 8
[ 9449.398875] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 9
[ 9449.400241] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET failed: MCP_CONTROL_HW_RST is not cleared
<<===========================

dmesg_test_4894_cml.txt

plbossart commented 3 years ago

same warnings seen on Intel daily test 5356 (started on July 19).

The next daily tests after that will rely on https://github.com/thesofproject/linux/pull/3006 which was merged on July 20.

If we see this error again then my theory of a clock stop issue was wrong... Let's see!

plbossart commented 3 years ago

Seems my theory was wrong, this error still happens with the latest kernel, however this is the first time I managed to see it on Dell SKU 09c6

dmesg.txt

plbossart commented 3 years ago

Seen by @bardliao (c.f. https://github.com/thesofproject/linux/pull/3098#issuecomment-902559558)

plbossart commented 3 years ago

Issue seen in https://sof-ci.sh.intel.com/#/result/planresultdetail/7312?model=CML_SKU0983_SDW&testcase=multiple-pause-resume-25

plbossart commented 2 years ago

Observed this issue again when running suspend/resume stress testing on CML_SKU0983_SDW. Test ID:8054.

[ 1592.161340] kernel: rt1308 sdw:1:025d:1308:00: sdw_modify_slave_status: signaling enumeration completion for Slave 1
[ 1592.292468] kernel: rt1308 sdw:1:025d:1308:00: rt1308_io_init m_btl_l=0xffe2, m_btl_r=0x8
[ 1592.292481] kernel: rt1308 sdw:1:025d:1308:00: rt1308_io_init c_btl_l=0x2ef, c_btl_r=0x10
[ 1592.301784] kernel: rt1308 sdw:1:025d:1308:00: rt1308_io_init hw_init complete
[ 1592.301790] kernel: rt1308 sdw:1:025d:1308:00: sdw_handle_slave_status: signaling initialization completion for Slave 1
[ 1592.314783] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 0
[ 1592.316316] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 1
[ 1592.317774] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 2
[ 1592.319304] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 3
[ 1592.320779] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 4
[ 1592.322308] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 5
[ 1592.323777] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 6
[ 1592.325361] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 7
[ 1592.326844] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 8
[ 1592.328468] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 9
[ 1592.329841] kernel: soundwire_intel soundwire_intel.link.2: intel_resume failed: MCP_CONTROL_HW_RST is not cleared
marc-hb commented 2 years ago

https://sof-ci.01.org/sofpr/PR5291/build11860/devicetest/?model=CML_RVP_SDW&testcase=volume-basic-test

Linux: 8c3ae47e ASoC: Intel: TGL: add RT1308 I2S machine driver and HDMI-in SOF: 40d2f6688367 codec_adapter: cadence: demote error message to warning when return code is not fatal

[  909.886263] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: tplg: complete pipeline PIPELINE.2.ALH0x103.IN id 11
[  909.886266] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
[  909.886376] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
[  909.886377] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: tplg: complete pipeline PIPELINE.1.ALH0x102.OUT id 3
[  909.886381] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
[  909.886508] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
[  909.886514] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x40020000: GLB_PM_MSG: CTX_RESTORE
[  909.886599] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x40020000: GLB_PM_MSG: CTX_RESTORE
[  909.886605] kernel: soundwire_intel soundwire_intel.link.1: intel_link_power_up: powering up all links
[  909.886606] kernel: soundwire_intel soundwire_intel.link.1: intel_link_power_up: first link up, programming SYNCPRD
[  909.886902] kernel: rt700 sdw:1:025d:0700:00: sdw_modify_slave_status: initializing enumeration and init completion for Slave 1
[  909.888903] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 0
[  909.890416] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 1
[  909.891905] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 2
[  909.893417] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 3
[  909.894905] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 4
[  909.896423] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 5
[  909.897955] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 6
[  909.899512] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 7
[  909.900954] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 8
[  909.902514] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 9
[  909.903951] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET failed: MCP_CONTROL_HW_RST is not cleared
[  909.906197] kernel: soundwire_intel soundwire_intel.link.1: Slave status change: 0x2
[  909.906202] kernel: soundwire sdw-master-0: Slave attached, programming device number
[  909.906442] kernel: soundwire sdw-master-0: SDW Slave Addr: 10025d070000
[  909.906445] kernel: soundwire sdw-master-0: SDW Slave class_id 0x00, mfg_id 0x025d, part_id 0x0700, unique_id 0x0, version 0x1
[  909.906447] kernel: soundwire sdw-master-0: Slave already registered, reusing dev_num:1
[  909.906707] kernel: soundwire_intel soundwire_intel.link.1: Msg ignored for Slave 0
[  909.906708] kernel: soundwire sdw-master-0: No more devices to enumerate
[  909.906745] kernel: soundwire_intel soundwire_intel.link.1: Slave status change: 0x21
[  909.906748] kernel: rt700 sdw:1:025d:0700:00: sdw_modify_slave_status: signaling enumeration completion for Slave 1
[  909.907092] kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] ff01 <= 0000
[  909.907237] kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 <= 001a
[  909.907383] kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 7420 84a0 <= c003
[  909.907466] kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3501 <= 0000
[  909.907550] kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3721 <= 0040
plbossart commented 2 years ago

Not sure how we are going to diagnose this further, this problem seems to happen only on CML and only on a monthly/bimonthly basis - and it's not critical anyways.

marc-hb commented 2 years ago

Reproduction of the month has arrived https://sof-ci.01.org/sofpr/PR4334/build12063/devicetest/?model=CML_RVP_SDW&testcase=check-suspend-resume-with-playback-5

Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: tplg: complete pipeline PIPELINE.2.ALH0x103.IN id 11
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: tplg: complete pipeline PIPELINE.1.ALH0x102.OUT id 3
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x40020000: GLB_PM_MSG: CTX_RESTORE
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x40020000: GLB_PM_MSG: CTX_RESTORE
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.0: intel_resume: pm_runtime status was suspended, forcing active
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.0: intel_link_power_up: powering up all links
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.0: intel_link_power_up: first link up, programming SYNCPRD
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: sdw_modify_slave_status: initializing enumeration and init completion for Slave 1
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 0
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 1
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 2
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 3
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 4
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 5
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 6
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 7
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 8
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 9
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume failed: MCP_CONTROL_HW_RST is not cleared
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: Slave status change: 0x2
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire sdw-master-1: Slave attached, programming device number
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire sdw-master-1: SDW Slave Addr: 10025d070000
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire sdw-master-1: SDW Slave class_id 0x00, mfg_id 0x025d, part_id 0x0700, unique_id 0x0, version 0x1
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire sdw-master-1: Slave already registered, reusing dev_num:1
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: Msg ignored for Slave 0
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire sdw-master-1: No more devices to enumerate
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: Slave status change: 0x21
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: sdw_modify_slave_status: signaling enumeration completion for Slave 1
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] ff01 <= 0000
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 <= 001a
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 7420 84a0 <= c003
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3501 <= 0000
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3721 <= 0040
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3714 <= 0040
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3c14 <= 0002
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3712 <= 0020
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3713 <= 0020
marc-hb commented 2 years ago

Again in https://sof-ci.01.org/sofpr/PR5379/build12076/devicetest/?model=TGLU_RVP_SDW&testcase=check-suspend-resume-with-playback-5 , this time on TGL. Is this becoming more frequent?

EDIT again https://sof-ci.01.org/sofpr/PR5485/build12238/devicetest

[ 1665.018445] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: tplg: complete pipeline PIPELINE.1.ALH2.OUT id 3
[ 1665.018454] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
[ 1665.018648] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx succeeded: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
[ 1665.018662] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x40020000: GLB_PM_MSG: CTX_RESTORE
[ 1665.018800] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx succeeded: 0x40020000: GLB_PM_MSG: CTX_RESTORE
[ 1665.018824] kernel: soundwire_intel soundwire_intel.link.0: intel_resume: pm_runtime status was suspended, forcing active
[ 1665.018830] kernel: soundwire_intel soundwire_intel.link.0: intel_link_power_up: powering up all links
[ 1665.018831] kernel: soundwire_intel soundwire_intel.link.0: intel_link_power_up: first link up, programming SYNCPRD
[ 1665.019227] kernel: rt711 sdw:0:025d:0711:00: sdw_modify_slave_status: initializing enumeration and init completion for Slave 1
[ 1665.020112] kernel: soundwire_intel soundwire_intel.link.0: Slave status change: 0x2
[ 1665.020115] kernel: soundwire sdw-master-0: Slave attached, programming device number
[ 1665.020336] kernel: soundwire sdw-master-0: SDW Slave Addr: 20025d071100
[ 1665.020337] kernel: soundwire sdw-master-0: SDW Slave class_id 0x00, mfg_id 0x025d, part_id 0x0711, unique_id 0x0, version 0x2
[ 1665.020339] kernel: soundwire sdw-master-0: Slave already registered, reusing dev_num:1
[ 1665.020648] kernel: soundwire_intel soundwire_intel.link.0: Msg ignored for Slave 0
[ 1665.020649] kernel: soundwire sdw-master-0: No more devices to enumerate
[ 1665.020696] kernel: soundwire_intel soundwire_intel.link.0: Slave status change: 0x21
[ 1665.020715] kernel: rt711 sdw:0:025d:0711:00: sdw_modify_slave_status: signaling enumeration completion for Slave 1
[ 1665.020723] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: pm_runtime status was suspended, forcing active
[ 1665.020986] kernel: rt1308 sdw:1:025d:1308:00:2: sdw_modify_slave_status: initializing enumeration and init completion for Slave 1
[ 1665.020989] kernel: rt1308 sdw:1:025d:1308:00:0: sdw_modify_slave_status: initializing enumeration and init completion for Slave 2
[ 1665.021043] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] ff01 <= 0000
[ 1665.021745] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7520 85a0 9c20 aca0 => 00008003
[ 1665.021835] kernel: soundwire_intel soundwire_intel.link.1: Slave status change: 0x2
[ 1665.021837] kernel: soundwire sdw-master-1: Slave attached, programming device number
[ 1665.022046] kernel: soundwire sdw-master-1: SDW Slave Addr: 22025d130800
[ 1665.022047] kernel: soundwire sdw-master-1: SDW Slave class_id 0x00, mfg_id 0x025d, part_id 0x1308, unique_id 0x2, version 0x2
[ 1665.022048] kernel: soundwire sdw-master-1: Slave already registered, reusing dev_num:1
[ 1665.022168] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7520 85a0 7420 84a0 <= 001a c003
[ 1665.022268] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 3501 <= 0000
[ 1665.022359] kernel: soundwire sdw-master-1: SDW Slave Addr: 20025d130800
[ 1665.022359] kernel: soundwire sdw-master-1: SDW Slave class_id 0x00, mfg_id 0x025d, part_id 0x1308, unique_id 0x0, version 0x2
[ 1665.022361] kernel: soundwire sdw-master-1: Slave already registered, reusing dev_num:2
[ 1665.022363] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 3719 <= 0025
[ 1665.022480] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 3721 <= 00c0
[ 1665.022580] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 3712 <= 0020
[ 1665.022651] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 0
[ 1665.022672] kernel: soundwire_intel soundwire_intel.link.1: Msg ignored for Slave 0
[ 1665.022673] kernel: soundwire sdw-master-1: No more devices to enumerate
[ 1665.022697] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 3713 <= 0020
[ 1665.022760] kernel: soundwire_intel soundwire_intel.link.1: Slave status change: 0x221
[ 1665.022763] kernel: rt1308 sdw:1:025d:1308:00:2: sdw_modify_slave_status: signaling enumeration completion for Slave 1
[ 1665.022766] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 371a <= 0020
[ 1665.022876] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 371b <= 0020
[ 1665.023087] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7321 83a1 <= a080
[ 1665.023294] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7321 83a1 <= 9080
[ 1665.023466] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7308 8388 <= 6080
[ 1665.023669] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7308 8388 <= 5080
[ 1665.023716] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 1
[ 1665.023835] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7309 8389 <= 6080
[ 1665.024045] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7309 8389 <= 5080
[ 1665.024136] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4f12 <= 0091
[ 1665.024259] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4e12 <= 00d6
[ 1665.024389] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4d12 <= 0011
[ 1665.024508] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4c12 <= 0020
[ 1665.024592] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4f13 <= 0091
[ 1665.024687] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4e13 <= 00d6
[ 1665.024796] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4d13 <= 0011
[ 1665.024895] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4c13 <= 0021
[ 1665.025002] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4c21 <= 00f0
[ 1665.025107] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4d21 <= 0011
[ 1665.025206] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4e21 <= 0011
[ 1665.025241] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 2
[ 1665.025273] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4f21 <= 0001
[ 1665.025686] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7520 85a0 7420 84a0 <= 0091 0154
[ 1665.026106] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7520 85a0 7420 84a0 <= 004a 201b
[ 1665.026519] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7520 85a0 7420 84a0 <= 0045 5089
[ 1665.026704] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 3
[ 1665.026908] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7520 85a0 7420 84a0 <= 006b 5064
[ 1665.027311] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7520 85a0 7420 84a0 <= 0048 d249
[ 1665.027421] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 3501 <= 0003
[ 1665.027520] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 3501 <= 0000
[ 1665.028213] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 4
[ 1665.028353] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7520 85a0 9c20 aca0 => 0000058b
[ 1665.028769] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7520 85a0 7420 84a0 <= 006f 0580
[ 1665.029567] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 00007888
[ 1665.029706] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 5
[ 1665.029956] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7558 85d8 7458 84d8 <= 0000 f888
[ 1665.030769] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.031227] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 6
[ 1665.032717] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 7
[ 1665.034230] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 8
[ 1665.035713] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 9
[ 1665.037238] kernel: soundwire_intel soundwire_intel.link.1: intel_resume failed: MCP_CONTROL_HW_RST is not cleared
[ 1665.042413] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.054811] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.067418] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.080055] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.092605] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.105189] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.117837] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.130294] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.142719] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.151736] kernel: rt1308 sdw:1:025d:1308:00:2: rt1308_io_init m_btl_l=0x4, m_btl_r=0x1b
[ 1665.151743] kernel: rt1308 sdw:1:025d:1308:00:2: rt1308_io_init c_btl_l=0x17a, c_btl_r=0xd7
[ 1665.154757] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.162056] kernel: rt1308 sdw:1:025d:1308:00:2: rt1308_io_init hw_init complete
[ 1665.162061] kernel: rt1308 sdw:1:025d:1308:00:2: sdw_handle_slave_status: signaling initialization completion for Slave 1
[ 1665.162068] kernel: rt1308 sdw:1:025d:1308:00:0: sdw_modify_slave_status: signaling enumeration completion for Slave 2
[ 1665.165720] kernel: atkbd serio0: Failed to enable keyboard on isa0060/serio0
[ 1665.166739] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.179225] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.185363] kernel: OOM killer enabled.
[ 1665.185364] kernel: Restarting tasks ... done.
[ 1665.190172] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: hda: prepare stream dir 0
[ 1665.190195] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x80010000: GLB_DAI_MSG: CONFIG
[ 1665.190463] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx succeeded: 0x80010000: GLB_DAI_MSG: CONFIG
[ 1665.190534] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: format_val=49, rate=48000, ch=2, format=10
[ 1665.190897] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm: prepare stream 8 dir 0
[ 1665.190940] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm: hw params stream 8 dir 0
[ 1665.191004] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: FW Poll Status: reg[0x160]=0x40000 successful
[ 1665.191055] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.191070] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: FW Poll Status: reg[0x160]=0x40000 successful
[ 1665.191076] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: hda_dsp_stream_setup_bdl: period_bytes:0x4000
[ 1665.191080] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: hda_dsp_stream_setup_bdl: periods:4
[ 1665.191103] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: stream_tag 1
[ 1665.191115] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x60010000: GLB_STREAM_MSG: PCM_PARAMS
[ 1665.191488] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx succeeded: 0x60010000: GLB_STREAM_MSG: PCM_PARAMS
[ 1665.191493] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm: stream dir 0, posn mailbox offset is 790604
[ 1665.191550] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: In hda_link_pcm_trigger cmd=1
[ 1665.191556] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm: trigger stream 8 dir 0 cmd 1
[ 1665.191565] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: FW Poll Status: reg[0x160]=0x2014001e successful
[ 1665.191572] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x60040000: GLB_STREAM_MSG: TRIG_START
[ 1665.192463] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx succeeded: 0x60040000: GLB_STREAM_MSG: TRIG_START
[ 1665.197312] kernel: rfkill: input handler enabled
[ 1665.198655] kernel: asix 3-3.1:1.0 enx000ec6b72c3c: Link is Down
[ 1665.199457] kernel: PM: suspend exit
[ 1665.202360] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.213196] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.224173] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.235150] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
fredoh9 commented 2 years ago

Found same issue in internal daily test in 2022/2/18. It was reproduced in suspend/resume test with CML_SKU0983_SDW. 10359?model=CML_SKU0983_SDW&testcase=check-suspend-resume-50

Sorry for late report, I was looking at one of the devices for errors then found this errors too.

plbossart commented 2 years ago

@keqiaozhang @XiaoyunWu6666 @marc-hb @greg-intel I haven't seen this report in a long time, is there a way to scan CI/test reports to see when we last say this string "MCP_CONTROL_HW_RST is not cleared"

marc-hb commented 2 years ago

The most recent was on March 28

daily 11433?model=CML_SKU0983_SDW&testcase=check-suspend-resume-50

[ 6134.053681] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 8
[ 6134.055091] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 9
[ 6134.056714] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET failed: MCP_CONTROL_HW_RST is not cleared

Start Time: 2022-03-28 21:29:00 UTC Kernel Branch: topic/sof-dev Kernel Commit: b50c58edbe75 SOF Branch: main SOF Commit: 992b370c8cca Zephyr Commit: 199048b2ed45

Complete list (I searched only *_SDW hardware and *suspend-resume* tests)

7468/TGLU_RVP_SDW/check-suspend-resume-with-capture-5/dmesg.html
7498/TGLU_RVP_SDW/check-suspend-resume-with-capture-5/dmesg.html
7652/CML_SKU0983_SDW/check-suspend-resume-50/dmesg.html
7853/CML_RVP_SDW/check-suspend-resume-5/dmesg.html
7853/CML_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
7979/CML_RVP_SDW/check-suspend-resume-5/dmesg.html
8073/CML_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
8087/CML_SKU0983_SDW/check-suspend-resume-50/dmesg.html
8270/CML_RVP_SDW/check-suspend-resume-5/dmesg.html
8428/CML_SKU0983_SDW/check-suspend-resume-50/dmesg.html
8540/CML_SKU0983_SDW/check-suspend-resume-100/dmesg.html
8553/CML_SKU0983_SDW/check-suspend-resume-with-playback-5/dmesg.html
8622/CML_RVP_SDW/check-suspend-resume-5/dmesg.html
8899/CML_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
8977/CML_SKU0983_SDW/check-suspend-resume-with-playback-25/dmesg.html
9076/CML_RVP_SDW/check-suspend-resume-with-capture-5/dmesg.html
9087/CML_SKU0983_SDW/check-suspend-resume-50/dmesg.html
9093/TGLH_SKU0A5E_SDW/check-suspend-resume-with-playback-5/dmesg.html
9129/CML_RVP_SDW/check-suspend-resume-with-playback-25/dmesg.html
9153/CML_RVP_SDW/check-suspend-resume-with-capture-5/dmesg.html
9154/ADLP_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
9538/ADLP_RVP_SDW/check-suspend-resume-with-capture-5/dmesg.html
9737/CML_RVP_SDW/check-suspend-resume-5/dmesg.html
9791/TGLU_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
9800/CML_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
9858/CML_RVP_SDW/check-suspend-resume-with-capture-5/dmesg.html
10101/CML_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
10276/CML_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
10314/TGLU_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
10359/CML_SKU0983_SDW/check-suspend-resume-50/dmesg.html
10963/CML_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
11074/CML_SKU0983_SDW/check-suspend-resume-50/dmesg.html
11433/CML_SKU0983_SDW/check-suspend-resume-50/dmesg.html
marc-hb commented 2 years ago

and it's not critical anyways.

Can this be downgraded to a warning?

plbossart commented 2 years ago

and it's not critical anyways.

Can this be downgraded to a warning?

if it's a warning we never see a notice from CI. That's why we explicitly added a dev_err to force such reports to show with a FAIL. I think we should probably change our filtering strategy at some point, only using the error level makes us miss WARN() statements in the code that signal abnormal execution.

marc-hb commented 2 years ago

I'd love CI to report warnings and not just errors at some point in the future and it does not seem impossible but it unfortunately seems to require a lot of non standard customizations

@greg-intel and others please correct me but Jenkins has a narrow, hardcoded and binary vision of the world: PASS or FAIL and that's it. Same for Github Actions, Quickbuild etc. The only successful workaround that I found (and that I've implemented in a couple places like https://github.com/thesofproject/sof/commit/05dded7b7b0f#diff-b71166ed0f585913318ed46933ff9b12901e211de3ac88c40de03f0a944c0ae0R42) is to build or test twice: once with -Werror (warning as errors) and once without it.

Anyway we don't even fail tests on firmware ERRORs yet, so we have a some way to go https://github.com/thesofproject/sof-test/pull/799

greg-intel commented 2 years ago

Although github has a handful of status', they don't seem to be very customizable. I'm sure there's a plugin, but I'm anti-plugin.

I am curious, how many people check Jenkins multiple times every day?

Marc is basically right. With one infrequently used option to add a state:

Exit code to set build unstable:
If set, the shell exit code that will be interpreted as an unstable build result. If the exit code matches the value, 
the build results will be set to 'unstable' and next steps will be continued. On Unix-like it is a value between 0-255. 
The value 0 is ignored and does not make the build unstable to keep the default behaviour consistent.
marc-hb commented 2 years ago

Thanks @greg-intel

I am curious, how many people check Jenkins multiple times every day?

Extremely few I bet but that wasn't my point. My point was: to transmit a new, 3rd, immediate WARN status along our very long error reporting chain then all the links in the chain probably need to be aware of that new state and Jenkins is one pretty big link in the chain.

Same for the SKIP state.

Off topic sorry.

keqiaozhang commented 2 years ago

Observed this issue on ADLP_SKU0B00_SDCA. Test Report ID:13588

[ 7016.985835] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 0
[ 7016.987345] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 1
[ 7016.988855] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 2
[ 7016.990369] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 3
[ 7016.991875] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 4
[ 7016.993381] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 5
[ 7016.994887] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 6
[ 7016.996396] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 7
[ 7016.997909] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 8
[ 7016.999418] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 9
[ 7017.000927] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET failed: MCP_CONTROL_HW_RST is not cleared
marc-hb commented 2 years ago

Still happening in today's daily 13945?model=CML_SKU0983_SDW&testcase=check-suspend-resume-50 (jf-cml-sku0983-sdw-1)

fredoh9 commented 1 year ago

Found new one in CML_RVP_SDW internal link: 17606?model=CML_RVP_SDW&testcase=check-suspend-resume-100

fredoh9 commented 1 year ago

Found ADLP_BRYA_SDW with stable-v2.2. internal link: 20887?model=ADLP_BRYA_SDW&testcase=check-suspend-resume-50

plbossart commented 1 year ago

PR #4225 change the way the reset is done, so it's super interesting if we see this error after April 17, 2023 in our test results.

plbossart commented 7 months ago

Seen again in Intel daily tests

planresultdetail/39159?model=TGLU_SKU0A32_SDCA-ipc3&testcase=check-runtime-pm-status-15

[ 3757.004272] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 0
[ 3757.005828] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 1
[ 3757.007261] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 2
[ 3757.008806] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 3
[ 3757.010309] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 4
[ 3757.011854] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 5
[ 3757.013310] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 6
[ 3757.014856] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 7
[ 3757.016357] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 8
[ 3757.017927] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 9
[ 3757.019355] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset failed: MCP_CONTROL_HW_RST is not cleared
[ 3762.647781] kernel: soundwire_intel soundwire_intel.link.0: sdw_cdns_clock_stop failed: MCP_CONTROL_HW_RST is not cleared
plbossart commented 7 months ago

the report from last week is due to other problems and fixed by https://github.com/thesofproject/linux/pull/4896

marc-hb commented 3 months ago

Just seen again, re-opening to be on the safe side.

https://sof-ci.01.org/softestpr/PR1220/build635/devicetest/index.html?model=CML_RVP_SDW-ipc3&testcase=check-runtime-pm-status

``` [ 589.632188] kernel: snd_sof:ipc3_log_header: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE [ 589.632316] kernel: snd_sof:ipc3_log_header: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x40020000: GLB_PM_MSG: CTX_RESTORE [ 589.632398] kernel: soundwire_intel:intel_link_power_up: soundwire_intel soundwire_intel.link.1: powering up all links [ 589.632400] kernel: soundwire_intel:intel_link_power_up: soundwire_intel soundwire_intel.link.1: first link up, programming SYNCPRD [ 589.632584] kernel: soundwire_bus:sdw_modify_slave_status: rt700 sdw:0:1:025d:0700:00: initializing enumeration and init completion for Slave 6 [ 589.632591] kernel: soundwire_cadence:cdns_init_clock_ctrl: soundwire_intel soundwire_intel.link.1: mclk 12000000 max 6000000 row 125 col 2 [ 589.633471] kernel: soundwire_cadence:cdns_update_slave_status_work: soundwire_intel soundwire_intel.link.1: Slave status change: 0x1000000 [ 589.633895] kernel: soundwire_cadence:cdns_update_slave_status_work: soundwire_intel soundwire_intel.link.1: Slave status change: 0x2 [ 589.633911] kernel: soundwire_bus:sdw_handle_slave_status: soundwire sdw-master-0-1: Slave attached, programming device number [ 589.634078] kernel: soundwire_bus:sdw_extract_slave_id: soundwire sdw-master-0-1: SDW Slave Addr: 10025d070000 [ 589.634080] kernel: soundwire_bus:sdw_extract_slave_id: soundwire sdw-master-0-1: SDW Slave class_id 0x00, mfg_id 0x025d, part_id 0x0700, unique_id 0x0, version 0x1 [ 589.634083] kernel: soundwire_bus:sdw_assign_device_num: soundwire sdw-master-0-1: Slave already registered, reusing dev_num:6 [ 589.634284] kernel: soundwire_cadence:cdns_fill_msg_resp: soundwire_intel soundwire_intel.link.1: Msg ignored for Slave 0 [ 589.634287] kernel: soundwire_bus:sdw_program_device_num: soundwire sdw-master-0-1: No more devices to enumerate [ 589.634309] kernel: soundwire_cadence:cdns_update_slave_status_work: soundwire_intel soundwire_intel.link.1: Slave status change: 0x2000001 [ 589.634322] kernel: soundwire_bus:sdw_modify_slave_status: rt700 sdw:0:1:025d:0700:00: signaling enumeration completion for Slave 6 [ 589.634453] kernel: soundwire_intel soundwire_intel.link.1: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 0 [ 589.634537] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] ff01 <= 0000 [ 589.634676] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 <= 001a [ 589.634781] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7420 84a0 <= c003 [ 589.634843] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3501 <= 0000 [ 589.634906] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3721 <= 0040 [ 589.634967] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3714 <= 0040 [ 589.635031] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3c14 <= 0002 [ 589.635092] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3712 <= 0020 [ 589.635155] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3713 <= 0020 [ 589.635218] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3719 <= 0020 [ 589.635280] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4f12 <= 0091 [ 589.635343] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4e12 <= 00d6 [ 589.635405] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4d12 <= 0011 [ 589.635468] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4c12 <= 0020 [ 589.635535] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4f13 <= 0091 [ 589.635593] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4e13 <= 00d6 [ 589.635655] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4d13 <= 0011 [ 589.635719] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4c13 <= 0021 [ 589.635780] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4f19 <= 0002 [ 589.635842] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4e19 <= 00a1 [ 589.635905] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4d19 <= 0090 [ 589.635967] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4c19 <= 0080 [ 589.636030] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 371b <= 0040 [ 589.636064] kernel: soundwire_intel soundwire_intel.link.1: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 1 [ 589.636155] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 731b 839b <= 00b0 [ 589.636259] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 839b 731b <= 0000 [ 589.636468] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 7420 84a0 <= 004a 201b [ 589.636697] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 7420 84a0 <= 0045 5089 [ 589.636884] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 7420 84a0 <= 006b 5064 [ 589.637113] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 7420 84a0 <= 0048 d249 [ 589.637176] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3501 <= 0003 [ 589.637238] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3501 <= 0000 [ 589.637301] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3819 <= 0082 [ 589.637363] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3821 <= 0081 [ 589.637426] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3855 <= 0083 [ 589.637453] kernel: soundwire_intel soundwire_intel.link.1: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 2 [ 589.637676] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 7420 84a0 <= 0010 2420 [ 589.637886] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 7420 84a0 <= 0019 2e11 [ 589.637888] kernel: snd_soc_rt700:rt700_jack_init: rt700 sdw:0:1:025d:0700:00: in rt700_jack_init enable [ 589.637947] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3501 <= 0003 [ 589.637950] kernel: snd_soc_rt700:rt700_io_init: rt700 sdw:0:1:025d:0700:00: rt700_io_init hw_init complete [ 589.637952] kernel: soundwire_bus:sdw_handle_slave_status: rt700 sdw:0:1:025d:0700:00: signaling initialization completion for Slave 6 [ 589.638963] kernel: soundwire_intel soundwire_intel.link.1: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 3 [ 589.640453] kernel: soundwire_intel soundwire_intel.link.1: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 4 [ 589.641962] kernel: soundwire_intel soundwire_intel.link.1: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 5 [ 589.643453] kernel: soundwire_intel soundwire_intel.link.1: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 6 [ 589.644962] kernel: soundwire_intel soundwire_intel.link.1: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 7 [ 589.646449] kernel: soundwire_intel soundwire_intel.link.1: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 8 [ 589.647958] kernel: soundwire_intel soundwire_intel.link.1: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 9 [ 589.649452] kernel: soundwire_intel soundwire_intel.link.1: intel_start_bus_after_reset failed: MCP_CONTROL_HW_RST is not cleared [ 589.649511] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3121 <= 0000 [ 589.649570] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3501 <= 0003 [ 589.649633] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3602 <= 0000 [ 589.649695] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3609 <= 0000 [ 589.649757] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3712 <= 0020 [ 589.649799] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3713 <= 0020 [ 589.649840] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3714 <= 0040 [ 589.649882] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3719 <= 0020 [ 589.649924] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 371b <= 0040 [ 589.649965] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3721 <= 0040 [ 589.650007] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3819 <= 0082 [ 589.650049] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3821 <= 0081 [ 589.650090] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3855 <= 0083 [ 589.650132] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 3c14 <= 0002 [ 589.650174] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4c12 <= 0020 [ 589.650215] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4c13 <= 0021 [ 589.650257] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4c19 <= 0080 [ 589.650299] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4d12 <= 0011 [ 589.650340] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4d13 <= 0011 [ 589.650382] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4d19 <= 0090 [ 589.650424] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4e12 <= 00d6 [ 589.650465] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4e13 <= 00d6 [ 589.650507] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4e19 <= 00a1 [ 589.650549] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4f12 <= 0091 [ 589.650590] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4f13 <= 0091 [ 589.650632] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 4f19 <= 0002 [ 589.650715] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7203 8283 <= 0031 [ 589.650799] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7209 8289 <= 0031 [ 589.650882] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7302 8382 <= b050 [ 589.650965] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7308 8388 <= 7080 [ 589.651049] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7309 8389 <= 7080 [ 589.651132] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7312 8392 <= 0000 [ 589.651215] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7313 8393 <= 0000 [ 589.651299] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7314 8394 <= b080 [ 589.651382] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7319 8399 <= 7002 [ 589.651465] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 731b 839b <= 00b0 [ 589.651549] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7321 83a1 <= b080 [ 589.651632] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7420 84a0 <= c003 [ 589.651715] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 <= 001a [ 589.651799] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 8382 7302 <= b050 [ 589.651882] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 8383 7303 <= 0057 [ 589.651965] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 8388 7308 <= 7080 [ 589.652049] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 8389 7309 <= 7080 [ 589.652132] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 8393 7313 <= 0000 [ 589.652216] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 8399 7319 <= 7002 [ 589.652319] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 839b 731b <= 0000 [ 589.652507] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 7420 84a0 <= 0010 2420 [ 589.652694] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 7420 84a0 <= 0019 2e11 [ 589.652861] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 7420 84a0 <= 001a 8003 [ 589.653028] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 7420 84a0 <= 0045 5089 [ 589.653194] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 7420 84a0 <= 0048 d249 [ 589.653361] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 7420 84a0 <= 004a 201b [ 589.653550] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 7420 84a0 <= 006b 5064 [ 589.653560] kernel: snd_sof:sof_pcm_open: sof-audio-pci-intel-cnl 0000:00:1f.3: pcm: open stream 0 dir 0 [ 589.653562] kernel: snd_sof:sof_pcm_open: sof-audio-pci-intel-cnl 0000:00:1f.3: period min 192 max 16384 bytes [ 589.653564] kernel: snd_sof:sof_pcm_open: sof-audio-pci-intel-cnl 0000:00:1f.3: period count 2 max 16 [ 589.653566] kernel: snd_sof:sof_pcm_open: sof-audio-pci-intel-cnl 0000:00:1f.3: buffer max 65536 bytes [ 589.653686] kernel: snd_soc_rt700:rt700_pcm_hw_params: rt700 sdw:0:1:025d:0700:00: rt700_pcm_hw_params rt700-aif1 [ 589.653778] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7203 8283 <= 0031 [ 589.653882] kernel: snd_soc_rt700:rt700_sdw_write: rt700 sdw:0:1:025d:0700:00: [rt700_sdw_write] 7209 8289 <= 0031 [ 589.653890] kernel: snd_sof:ipc3_log_header: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x80010000: GLB_DAI_MSG: CONFIG [ 589.654089] kernel: snd_sof:sof_pcm_hw_params: sof-audio-pci-intel-cnl 0000:00:1f.3: pcm: hw params stream 0 dir 0 [ 589.654094] kernel: snd_sof_intel_hda_common:hda_dsp_stream_hw_params: sof-audio-pci-intel-cnl 0000:00:1f.3: FW Poll Status: reg[0x160]=0x40000 successful [ 589.654110] kernel: snd_sof_intel_hda_common:hda_dsp_stream_hw_params: sof-audio-pci-intel-cnl 0000:00:1f.3: FW Poll Status: reg[0x160]=0x40000 successful [ 589.654113] kernel: snd_sof_intel_hda_common:hda_dsp_stream_setup_bdl: sof-audio-pci-intel-cnl 0000:00:1f.3: period_bytes:0x4000 [ 589.654115] kernel: snd_sof_intel_hda_common:hda_dsp_stream_setup_bdl: sof-audio-pci-intel-cnl 0000:00:1f.3: periods:4 [ 589.654135] kernel: snd_sof:sof_ipc3_pcm_hw_params: sof-audio-pci-intel-cnl 0000:00:1f.3: stream_tag 1 [ 589.654136] kernel: snd_sof:ipc3_log_header: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x60010000: GLB_STREAM_MSG: PCM_PARAMS ```
marc-hb commented 3 months ago

Good example of MAX saturation timeout that would be addressed by this:

https://sof-ci.01.org/softestpr/PR1220/build633/devicetest/index.html?model=LNLM_RVP_HDA&testcase=check-pause-resume-capture-100

EDIT: wrong issue, this was meant for sof-test # 1218

plbossart commented 3 months ago

Humm, just one report on CML_RVP_SDW in over 14 months, that is going to be tricky to solve....

marc-hb commented 3 months ago

Another one:

https://sof-ci.01.org/linuxpr/PR5106/build4064/devicetest/index.html?model=CML_RVP_SDW-ipc3&testcase=multiple-pipeline-all

It's coming back!

plbossart commented 3 months ago

from https://sof-ci.01.org/softestpr/PR1220/build635/devicetest/index.html?model=CML_RVP_SDW-ipc3&testcase=check-runtime-pm-status, it looks like the first error started with the AMD machine driver rework

https://github.com/thesofproject/linux/commit/8245a1411fef45a208f182c176660107b7f582aa

I am not sure how this commit might have interfered with the low-level reset sequence on CML.

@bardliao @marc-hb @ssavati can you double-check if this bisect information seems correct?

plbossart commented 3 months ago

I tried to reproduce this manually on one of the test devices jf-cml-rvp-sdw-3. So far no luck. I am already at 150 cycles of multiple-pipelines + pm_runtime suspend/resume and nothing logged.

plbossart commented 3 months ago
2024-07-19 16:31:07 UTC [INFO] Test Result: PASS!
iteration 1568 done

I will stop the tests for now, unless we can reproduce it with a simple sequence we are not going to make progress or be able to bisect. Gah.

marc-hb commented 3 months ago

I am not sure how this commit might have interfered with the low-level reset sequence on CML.

This always had a low reproduction rate, didn't it? So I doubt this can be bisected.