Open XiaoyunWu6666 opened 3 years ago
Weekly stress test(test result 4894) found same problem with CML_RVP_SDW.
===========================>>
[ 9449.385740] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 0
[ 9449.387130] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 1
[ 9449.388635] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 2
[ 9449.390185] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 3
[ 9449.391714] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 4
[ 9449.393157] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 5
[ 9449.394707] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 6
[ 9449.396156] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 7
[ 9449.397253] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 8
[ 9449.398875] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 9
[ 9449.400241] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET failed: MCP_CONTROL_HW_RST is not cleared
<<===========================
same warnings seen on Intel daily test 5356 (started on July 19).
The next daily tests after that will rely on https://github.com/thesofproject/linux/pull/3006 which was merged on July 20.
If we see this error again then my theory of a clock stop issue was wrong... Let's see!
Seems my theory was wrong, this error still happens with the latest kernel, however this is the first time I managed to see it on Dell SKU 09c6
Seen by @bardliao (c.f. https://github.com/thesofproject/linux/pull/3098#issuecomment-902559558)
Observed this issue again when running suspend/resume stress testing on CML_SKU0983_SDW. Test ID:8054.
[ 1592.161340] kernel: rt1308 sdw:1:025d:1308:00: sdw_modify_slave_status: signaling enumeration completion for Slave 1
[ 1592.292468] kernel: rt1308 sdw:1:025d:1308:00: rt1308_io_init m_btl_l=0xffe2, m_btl_r=0x8
[ 1592.292481] kernel: rt1308 sdw:1:025d:1308:00: rt1308_io_init c_btl_l=0x2ef, c_btl_r=0x10
[ 1592.301784] kernel: rt1308 sdw:1:025d:1308:00: rt1308_io_init hw_init complete
[ 1592.301790] kernel: rt1308 sdw:1:025d:1308:00: sdw_handle_slave_status: signaling initialization completion for Slave 1
[ 1592.314783] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 0
[ 1592.316316] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 1
[ 1592.317774] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 2
[ 1592.319304] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 3
[ 1592.320779] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 4
[ 1592.322308] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 5
[ 1592.323777] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 6
[ 1592.325361] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 7
[ 1592.326844] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 8
[ 1592.328468] kernel: soundwire_intel soundwire_intel.link.2: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 9
[ 1592.329841] kernel: soundwire_intel soundwire_intel.link.2: intel_resume failed: MCP_CONTROL_HW_RST is not cleared
Linux: 8c3ae47e ASoC: Intel: TGL: add RT1308 I2S machine driver and HDMI-in
SOF: 40d2f6688367 codec_adapter: cadence: demote error message to warning when return code is not fatal
[ 909.886263] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: tplg: complete pipeline PIPELINE.2.ALH0x103.IN id 11
[ 909.886266] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
[ 909.886376] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
[ 909.886377] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: tplg: complete pipeline PIPELINE.1.ALH0x102.OUT id 3
[ 909.886381] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
[ 909.886508] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
[ 909.886514] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x40020000: GLB_PM_MSG: CTX_RESTORE
[ 909.886599] kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x40020000: GLB_PM_MSG: CTX_RESTORE
[ 909.886605] kernel: soundwire_intel soundwire_intel.link.1: intel_link_power_up: powering up all links
[ 909.886606] kernel: soundwire_intel soundwire_intel.link.1: intel_link_power_up: first link up, programming SYNCPRD
[ 909.886902] kernel: rt700 sdw:1:025d:0700:00: sdw_modify_slave_status: initializing enumeration and init completion for Slave 1
[ 909.888903] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 0
[ 909.890416] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 1
[ 909.891905] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 2
[ 909.893417] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 3
[ 909.894905] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 4
[ 909.896423] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 5
[ 909.897955] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 6
[ 909.899512] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 7
[ 909.900954] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 8
[ 909.902514] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 9
[ 909.903951] kernel: soundwire_intel soundwire_intel.link.1: intel_resume_runtime BUS_RESET failed: MCP_CONTROL_HW_RST is not cleared
[ 909.906197] kernel: soundwire_intel soundwire_intel.link.1: Slave status change: 0x2
[ 909.906202] kernel: soundwire sdw-master-0: Slave attached, programming device number
[ 909.906442] kernel: soundwire sdw-master-0: SDW Slave Addr: 10025d070000
[ 909.906445] kernel: soundwire sdw-master-0: SDW Slave class_id 0x00, mfg_id 0x025d, part_id 0x0700, unique_id 0x0, version 0x1
[ 909.906447] kernel: soundwire sdw-master-0: Slave already registered, reusing dev_num:1
[ 909.906707] kernel: soundwire_intel soundwire_intel.link.1: Msg ignored for Slave 0
[ 909.906708] kernel: soundwire sdw-master-0: No more devices to enumerate
[ 909.906745] kernel: soundwire_intel soundwire_intel.link.1: Slave status change: 0x21
[ 909.906748] kernel: rt700 sdw:1:025d:0700:00: sdw_modify_slave_status: signaling enumeration completion for Slave 1
[ 909.907092] kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] ff01 <= 0000
[ 909.907237] kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 <= 001a
[ 909.907383] kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 7420 84a0 <= c003
[ 909.907466] kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3501 <= 0000
[ 909.907550] kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3721 <= 0040
Not sure how we are going to diagnose this further, this problem seems to happen only on CML and only on a monthly/bimonthly basis - and it's not critical anyways.
Reproduction of the month has arrived https://sof-ci.01.org/sofpr/PR4334/build12063/devicetest/?model=CML_RVP_SDW&testcase=check-suspend-resume-with-playback-5
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: tplg: complete pipeline PIPELINE.2.ALH0x103.IN id 11
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: tplg: complete pipeline PIPELINE.1.ALH0x102.OUT id 3
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx: 0x40020000: GLB_PM_MSG: CTX_RESTORE
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: sof-audio-pci-intel-cnl 0000:00:1f.3: ipc tx succeeded: 0x40020000: GLB_PM_MSG: CTX_RESTORE
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.0: intel_resume: pm_runtime status was suspended, forcing active
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.0: intel_link_power_up: powering up all links
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.0: intel_link_power_up: first link up, programming SYNCPRD
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: sdw_modify_slave_status: initializing enumeration and init completion for Slave 1
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 0
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 1
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 2
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 3
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 4
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 5
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 6
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 7
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 8
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 9
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: intel_resume failed: MCP_CONTROL_HW_RST is not cleared
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: Slave status change: 0x2
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire sdw-master-1: Slave attached, programming device number
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire sdw-master-1: SDW Slave Addr: 10025d070000
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire sdw-master-1: SDW Slave class_id 0x00, mfg_id 0x025d, part_id 0x0700, unique_id 0x0, version 0x1
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire sdw-master-1: Slave already registered, reusing dev_num:1
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: Msg ignored for Slave 0
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire sdw-master-1: No more devices to enumerate
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: soundwire_intel soundwire_intel.link.1: Slave status change: 0x21
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: sdw_modify_slave_status: signaling enumeration completion for Slave 1
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] ff01 <= 0000
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 7520 85a0 <= 001a
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 7420 84a0 <= c003
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3501 <= 0000
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3721 <= 0040
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3714 <= 0040
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3c14 <= 0002
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3712 <= 0020
Feb 17 01:42:04 jf-cml-rvp-sdw-2 kernel: rt700 sdw:1:025d:0700:00: [rt700_sdw_write] 3713 <= 0020
Again in https://sof-ci.01.org/sofpr/PR5379/build12076/devicetest/?model=TGLU_RVP_SDW&testcase=check-suspend-resume-with-playback-5 , this time on TGL. Is this becoming more frequent?
EDIT again https://sof-ci.01.org/sofpr/PR5485/build12238/devicetest
[ 1665.018445] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: tplg: complete pipeline PIPELINE.1.ALH2.OUT id 3
[ 1665.018454] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
[ 1665.018648] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx succeeded: 0x30130000: GLB_TPLG_MSG: PIPE_COMPLETE
[ 1665.018662] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x40020000: GLB_PM_MSG: CTX_RESTORE
[ 1665.018800] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx succeeded: 0x40020000: GLB_PM_MSG: CTX_RESTORE
[ 1665.018824] kernel: soundwire_intel soundwire_intel.link.0: intel_resume: pm_runtime status was suspended, forcing active
[ 1665.018830] kernel: soundwire_intel soundwire_intel.link.0: intel_link_power_up: powering up all links
[ 1665.018831] kernel: soundwire_intel soundwire_intel.link.0: intel_link_power_up: first link up, programming SYNCPRD
[ 1665.019227] kernel: rt711 sdw:0:025d:0711:00: sdw_modify_slave_status: initializing enumeration and init completion for Slave 1
[ 1665.020112] kernel: soundwire_intel soundwire_intel.link.0: Slave status change: 0x2
[ 1665.020115] kernel: soundwire sdw-master-0: Slave attached, programming device number
[ 1665.020336] kernel: soundwire sdw-master-0: SDW Slave Addr: 20025d071100
[ 1665.020337] kernel: soundwire sdw-master-0: SDW Slave class_id 0x00, mfg_id 0x025d, part_id 0x0711, unique_id 0x0, version 0x2
[ 1665.020339] kernel: soundwire sdw-master-0: Slave already registered, reusing dev_num:1
[ 1665.020648] kernel: soundwire_intel soundwire_intel.link.0: Msg ignored for Slave 0
[ 1665.020649] kernel: soundwire sdw-master-0: No more devices to enumerate
[ 1665.020696] kernel: soundwire_intel soundwire_intel.link.0: Slave status change: 0x21
[ 1665.020715] kernel: rt711 sdw:0:025d:0711:00: sdw_modify_slave_status: signaling enumeration completion for Slave 1
[ 1665.020723] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: pm_runtime status was suspended, forcing active
[ 1665.020986] kernel: rt1308 sdw:1:025d:1308:00:2: sdw_modify_slave_status: initializing enumeration and init completion for Slave 1
[ 1665.020989] kernel: rt1308 sdw:1:025d:1308:00:0: sdw_modify_slave_status: initializing enumeration and init completion for Slave 2
[ 1665.021043] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] ff01 <= 0000
[ 1665.021745] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7520 85a0 9c20 aca0 => 00008003
[ 1665.021835] kernel: soundwire_intel soundwire_intel.link.1: Slave status change: 0x2
[ 1665.021837] kernel: soundwire sdw-master-1: Slave attached, programming device number
[ 1665.022046] kernel: soundwire sdw-master-1: SDW Slave Addr: 22025d130800
[ 1665.022047] kernel: soundwire sdw-master-1: SDW Slave class_id 0x00, mfg_id 0x025d, part_id 0x1308, unique_id 0x2, version 0x2
[ 1665.022048] kernel: soundwire sdw-master-1: Slave already registered, reusing dev_num:1
[ 1665.022168] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7520 85a0 7420 84a0 <= 001a c003
[ 1665.022268] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 3501 <= 0000
[ 1665.022359] kernel: soundwire sdw-master-1: SDW Slave Addr: 20025d130800
[ 1665.022359] kernel: soundwire sdw-master-1: SDW Slave class_id 0x00, mfg_id 0x025d, part_id 0x1308, unique_id 0x0, version 0x2
[ 1665.022361] kernel: soundwire sdw-master-1: Slave already registered, reusing dev_num:2
[ 1665.022363] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 3719 <= 0025
[ 1665.022480] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 3721 <= 00c0
[ 1665.022580] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 3712 <= 0020
[ 1665.022651] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 0
[ 1665.022672] kernel: soundwire_intel soundwire_intel.link.1: Msg ignored for Slave 0
[ 1665.022673] kernel: soundwire sdw-master-1: No more devices to enumerate
[ 1665.022697] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 3713 <= 0020
[ 1665.022760] kernel: soundwire_intel soundwire_intel.link.1: Slave status change: 0x221
[ 1665.022763] kernel: rt1308 sdw:1:025d:1308:00:2: sdw_modify_slave_status: signaling enumeration completion for Slave 1
[ 1665.022766] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 371a <= 0020
[ 1665.022876] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 371b <= 0020
[ 1665.023087] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7321 83a1 <= a080
[ 1665.023294] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7321 83a1 <= 9080
[ 1665.023466] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7308 8388 <= 6080
[ 1665.023669] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7308 8388 <= 5080
[ 1665.023716] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 1
[ 1665.023835] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7309 8389 <= 6080
[ 1665.024045] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7309 8389 <= 5080
[ 1665.024136] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4f12 <= 0091
[ 1665.024259] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4e12 <= 00d6
[ 1665.024389] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4d12 <= 0011
[ 1665.024508] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4c12 <= 0020
[ 1665.024592] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4f13 <= 0091
[ 1665.024687] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4e13 <= 00d6
[ 1665.024796] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4d13 <= 0011
[ 1665.024895] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4c13 <= 0021
[ 1665.025002] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4c21 <= 00f0
[ 1665.025107] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4d21 <= 0011
[ 1665.025206] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4e21 <= 0011
[ 1665.025241] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 2
[ 1665.025273] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 4f21 <= 0001
[ 1665.025686] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7520 85a0 7420 84a0 <= 0091 0154
[ 1665.026106] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7520 85a0 7420 84a0 <= 004a 201b
[ 1665.026519] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7520 85a0 7420 84a0 <= 0045 5089
[ 1665.026704] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 3
[ 1665.026908] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7520 85a0 7420 84a0 <= 006b 5064
[ 1665.027311] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7520 85a0 7420 84a0 <= 0048 d249
[ 1665.027421] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 3501 <= 0003
[ 1665.027520] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 3501 <= 0000
[ 1665.028213] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 4
[ 1665.028353] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7520 85a0 9c20 aca0 => 0000058b
[ 1665.028769] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7520 85a0 7420 84a0 <= 006f 0580
[ 1665.029567] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 00007888
[ 1665.029706] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 5
[ 1665.029956] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_write] 7558 85d8 7458 84d8 <= 0000 f888
[ 1665.030769] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.031227] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 6
[ 1665.032717] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 7
[ 1665.034230] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 8
[ 1665.035713] kernel: soundwire_intel soundwire_intel.link.1: intel_resume: MCP_CONTROL_HW_RST is not cleared at iteration 9
[ 1665.037238] kernel: soundwire_intel soundwire_intel.link.1: intel_resume failed: MCP_CONTROL_HW_RST is not cleared
[ 1665.042413] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.054811] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.067418] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.080055] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.092605] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.105189] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.117837] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.130294] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.142719] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.151736] kernel: rt1308 sdw:1:025d:1308:00:2: rt1308_io_init m_btl_l=0x4, m_btl_r=0x1b
[ 1665.151743] kernel: rt1308 sdw:1:025d:1308:00:2: rt1308_io_init c_btl_l=0x17a, c_btl_r=0xd7
[ 1665.154757] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.162056] kernel: rt1308 sdw:1:025d:1308:00:2: rt1308_io_init hw_init complete
[ 1665.162061] kernel: rt1308 sdw:1:025d:1308:00:2: sdw_handle_slave_status: signaling initialization completion for Slave 1
[ 1665.162068] kernel: rt1308 sdw:1:025d:1308:00:0: sdw_modify_slave_status: signaling enumeration completion for Slave 2
[ 1665.165720] kernel: atkbd serio0: Failed to enable keyboard on isa0060/serio0
[ 1665.166739] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.179225] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.185363] kernel: OOM killer enabled.
[ 1665.185364] kernel: Restarting tasks ... done.
[ 1665.190172] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: hda: prepare stream dir 0
[ 1665.190195] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x80010000: GLB_DAI_MSG: CONFIG
[ 1665.190463] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx succeeded: 0x80010000: GLB_DAI_MSG: CONFIG
[ 1665.190534] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: format_val=49, rate=48000, ch=2, format=10
[ 1665.190897] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm: prepare stream 8 dir 0
[ 1665.190940] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm: hw params stream 8 dir 0
[ 1665.191004] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: FW Poll Status: reg[0x160]=0x40000 successful
[ 1665.191055] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.191070] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: FW Poll Status: reg[0x160]=0x40000 successful
[ 1665.191076] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: hda_dsp_stream_setup_bdl: period_bytes:0x4000
[ 1665.191080] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: hda_dsp_stream_setup_bdl: periods:4
[ 1665.191103] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: stream_tag 1
[ 1665.191115] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x60010000: GLB_STREAM_MSG: PCM_PARAMS
[ 1665.191488] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx succeeded: 0x60010000: GLB_STREAM_MSG: PCM_PARAMS
[ 1665.191493] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm: stream dir 0, posn mailbox offset is 790604
[ 1665.191550] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: In hda_link_pcm_trigger cmd=1
[ 1665.191556] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: pcm: trigger stream 8 dir 0 cmd 1
[ 1665.191565] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: FW Poll Status: reg[0x160]=0x2014001e successful
[ 1665.191572] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx: 0x60040000: GLB_STREAM_MSG: TRIG_START
[ 1665.192463] kernel: sof-audio-pci-intel-tgl 0000:00:1f.3: ipc tx succeeded: 0x60040000: GLB_STREAM_MSG: TRIG_START
[ 1665.197312] kernel: rfkill: input handler enabled
[ 1665.198655] kernel: asix 3-3.1:1.0 enx000ec6b72c3c: Link is Down
[ 1665.199457] kernel: PM: suspend exit
[ 1665.202360] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.213196] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.224173] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
[ 1665.235150] kernel: rt711 sdw:0:025d:0711:00: [rt711_sdw_read] 7558 85d8 9c58 acd8 => 0000f888
Found same issue in internal daily test in 2022/2/18. It was reproduced in suspend/resume test with CML_SKU0983_SDW
.
10359?model=CML_SKU0983_SDW&testcase=check-suspend-resume-50
Sorry for late report, I was looking at one of the devices for errors then found this errors too.
@keqiaozhang @XiaoyunWu6666 @marc-hb @greg-intel I haven't seen this report in a long time, is there a way to scan CI/test reports to see when we last say this string "MCP_CONTROL_HW_RST is not cleared"
The most recent was on March 28
daily 11433?model=CML_SKU0983_SDW&testcase=check-suspend-resume-50
[ 6134.053681] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 8
[ 6134.055091] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 9
[ 6134.056714] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET failed: MCP_CONTROL_HW_RST is not cleared
Start Time: 2022-03-28 21:29:00 UTC Kernel Branch: topic/sof-dev Kernel Commit: b50c58edbe75 SOF Branch: main SOF Commit: 992b370c8cca Zephyr Commit: 199048b2ed45
Complete list (I searched only *_SDW
hardware and *suspend-resume*
tests)
7468/TGLU_RVP_SDW/check-suspend-resume-with-capture-5/dmesg.html
7498/TGLU_RVP_SDW/check-suspend-resume-with-capture-5/dmesg.html
7652/CML_SKU0983_SDW/check-suspend-resume-50/dmesg.html
7853/CML_RVP_SDW/check-suspend-resume-5/dmesg.html
7853/CML_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
7979/CML_RVP_SDW/check-suspend-resume-5/dmesg.html
8073/CML_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
8087/CML_SKU0983_SDW/check-suspend-resume-50/dmesg.html
8270/CML_RVP_SDW/check-suspend-resume-5/dmesg.html
8428/CML_SKU0983_SDW/check-suspend-resume-50/dmesg.html
8540/CML_SKU0983_SDW/check-suspend-resume-100/dmesg.html
8553/CML_SKU0983_SDW/check-suspend-resume-with-playback-5/dmesg.html
8622/CML_RVP_SDW/check-suspend-resume-5/dmesg.html
8899/CML_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
8977/CML_SKU0983_SDW/check-suspend-resume-with-playback-25/dmesg.html
9076/CML_RVP_SDW/check-suspend-resume-with-capture-5/dmesg.html
9087/CML_SKU0983_SDW/check-suspend-resume-50/dmesg.html
9093/TGLH_SKU0A5E_SDW/check-suspend-resume-with-playback-5/dmesg.html
9129/CML_RVP_SDW/check-suspend-resume-with-playback-25/dmesg.html
9153/CML_RVP_SDW/check-suspend-resume-with-capture-5/dmesg.html
9154/ADLP_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
9538/ADLP_RVP_SDW/check-suspend-resume-with-capture-5/dmesg.html
9737/CML_RVP_SDW/check-suspend-resume-5/dmesg.html
9791/TGLU_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
9800/CML_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
9858/CML_RVP_SDW/check-suspend-resume-with-capture-5/dmesg.html
10101/CML_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
10276/CML_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
10314/TGLU_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
10359/CML_SKU0983_SDW/check-suspend-resume-50/dmesg.html
10963/CML_RVP_SDW/check-suspend-resume-with-playback-5/dmesg.html
11074/CML_SKU0983_SDW/check-suspend-resume-50/dmesg.html
11433/CML_SKU0983_SDW/check-suspend-resume-50/dmesg.html
and it's not critical anyways.
Can this be downgraded to a warning?
and it's not critical anyways.
Can this be downgraded to a warning?
if it's a warning we never see a notice from CI. That's why we explicitly added a dev_err to force such reports to show with a FAIL. I think we should probably change our filtering strategy at some point, only using the error level makes us miss WARN() statements in the code that signal abnormal execution.
I'd love CI to report warnings and not just errors at some point in the future and it does not seem impossible but it unfortunately seems to require a lot of non standard customizations
@greg-intel and others please correct me but Jenkins has a narrow, hardcoded and binary vision of the world: PASS or FAIL and that's it. Same for Github Actions, Quickbuild etc. The only successful workaround that I found (and that I've implemented in a couple places like https://github.com/thesofproject/sof/commit/05dded7b7b0f#diff-b71166ed0f585913318ed46933ff9b12901e211de3ac88c40de03f0a944c0ae0R42) is to build or test twice: once with -Werror
(warning as errors) and once without it.
Anyway we don't even fail tests on firmware ERRORs yet, so we have a some way to go https://github.com/thesofproject/sof-test/pull/799
Although github has a handful of status', they don't seem to be very customizable. I'm sure there's a plugin, but I'm anti-plugin.
I am curious, how many people check Jenkins multiple times every day?
Marc is basically right. With one infrequently used option to add a state:
Exit code to set build unstable:
If set, the shell exit code that will be interpreted as an unstable build result. If the exit code matches the value,
the build results will be set to 'unstable' and next steps will be continued. On Unix-like it is a value between 0-255.
The value 0 is ignored and does not make the build unstable to keep the default behaviour consistent.
Thanks @greg-intel
I am curious, how many people check Jenkins multiple times every day?
Extremely few I bet but that wasn't my point. My point was: to transmit a new, 3rd, immediate WARN status along our very long error reporting chain then all the links in the chain probably need to be aware of that new state and Jenkins is one pretty big link in the chain.
Same for the SKIP state.
Off topic sorry.
Observed this issue on ADLP_SKU0B00_SDCA. Test Report ID:13588
[ 7016.985835] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 0
[ 7016.987345] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 1
[ 7016.988855] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 2
[ 7016.990369] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 3
[ 7016.991875] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 4
[ 7016.993381] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 5
[ 7016.994887] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 6
[ 7016.996396] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 7
[ 7016.997909] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 8
[ 7016.999418] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET: MCP_CONTROL_HW_RST is not cleared at iteration 9
[ 7017.000927] kernel: soundwire_intel soundwire_intel.link.0: intel_resume_runtime BUS_RESET failed: MCP_CONTROL_HW_RST is not cleared
Still happening in today's daily 13945?model=CML_SKU0983_SDW&testcase=check-suspend-resume-50 (jf-cml-sku0983-sdw-1
)
Found new one in CML_RVP_SDW internal link: 17606?model=CML_RVP_SDW&testcase=check-suspend-resume-100
Found ADLP_BRYA_SDW with stable-v2.2. internal link: 20887?model=ADLP_BRYA_SDW&testcase=check-suspend-resume-50
PR #4225 change the way the reset is done, so it's super interesting if we see this error after April 17, 2023 in our test results.
Seen again in Intel daily tests
planresultdetail/39159?model=TGLU_SKU0A32_SDCA-ipc3&testcase=check-runtime-pm-status-15
[ 3757.004272] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 0
[ 3757.005828] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 1
[ 3757.007261] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 2
[ 3757.008806] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 3
[ 3757.010309] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 4
[ 3757.011854] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 5
[ 3757.013310] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 6
[ 3757.014856] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 7
[ 3757.016357] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 8
[ 3757.017927] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset: MCP_CONTROL_HW_RST is not cleared at iteration 9
[ 3757.019355] kernel: soundwire_intel soundwire_intel.link.0: intel_start_bus_after_reset failed: MCP_CONTROL_HW_RST is not cleared
[ 3762.647781] kernel: soundwire_intel soundwire_intel.link.0: sdw_cdns_clock_stop failed: MCP_CONTROL_HW_RST is not cleared
the report from last week is due to other problems and fixed by https://github.com/thesofproject/linux/pull/4896
Just seen again, re-opening to be on the safe side.
Good example of MAX saturation timeout that would be addressed by this:
EDIT: wrong issue, this was meant for sof-test # 1218
Humm, just one report on CML_RVP_SDW in over 14 months, that is going to be tricky to solve....
Another one:
It's coming back!
from https://sof-ci.01.org/softestpr/PR1220/build635/devicetest/index.html?model=CML_RVP_SDW-ipc3&testcase=check-runtime-pm-status, it looks like the first error started with the AMD machine driver rework
https://github.com/thesofproject/linux/commit/8245a1411fef45a208f182c176660107b7f582aa
I am not sure how this commit might have interfered with the low-level reset sequence on CML.
@bardliao @marc-hb @ssavati can you double-check if this bisect information seems correct?
I tried to reproduce this manually on one of the test devices jf-cml-rvp-sdw-3. So far no luck. I am already at 150 cycles of multiple-pipelines + pm_runtime suspend/resume and nothing logged.
2024-07-19 16:31:07 UTC [INFO] Test Result: PASS!
iteration 1568 done
I will stop the tests for now, unless we can reproduce it with a simple sequence we are not going to make progress or be able to bisect. Gah.
I am not sure how this commit might have interfered with the low-level reset sequence on CML.
This always had a low reproduction rate, didn't it? So I doubt this can be bisected.
description sdw_cdns_clock_stop failed: MCP_CONTROL_HW_RST is not cleared when check-suspend-resume-50 on CML_SKU0983_SDW happen in inner daily 4778 model=CML_SKU0983_SDW testcase=check-suspend-resume-50 [console log]
This is not a new issue, the dev_err logs above were added on purpose to help root-cause https://github.com/thesofproject/linux/issues/2606 which has occurred randomly for the last 6 months, so far only on CML_RVP.
EDIT: later seen on TGL too.