Closed plbossart closed 3 months ago
Reproduced with kernel topic/sof-dev at d99d9a0ab917 ("ALSA: hda: intel-sdw-acpi: use acpi_get_local_u64_address()") and SOF main at 737d4d41fbaa ("app: add winconsole_overlay.conf").
TPLG=/lib/firmware/intel/sof-ipc4-tplg/sof-mtl-rt713-l0-rt1316-l12.tplg ~/sof-test/test-case/run-all-tests.sh -l1
The minimal sequence that makes the problem appear is this setting in test-case/run-all-tests.sh:
testlist=" multiple-pause-resume kmod-load-unload "
Without multiple-pause-resume, no errors are reported.
This also happens when I remove everything except 'Jack Out'.
starting test_multiple-pause-resume
+ test_multiple-pause-resume
+ /root/sof-test/test-case/multiple-pause-resume.sh -l 1 -r 25
2024-05-24 20:18:56 UTC Sub-Test: [INFO] /root/sof-test/test-case/multiple-pause-resume.sh will use topology /usr/lib/firmware/intel/sof-ipc4-tplg/sof-mtl-rt713-l0-rt1316-l12.tplg to run the test case
2024-05-24 20:18:56 UTC Sub-Test: [INFO] Pipeline list to ignore is specified, will ignore 'pcm=HDA Digital,HDMI1,HDMI2,HDMI3,Deepbuffer Jack Out,Speaker,Amp feedback,Jack In' in test case
2024-05-24 20:18:56 UTC Sub-Test: [INFO] Run command to get pipeline parameters
2024-05-24 20:18:56 UTC Sub-Test: [COMMAND] sof-tplgreader.py /usr/lib/firmware/intel/sof-ipc4-tplg/sof-mtl-rt713-l0-rt1316-l12.tplg -f 'type:any & ~pcm:Amplifier Reference' -b ' pcm:HDA Digital,HDMI1,HDMI2,HDMI3,Deepbuffer Jack Out,Speaker,Amp feedback,Jack In' -s 0 -e
2024-05-24 20:18:56 UTC Sub-Test: [INFO] Starting /usr/local/bin/mtrace-reader.py >& /root/sof-test/logs/multiple-pause-resume/2024-05-24-16:18:56-14946/mtrace.txt &
2024-05-24 20:18:57 UTC Sub-Test: [WARNING] pipeline count is 1, don't need to run this case
----------
----------
starting test_kmod-load-unload
+ test_kmod-load-unload
+ /root/sof-test/test-case/check-kmod-load-unload.sh -l 1
2024-05-24 20:19:00 UTC Sub-Test: [INFO] ===== Starting iteration 1 of 1 =====
2024-05-24 20:19:00 UTC Sub-Test: [INFO] wait dsp power status to become suspended
2024-05-24 20:19:00 UTC Sub-Test: [INFO] run kmod/sof-kmod-remove.sh
WARNING: running as root is not supported
Specified filename /sys/kernel/debug/sof/trace does not exist.
SKIP snd_usb_audio not loaded
SKIP snd_hda_intel not loaded
SKIP snd_sof_pci_intel_tng not loaded
SKIP snd_sof_pci_intel_skl not loaded
SKIP snd_sof_pci_intel_apl not loaded
SKIP snd_sof_pci_intel_tgl not loaded
SKIP snd_sof_pci_intel_icl not loaded
SKIP snd_sof_pci_intel_cnl not loaded
SKIP snd_sof_pci_intel_lnl not loaded
RMMOD snd_sof_pci_intel_mtl
Looks like a test script problem, maybe. I can't reproduce this with the simpler sequence:
sudo TPLG=/lib/firmware/intel/sof-ipc4-tplg/sof-mtl-rt713-l0-rt1316-l12.tplg ~/sof-test/test-case/multiple-pause-resume.sh -l 1 -r 25
~/sof-test/test-case/check-kmod-load-unload.sh -l 2
Maybe the debugfs files that we now read cause a problem when debugfs is removed, or a debugfs file is still open.
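One way to check the open-file hypothesis before unloading the modules is to scan /proc for file descriptors that point into debugfs. The log above shows mtrace-reader.py being started in the background, so a leftover reader is at least plausible. This is a minimal sketch and not part of sof-test: list_open_under is a hypothetical helper name, and nothing in this thread establishes that an open debugfs file is the actual cause.

```shell
#!/bin/sh
# Hypothetical helper (not part of sof-test): print "PID COMMAND PATH" for
# every open file descriptor whose target lies under the given directory.
# Only descriptors readable by the calling user are seen, so run it as root
# to cover all processes.
list_open_under() {
    dir=${1%/}
    for fd in /proc/[0-9]*/fd/*; do
        # Descriptors can disappear while we scan; skip unreadable ones.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            "$dir"/*)
                pid=${fd#/proc/}
                pid=${pid%%/*}
                comm=$(cat "/proc/$pid/comm" 2>/dev/null) || comm='?'
                printf '%s %s %s\n' "$pid" "$comm" "$target"
                ;;
        esac
    done
}
```

For example, running list_open_under /sys/kernel/debug/sof as root right before kmod/sof-kmod-remove.sh would show whether any process still holds a file there. Empty output would argue against the open-file theory.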
Looks like no one is using run-all-tests.sh?
Just saw this as well on LNL
starting test_kmod-load-unload
+ test_kmod-load-unload
+ /home/ubuntu/sof-test/test-case/check-kmod-load-unload.sh -l 1
2024-06-19 08:24:57 UTC Sub-Test: [INFO] ===== Starting iteration 1 of 1 =====
2024-06-19 08:24:57 UTC Sub-Test: [INFO] wait dsp power status to become suspended
2024-06-19 08:24:57 UTC Sub-Test: [INFO] run kmod/sof-kmod-remove.sh
Specified filename /sys/kernel/debug/sof/trace does not exist.
RMMOD snd_usb_audio
SKIP snd_hda_intel not loaded
SKIP snd_sof_pci_intel_tng not loaded
SKIP snd_sof_pci_intel_skl not loaded
SKIP snd_sof_pci_intel_apl not loaded
SKIP snd_sof_pci_intel_tgl not loaded
SKIP snd_sof_pci_intel_icl not loaded
SKIP snd_sof_pci_intel_cnl not loaded
RMMOD snd_sof_pci_intel_lnl
[ 277.387525] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx : 0x47000000|0x0: MOD_SET_DX [data size: 8]
[ 277.539501] snd_soc_rt711_sdca:rt711_sdca_calibration: rt711-sdca sdw:0:0:025d:0711:01: rt711_sdca_calibration calibration complete, ret=0
[ 277.541983] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx reply: 0x67000000|0x0: MOD_SET_DX
[ 277.542024] snd_sof:sof_ipc4_log_header: sof-audio-pci-intel-lnl 0000:00:1f.3: ipc tx done : 0x47000000|0x0: MOD_SET_DX [data size: 8]
[ 277.542037] Message payload: 00000000: 00000001 00000000
[ 277.547478] snd_soc_rt711_sdca:rt711_sdca_jack_init: rt711-sdca sdw:0:0:025d:0711:01: in rt711_sdca_jack_init enable
[ 277.547489] snd_soc_rt711_sdca:rt711_sdca_io_init: rt711-sdca sdw:0:0:025d:0711:01: rt711_sdca_io_init hw_init complete
[ 277.547493] soundwire_bus:sdw_handle_slave_status: rt711-sdca sdw:0:0:025d:0711:01: signaling initialization completion for Slave 6
[ 277.552306] soundwire_cadence:cdns_update_slave_status_work: soundwire_intel soundwire_intel.link.0: Slave status change: 0x4000000
[ 277.565812] snd_soc_rt711_sdca:rt711_sdca_interrupt_callback: rt711-sdca sdw:0:0:025d:0711:01: rt711_sdca_interrupt_callback control_port_stat=4, sdca_cascade=1
[ 277.568162] soundwire_cadence:cdns_update_slave_status_work: soundwire_intel soundwire_intel.link.0: Slave status change: 0x2000000
[ 277.669458] snd_soc_rt711_sdca:rt711_sdca_interrupt_callback: rt711-sdca sdw:0:0:025d:0711:01: rt711_sdca_interrupt_callback control_port_stat=4, sdca_cascade=1
[ 280.639869] soundwire_bus:sdw_bus_wait_for_clk_prep_deprep: soundwire sdw-master-0-3: clock stop prepare done slave:15
[ 280.639910] soundwire_bus:sdw_bus_wait_for_clk_prep_deprep: soundwire sdw-master-0-1: clock stop prepare done slave:15
[ 280.640037] soundwire_bus:sdw_bus_wait_for_clk_prep_deprep: soundwire sdw-master-0-2: clock stop prepare done slave:15
[ 281.306958] soundwire_bus:sdw_bus_wait_for_clk_prep_deprep: soundwire sdw-master-0-0: clock stop prepare done slave:15
The pause-resume test had many implementation issues; I just rewrote it in https://github.com/thesofproject/sof-test/pull/1218. Can you try again?
It's not surprising that kmod-load-unload could hang if the pause-resume test left something bad behind. Examples:
CI is generally more robust than run-all-tests.sh, so it's less likely to leave something bad behind. Not impossible, but less likely in CI.
@plbossart can we close after https://github.com/thesofproject/sof-test/pull/1218 ?
I haven't seen this problem in a very very long time, closing
I've seen repeated issues with kmod-load-unload not completing on MTL Dell SKU 0CC7.
Updating the firmware didn't help. I can still log in remotely, but the device has to be rebooted before running audio tests.
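When the unload test never completes, it can help to turn the hang into an explicit failure instead of a stalled run. The sketch below wraps a command in coreutils timeout and reports when the deadline is hit. run_with_deadline is a hypothetical name, not an existing sof-test function, and the 120-second value in the usage note is an arbitrary assumption. A process blocked in uninterruptible kernel sleep may survive the signal, so this only detects the hang; it does not recover the device.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of sof-test): run a command with a deadline
# and report a hang. coreutils `timeout` exits with status 124 when the
# command was still running at the deadline.
run_with_deadline() {
    secs=$1
    shift
    rc=0
    timeout "$secs" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "HANG: '$*' did not finish within ${secs}s" >&2
    fi
    return "$rc"
}
```

A possible invocation, assuming the script path used earlier in this thread: run_with_deadline 120 ~/sof-test/test-case/check-kmod-load-unload.sh -l 1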