[BUG] multiple-pause-resume misses failure due to random wait times

ranj063 commented 3 years ago

Describe the bug pausing/releasing pipelines with random wait times results in missing the failure with the multiple-pause-resume test.

To Reproduce

diff --git a/test-case/multiple-pause-resume.sh b/test-case/multiple-pause-resume.sh
index f37bfaa..1ed1f82 100755
--- a/test-case/multiple-pause-resume.sh
+++ b/test-case/multiple-pause-resume.sh
@@ -119,14 +119,14 @@ spawn $cmd -D $dev -r $rate -c $channel -f $fmt -vv -i $file -q
 set i 1
 expect {
     "*#*+*\%" {
-        set sleep_t [expr int([expr rand() * $rnd_range]) + $rnd_min ]
+        set sleep_t 200
         puts "\r(\$i/$repeat_count) pcm'$pcm' cmd'$cmd' id'$idx': Wait for \$sleep_t ms before pause"
         send " "
         after \$sleep_t
         exp_continue
     }
     "*PAUSE*" {
-        set sleep_t [expr int([expr rand() * $rnd_range]) + $rnd_min ]
+        set sleep_t 200
         puts "\r(\$i/$repeat_count) pcm'$pcm' cmd'$cmd' id'$idx': Wait for \$sleep_t ms before resume"
         send " "
         after \$sleep_t

Expected behavior Pausing/release for the second DMIC pipeline after the first pipeline is stopped should work normally

Detail Info 1) Branch name and commit hash of the 3 repositories: sof (firmware/topology), linux (kernel driver) and sof-test (test case)

Kernel: d07af1319
SOF: d6e29ced24307
SOF-TEST: 18106e89f78 3) Test DUT Model (or a brief discribtion about the device)
MODEL: All HDA platforms with sof-hda-generic-4ch/2ch.tplg 4) Test TPLG
TPLG: sof-hda-generic-4ch/2ch.tplg 5) Test case (what test script and how you run it)
TESTCASE: ./multiple-pause-resume.sh -l 1

marc-hb commented 3 years ago

BTW both bash and expect have wait $pid commands to check the exit status of aplay / arecord commands. We should use wait every time some command is in the background, example in bash:

timeout -k 5 12 aplay -d 10 -bla -blah & aplayPID=$!
# do other things
wait $aplayPID || die "aplay ... failed" # or just set -e, see #312

It's "green failure crime" to ignore the exit status of audio commands in an audio test suite!

See help wait, man timeout and sof-test/test-case/check-volume-levels.sh for an example.

See also #312

ranj063 commented 3 years ago

https://github.com/thesofproject/sof/issues/4304 should have been caught earlier with this test

plbossart commented 3 years ago

expressions such as "set sleep_t [expr int([expr rand() * $rnd_range]) + $rnd_min ]" should be used to fuzz the sleep duration.

For CI we do want the sleep times to be a predictable pattern with a known/controlled seed that's reset to the same value every time the script is run.

Different requirements -> different solutions.

marc-hb commented 3 years ago

Control uses of rand() in test scripts #301

plbossart commented 3 years ago

Can't believe it's been nearly a year since we discussed this random seed thing...

ranj063 commented 3 years ago

@plbossart on the one hand, using a known seed will be good because the wait times will be pre-determined. But on the other hand, if this known seed causes the failure to be missed, we will always miss this. The problem I pointed out is because of the difference in behaviour when one DMIC PCM is stopped while the other is paused/unpaused. I feel like there's no point is randomizing the wait time at all.

marc-hb commented 3 years ago

Can't believe it's been nearly a year since we discussed this random seed thing...

The explanation is here:

cd sof-test
git log --pretty=tformat:'%h %as %an'

marc-hb commented 3 years ago

BTW both bash and expect have wait $pid commands to check the exit status of aplay / arecord commands. We should use wait every time some command is in the background,

Just filed Collect exit status of (aplay) commands started in the background #702

marc-hb commented 7 months ago

Should resurrect:

885

thesofproject / sof-test

[BUG] multiple-pause-resume misses failure due to random wait times #694

885