metan-ucw / runltp-ng

Minimalistic LTP testrunner

Stuck for a very long time before moving on #18

Open shunghsiyu opened 4 years ago

shunghsiyu commented 4 years ago

Sometimes I observe VMs getting stuck (judging by the last-modified time of log files such as syzkaller1.raw), but after a very long time they eventually move forward (so this is not the same as #16). The command I give to runltp-ng looks something like this:

runltp-ng --run=syzkaller1 --logname=syzkaller1 \
  --backend=qemu:ram=4G:smp=2:image=$IMG

After adding the --verbose option and a few trials, I eventually captured a log at the point where it gets stuck.

qemu: localhost login: root
Writing string '*******'
Writing string 'export PS1=$ ' 
Writing string '! [ -e /opt/ltp ]; echo cmd-exit-0-$?'
Waiting for regexp '(?^:cmd-exit-0-\d+)'
qemu: Password:
qemu: Last login: Wed Jun 17 07:45:16 on hvc0
qemu: localhost:~ # export PS1=$
qemu: $! [ -e /opt/ltp ]; echo cmd-exit-0-$?
qemu: cmd-exit-0-1
Cmd exit value 1
Writing string 'uname; echo cmd-exit-1-$?'
Waiting for regexp '(?^:cmd-exit-1-\d+)'
qemu: $uname; echo cmd-exit-1-$?
qemu: Linux
qemu: cmd-exit-1-0
Cmd exit value 0
Writing string 'for i in m p r; do printf uname-$i; uname -$i; done; echo cmd-e'
Waiting for regexp '(?^:cmd-exit-2-\d+)'
qemu: for i in m p r; do printf uname-$i; uname -$i; done; echo cmd-exit-2-$? <--- stuck here

I think the main issue is that runltp-ng entered the command too quickly(?), before the bash prompt could react. We do have a retry mechanism in utils::run_cmds_retry, but the default timeout of 3600 means it takes quite a while before the command is retried.
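
To illustrate the idea of retrying sooner, here is a minimal sketch in Python (not runltp-ng's Perl code, and not how utils::run_cmds_retry is actually implemented). The names run_cmd_with_short_retry, send, wait_for, attempt_timeout and max_attempts are all hypothetical; send and wait_for stand in for whatever writes to and reads from the console.

```python
def run_cmd_with_short_retry(send, wait_for, cmd, marker,
                             attempt_timeout=10.0, max_attempts=3):
    """Send cmd and wait for marker; resend if one attempt times out.

    send(cmd) and wait_for(marker, timeout) -> bool are hypothetical
    callbacks standing in for the console I/O layer.
    """
    for attempt in range(1, max_attempts + 1):
        send(cmd)
        # Wait only a short time per attempt instead of one long timeout.
        if wait_for(marker, attempt_timeout):
            return attempt  # number of attempts it took
    raise TimeoutError(f"no {marker!r} after {max_attempts} attempts")
```

Blindly resending is only safe for commands that can run twice without harm (such as the uname probes in the log above), because a slow first attempt may still execute after the resend.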

richiejp commented 4 years ago

This is a common problem in os-autoinst/openqa as well. I have found that 'expect'/'autoexpect' do not have this issue if you limit the input speed. I'm wondering whether we should use Expect for handling I/O with the SUT, as this is a solved problem already. I will create another issue for that.
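
As a rough illustration of what "limiting the input speed" means, here is a minimal Python sketch that writes a command one character at a time with a small pause in between. It is not Expect and not runltp-ng code; write_throttled, write, delay and sleep are hypothetical names, and the 10 ms default is an arbitrary assumption.

```python
import time

def write_throttled(write, text, delay=0.01, sleep=time.sleep):
    """Write text one character at a time, pausing after each one.

    write(ch) is a hypothetical callback for the console, e.g.
    lambda ch: os.write(fd, ch.encode()) for a file descriptor.
    """
    for ch in text:
        write(ch)
        sleep(delay)  # give the receiving terminal time to keep up
    return len(text)  # number of characters written
```

For example, write_throttled(console_write, "uname; echo cmd-exit-1-$?\n") would feed the command to a hypothetical console_write callback at roughly 100 characters per second.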