metan-ucw / runltp-ng

Minimalistic LTP testrunner

runltp-ng: Recover on timeout #2

Closed cfconrad closed 5 years ago

cfconrad commented 5 years ago

Improve timeout handling. The idea is to retry the sequence of commands if one of them fails with a timeout. The reboot function of the backend is used to recover the SUT.

This is one approach to how we could handle the timeout. If it goes in the right direction, I will change all the other shell command calls.
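For readers following the thread, here is a minimal sketch of the retry-on-timeout idea described above. It is written in Python for illustration only and is not the code from this pull request. The names `CommandTimeout`, `FakeBackend` and the `retries` parameter are assumptions made up for the example; only the idea of rebooting the SUT through the backend and rerunning the command sequence comes from the description.

```python
class CommandTimeout(Exception):
    """Hypothetical error raised when a command exceeds its timeout."""


class FakeBackend:
    """Stand-in backend (an assumption): 'make' times out a given number of times."""

    def __init__(self, timeouts_before_success):
        self.remaining = timeouts_before_success
        self.reboots = 0
        self.log = []

    def run_cmd(self, cmd):
        self.log.append(cmd)
        if cmd == "make" and self.remaining > 0:
            self.remaining -= 1
            raise CommandTimeout(cmd)
        return 0  # exit code

    def reboot(self):
        # A real backend would power-cycle or reset the SUT here.
        self.reboots += 1


def run_cmd_retry(backend, cmds, retries=1):
    """Run cmds in order; on a timeout, reboot the SUT and rerun the whole sequence."""
    for attempt in range(retries + 1):
        try:
            return [backend.run_cmd(cmd) for cmd in cmds]
        except CommandTimeout:
            if attempt == retries:
                raise  # out of retries, let the caller decide
            backend.reboot()


backend = FakeBackend(timeouts_before_success=1)
result = run_cmd_retry(backend, ["git clone", "make"])
```

In this sketch the whole sequence is restarted after the reboot, since a rebooted SUT may have lost state created by earlier commands.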

metan-ucw commented 5 years ago

There are some minor issues, but other than that it looks good to me.

cfconrad commented 5 years ago

@metan-ucw I have now changed, hopefully, all the necessary run_cmd() calls to the utils::run_cmd_retry() method. The run_cmd() calls related to test triggering and the taint check are untouched. From my perspective, those will need special treatment, such as a brok result count.

metan-ucw commented 5 years ago

Btw, it would probably make sense to add a parameter to retry some of the commands passed to tst_run_cmds(). Things that connect to the network, such as wget, git, apt-get and zypper, make sense to retry if they fail because of a broken connection. For that, the command array would have to become an array of tuples, where the second element enables retry for the particular command.

cfconrad commented 5 years ago

Hm, I'm not sure I get the idea. What I understood is that you would like to pass an array of tuples to backend:run_cmds(), and if a command is marked with retry, it will be rerun until the exit code is 0 or a limit is reached. I like this idea. What I didn't get is where tst_run_cmds() comes into play.
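The per-command retry proposal discussed above could look roughly like the following sketch. It is Python for illustration only, not the project's implementation. The `run_cmds` function here is a hypothetical stand-in for the backend method, and `max_attempts`, `make_flaky_runner` and the command strings are assumptions invented for the example.

```python
def run_cmds(run_cmd, cmds, max_attempts=3):
    """Run (command, retry) tuples in order.

    Commands flagged with retry are rerun until they exit with 0 or
    max_attempts is reached; all others get a single attempt.
    Returns None on success, or (command, exit_code) for the first failure.
    """
    for cmd, retry in cmds:
        attempts = max_attempts if retry else 1
        for _ in range(attempts):
            ret = run_cmd(cmd)
            if ret == 0:
                break
        else:
            # Loop finished without a successful run.
            return cmd, ret
    return None


def make_flaky_runner(failures):
    """Hypothetical runner: each command fails failures[cmd] times, then succeeds."""
    calls = []

    def run_cmd(cmd):
        calls.append(cmd)
        if failures.get(cmd, 0) > 0:
            failures[cmd] -= 1
            return 1
        return 0

    return run_cmd, calls


# Network-bound commands are marked for retry, local ones are not.
cmds = [("wget", True), ("tar xf", False)]

runner, calls = make_flaky_runner({"wget": 2})
outcome = run_cmds(runner, cmds)
```

Here `wget` fails twice and succeeds on the third attempt, so the sequence continues to `tar xf`. A failing command without the retry flag stops the sequence after one attempt.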

metan-ucw commented 5 years ago

The tst_ prefix is just a typo; of course I meant backend:run_cmds().

cfconrad commented 5 years ago

ok, got it https://github.com/metan-ucw/runltp-ng/issues/4