Open KonsKo opened 1 month ago
One more similar case:
If an exception is caught, it is being hidden (all information is provided, but the situation is not considered as an error)
Example (there is an exception but test is considered as successfull):
test_load_distribution (t_sched.test_h2_ratio_weight.RatioH2.test_load_distribution) ... 24-09-27 21:15:35.321801 CRITICAL | Exception in stopping tf_cfg.py:245
process: ("Cmd
'/home/jenkins/workspace/Tests_o
n_bare_metal/tempesta/scripts/te
mpesta.sh --stop' got timed out
(timeout value is 30)",
b'Stopping
Tempesta...\nnet.tempesta.state
= stop\n...unload Tempesta
modules\nUn-loading Tempesta
kernel modules...\ndone\n',
b'Unit irqbalance.service could
not be found.\n', -9), type:
<class
'helpers.remote.CmdError'>,
traceback:
Traceback (most recent call
last):
File
"/home/jenkins/workspace/Tests_o
n_bare_metal/tempesta-test/helpe
rs/remote.py", line 106, in
run_cmd
stdout, stderr =
p.communicate(timeout=timeout)
^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^
File
"/usr/lib/python3.11/subprocess.
py", line 1209, in communicate
stdout, stderr =
self._communicate(input,
endtime, timeout)
^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/usr/lib/python3.11/subprocess.
py", line 2116, in _communicate
self._check_timeout(endtime,
orig_timeout, stdout, stderr)
File
"/usr/lib/python3.11/subprocess.
py", line 1253, in
_check_timeout
raise TimeoutExpired(
subprocess.TimeoutExpired:
Command
'/home/jenkins/workspace/Tests_o
n_bare_metal/tempesta/scripts/te
mpesta.sh --stop' timed out
after 30 seconds
During handling of the above
exception, another exception
occurred:
Traceback (most recent call
last):
File
"/home/jenkins/workspace/Tests_o
n_bare_metal/tempesta-test/helpe
rs/stateful.py", line 68, in
force_stop
stop_proc()
File
"/home/jenkins/workspace/Tests_o
n_bare_metal/tempesta-test/helpe
rs/control.py", line 355, in
stop_tempesta
self.node.run_cmd(cmd,
timeout=30,
err_msg=(self.err_msg % "stop"))
File
"/home/jenkins/workspace/Tests_o
n_bare_metal/tempesta-test/helpe
rs/remote.py", line 114, in
run_cmd
raise CmdError(
helpers.remote.CmdError: ("Cmd
'/home/jenkins/workspace/Tests_o
n_bare_metal/tempesta/scripts/te
mpesta.sh --stop' got timed out
(timeout value is 30)",
b'Stopping
Tempesta...\nnet.tempesta.state
= stop\n...unload Tempesta
modules\nUn-loading Tempesta
kernel modules...\ndone\n',
b'Unit irqbalance.service could
not be found.\n', -9)
ok. **<------ Success**
Fails on a teardown or a cleanup should be considered as fail.
One more example:
test_wait_for_connections_negative (selftests.test_lxc_server.TestLxcServer.test_wait_for_connections_negative) ... 24-09-30 23:20:32.577133 CRITICAL | Exception in stopping tf_cfg.py:245
process: ("Cmd 'lxc stop
tempesta-site-stage' got timed
out (timeout value is 10)", b'',
b'', -9), type: <class
'helpers.remote.CmdError'>,
traceback:
Traceback (most recent call
last):
File
"/home/jenkins/workspace/Tests_o
n_bare_metal/tempesta-test/helpe
rs/remote.py", line 106, in
run_cmd
stdout, stderr =
p.communicate(timeout=timeout)
^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^
File
"/usr/lib/python3.11/subprocess.
py", line 1209, in communicate
stdout, stderr =
self._communicate(input,
endtime, timeout)
^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/usr/lib/python3.11/subprocess.
py", line 2116, in _communicate
self._check_timeout(endtime,
orig_timeout, stdout, stderr)
File
"/usr/lib/python3.11/subprocess.
py", line 1253, in
_check_timeout
raise TimeoutExpired(
subprocess.TimeoutExpired:
Command 'lxc stop
tempesta-site-stage' timed out
after 10 seconds
During handling of the above
exception, another exception
occurred:
Traceback (most recent call
last):
File
"/home/jenkins/workspace/Tests_o
n_bare_metal/tempesta-test/helpe
rs/stateful.py", line 68, in
force_stop
stop_proc()
File
"/home/jenkins/workspace/Tests_o
n_bare_metal/tempesta-test/frame
work/lxc_server.py", line 153,
in stop_server
self.node.run_cmd(self._cons
truct_cmd(["stop",
self.container_name]))
File
"/home/jenkins/workspace/Tests_o
n_bare_metal/tempesta-test/helpe
rs/remote.py", line 114, in
run_cmd
raise CmdError(
helpers.remote.CmdError: ("Cmd
'lxc stop tempesta-site-stage'
got timed out (timeout value is
10)", b'', b'', -9)
ok **<------ Success**
Current implementation of the
framework.client.Client
has a hidden error(perhaps this is architecture feature). But it is not easy to debug with such realization.Let's look at a situation: we have a test that work with a client, for my case it is
ratecheck
(t_frang.test_connection_rate_burst.FrangTcpVsBoth.test_burst
).Now, I already knew, that the client did not exist on my machine (was not installed).
But when I run the test with low details first times I received something like:
The real problem is hidden.
I can set more detailed logs and I may see an error, but the error displayed in the middle of log output and I have to search for it, and it is just a message (it is colored now, but we added logs processing recently).
Real problem here: https://github.com/tempesta-tech/tempesta-test/blob/fc5e2e38a7d4be292f15b1b3c3b5fcd015dd7023/framework/client.py#L8
run_cmd
raises an exception if an error was caught, but here it become hidden.We need to explicit (re)raise an exception in such cases to provide an user correct information.
I have not updated the code right away, because the client works with
multiprocessing
and we need to think about the best way how to implement it, otherwise side effects may occur.