tarantool / test-run

Tarantool functional testing framework
14 stars 17 forks source link

Sometimes results of some tests disappear if an error occured #115

Closed ImeevMA closed 5 years ago

ImeevMA commented 6 years ago

Error:

[001] box/errinj.test.lua                                             
[001] Worker "001_box" received the following error; stopping...
[001] Traceback (most recent call last):
[001]   File "/home/mimeev/Work/radiant/test-run/lib/worker.py", line 266, in run_task
[001]     task, self.server, self.inspector)
[001]   File "/home/mimeev/Work/radiant/test-run/lib/test_suite.py", line 193, in run_test
[001]     short_status = test.run(server)
[001]   File "/home/mimeev/Work/radiant/test-run/lib/test.py", line 166, in run
[001]     self.execute(server)
[001]   File "/home/mimeev/Work/radiant/test-run/lib/tarantool_server.py", line 147, in execute
[001]     crash_occured = save_join(lua, timeout=self.TIMEOUT)
[001]   File "/home/mimeev/Work/radiant/test-run/lib/tarantool_server.py", line 50, in save_join
[001]     green_obj.join(timeout=timeout)
[001]   File "/home/mimeev/.local/lib/python2.7/site-packages/gevent/greenlet.py", line 483, in join
[001]     result = self.parent.switch()
[001]   File "/home/mimeev/.local/lib/python2.7/site-packages/gevent/hub.py", line 538, in switch
[001]     return greenlet.switch(self)
[001] TarantoolStartError
[001] 
[001] 
[001] 
[001] [Instance "proxy" returns with non-zero exit code: 1]
[001] 
[001] Last 15 lines of Tarantool Log file [Instance "proxy"][/home/mimeev/Work/radiant/test/var/001_box/proxy.log]:
[001] 2018-08-17 11:15:09.911 [27927] main/106/console/127.0.0.1:38426 I> started
[001] 2018-08-17 11:15:09.911 [27927] main C> entering the event loop
[001] 2018-08-17 11:15:10.252 [27927] main C> got signal 15 - Terminated
[001] Starting instance proxy...
[001] Run console at unix/:/home/mimeev/Work/radiant/test/var/001_box/proxy.control
[001] tcp_server: remove dead UNIX socket: /home/mimeev/Work/radiant/test/var/001_box/proxy.control
[001] started
[001] 2018-08-17 11:15:11.619 [28413] main/101/proxy C> Tarantool 1.9.1-70-gc1a16b2
[001] 2018-08-17 11:15:11.619 [28413] main/101/proxy C> log level 5
[001] 2018-08-17 11:15:11.619 [28413] main/101/proxy I> mapping 117440512 bytes for memtx tuple arena...
[001] 2018-08-17 11:15:11.619 [28413] main/101/proxy I> mapping 134217728 bytes for vinyl tuple arena...
[001] 2018-08-17 11:15:11.625 [28413] iproto/101/main evio.cc:351 E> binary: failed to bind on 127.0.0.1:38194: bind, called on fd 29, aka 0.0.0.0:0
[001] 2018-08-17 11:15:11.625 [28413] main/101/proxy evio.cc:356 !> SystemError binary: failed to bind, called on fd -1: Address already in use
[001] 2018-08-17 11:15:11.625 [28413] main/101/proxy F> can't initialize storage: binary: failed to bind, called on fd -1
[001] 2018-08-17 11:15:11.625 [28413] main/101/proxy F> can't initialize storage: binary: failed to bind, called on fd -1

How to reproduce:

TEST_RUN_TESTS=$(for i in $(seq 1 100); do echo -n 'box/errinj. '; done) make test

Possible result (should pass 100):

---------------------------------------------------------------------------------
Statistics:
* pass: 96
Built target test

OS:

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:  Ubuntu 16.04.5 LTS
Release:  16.04
Codename: xenial
Totktonada commented 6 years ago

The 'Address already in use' error is likely due to the way test-run assign free ports to tarantool servers. It checks for a free port, then pass it to the tarantool via an environment variable and there is the lag between the check and actual bind. It becomes rarely and have no (known for me) ways to mitigate, because we should save the port in the tarantool_server class to connect to the server to pass lines of a test. Maybe we can read a port tarantool binds from logs and set python's variable then.

Anyway, it is not the problem of this issue. The problem is that test-run does not count the test in this case. It should be counted as failed ones.

Totktonada commented 6 years ago

Raw idea: choose ports via the coordinator: the main process.

Totktonada commented 5 years ago

Now we have use_unix_sockets_iproto option. Setting both use_unix_sockets and use_unix_sockets_iproto likely will help with 'address already in use' issue. But this issue is about handling of TarantoolStartError (for default and non-default servers).

Totktonada commented 5 years ago

Another case:

TEST_RUN_TESTS=$(for i in $(seq 1 500); do echo -n "tarantoolctl.test.lua "; done) make test
[008] Worker "008_app-tap" received the following error; stopping...
[008] Traceback (most recent call last):
[008]   File "/home/alex/projects/tarantool-meta/r/tarantool/test-run/lib/worker.py", line 291, in run_task
[008]     task, self.server, self.inspector)
[008]   File "/home/alex/projects/tarantool-meta/r/tarantool/test-run/lib/test_suite.py", line 207, in run_test
[008]     short_status = test.run(server)
[008]   File "/home/alex/projects/tarantool-meta/r/tarantool/test-run/lib/test.py", line 181, in run
[008]     self.execute(server)
[008]   File "/home/alex/projects/tarantool-meta/r/tarantool/test-run/lib/app_server.py", line 43, in execute
[008]     tarantool.join()
[008]   File "src/gevent/greenlet.py", line 661, in gevent._greenlet.Greenlet.join
[008]   File "src/gevent/greenlet.py", line 687, in gevent._greenlet.Greenlet.join
[008]   File "src/gevent/greenlet.py", line 676, in gevent._greenlet.Greenlet.join
[008]   File "src/gevent/_greenlet_primitives.py", line 59, in gevent.__greenlet_primitives.SwitchOutGreenletWithLoop.switch
[008]   File "src/gevent/_greenlet_primitives.py", line 63, in gevent.__greenlet_primitives.SwitchOutGreenletWithLoop.switch
[008]   File "src/gevent/__greenlet_primitives.pxd", line 35, in gevent.__greenlet_primitives._greenlet_switch
[008] TarantoolStartError
[008] 
[008] Exception: 
Statistics:
* pass: 499
Built target test
echo $?
0
cat test/var/008_app-tap/remote.log
<...>
2019-02-18 15:23:22.934 [27133] main/101/remote tarantoolctl:541 E> Start failed: builtin/box/console.lua:560: failed to create server localhost:44152: Address already in use
Totktonada commented 5 years ago

Moved out 'address already in use' part into #141.

Totktonada commented 5 years ago

The problem should be fixed by 7176cede7a884ffc44d3a0e0bc856c9121510ba6 for 'core = tarantool' test suites. I'll propose the fix for 'core = app' tests soon.

Totktonada commented 5 years ago

The case with tarantoolctl.test.lua is a kinda specific: we check whether a task was reported using its ID (a test name + a configuration). When the same test runs many times; test-run is not able to detect that the test was run less times then should. It should not affect regular testing, so this part of the issue is not important.