Open Gerold103 opened 4 years ago
Executed the box suite 100 times and all tests passed.
Tarantool 2.4.0-99-g7ec7ced60
Target: Linux-x86_64-Debug
Build options: cmake . -DCMAKE_INSTALL_PREFIX=/usr/local -DENABLE_BACKTRACE=ON
Compiler: /usr/bin/cc /usr/bin/c++
C_FLAGS: -fexceptions -funwind-tables -fno-omit-frame-pointer -fno-stack-protector -fno-common -fopenmp -msse2 -std=c11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Wno-format-truncation -fno-gnu89-inline -Wno-cast-function-type -Werror
CXX_FLAGS: -fexceptions -funwind-tables -fno-omit-frame-pointer -fno-stack-protector -fno-common -fopenmp -msse2 -std=c++11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Wno-format-truncation -Wno-invalid-offsetof -Wno-cast-function-type -Werror
Are there any missing details or other hints on how to reproduce it?
Finally reproduced it: box/net.box.test.lua hung five times out of 100 (runs 48, 77, 82, 89, 90).
Tarantool 2.4.0-99-g7ec7ced60
Target: Linux-x86_64-Debug
Build options: cmake . -DCMAKE_INSTALL_PREFIX=/usr/local -DENABLE_BACKTRACE=ON
Compiler: /usr/bin/cc /usr/bin/c++
C_FLAGS: -fexceptions -funwind-tables -fno-omit-frame-pointer -fno-stack-protector -fno-common -fopenmp -msse2 -std=c11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Wno-format-truncation -fno-gnu89-inline -Wno-cast-function-type -Werror
CXX_FLAGS: -fexceptions -funwind-tables -fno-omit-frame-pointer -fno-stack-protector -fno-common -fopenmp -msse2 -std=c++11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Wno-format-truncation -Wno-invalid-offsetof -Wno-cast-function-type -Werror
Command line:
for i in `seq 1 1 100`; do echo "$i XXXXXXXXXXXXXXXXX"; ../../test/test-run.py --builddir=/home/s.bronnikov/tarantool/build --vardir=/home/s.bronnikov/tarantool/build/test/var --suite box; done 2>&1 | tee ../../box-suite-100times.log
Hardware: mcs1, CentOS Linux release 8.0.1905 (Core)
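The failing run numbers can be pulled out of the tee'd log instead of being collected by hand. The following is a minimal sketch, not part of the original report: `failed_runs` is a hypothetical helper that relies on the "<i> XXXXXXXXXXXXXXXXX" marker lines printed by the loop above and on the "[ fail ]" status that test-run prints for a failed test. Whether a hang is reported with the same status is an assumption; adjust the pattern if it is not.

```shell
# Hypothetical helper: print each iteration number whose output
# contains at least one "[ fail ]" line.
# Assumes the "<i> XXXXXXXXXXXXXXXXX" markers echoed by the loop above.
failed_runs() {
    awk '
        # Marker line: remember the current iteration number.
        $2 == "XXXXXXXXXXXXXXXXX" && $1 ~ /^[0-9]+$/ { run = $1; next }
        # First failed test seen in this iteration: report the iteration once.
        /\[ fail \]/ && run != "" && !(run in seen) { seen[run] = 1; print run }
    ' "$1"
}
```

Usage: `failed_runs ../../box-suite-100times.log`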
I propose to wait for a patch from @avtikhon that splits net.box.test.lua into a set of independent tests, and then investigate the bug.
I guess Vlad tested it on Mac OS. I tried it with 2.5.0-173-gd1b3fbe9c (on tt-mac):
$ sw_vers
ProductName: Mac OS X
ProductVersion: 10.12.6
BuildVersion: 16G29
$ cmake . -DCMAKE_BUILD_TYPE=Debug -DENABLE_BACKTRACE=ON -DENABLE_DIST=ON -DENABLE_BUNDLED_LIBCURL=OFF -DOPENSSL_ROOT_DIR=/usr/local/Cellar/openssl/1.0.2q/ && make -j
$ . ~/env-2.7/bin/activate # environment with test-run/requirements.txt libraries installed
$ git diff # remove the fragile list so that it does not confuse --reproduce
diff --git a/test/box/suite.ini b/test/box/suite.ini
index de8f5a70e..98c5c7cb0 100644
--- a/test/box/suite.ini
+++ b/test/box/suite.ini
@@ -9,16 +9,3 @@ lua_libs = lua/fifo.lua lua/utils.lua lua/bitset.lua lua/index_random_test.lua l
use_unix_sockets = True
use_unix_sockets_iproto = True
is_parallel = True
pretest_clean = True
-fragile = bitset.test.lua ; tarantool/tarantool-qa#235
- func_reload.test.lua ; tarantool/tarantool-qa#15
- function1.test.lua ; tarantool/tarantool#4199
- net.box.test.lua ; tarantool/tarantool#3851 tarantool/tarantool#4383
- alter_limits.test.lua ; tarantool/tarantool#4926
- misc.test.lua ; tarantool/tarantool-qa#223
- tuple.test.lua ; tarantool/tarantool-qa#219
- transaction.test.lua ; tarantool/tarantool-qa#217
- rtree_rect.test.lua ; tarantool/tarantool-qa#214
- sequence.test.lua ; tarantool/tarantool-qa#213
- on_replace.test.lua ; tarantool/tarantool-qa#212
- role.test.lua ; tarantool/tarantool-qa#211
$ cat r.yml
- [box/space_bsize.test.lua, null]
- [box/sql.test.lua, null]
- [box/rtree_errinj.test.lua, null]
- [box/rtree_array.test.lua, null]
- [box/update.test.lua, null]
- [box/cfg.test.lua, null]
- [box/net_msg_max.test.lua, null]
- [box/access_misc.test.lua, null]
- [box/access_escalation.test.lua, null]
- [box/iproto_stress.test.lua, null]
- [box/role.test.lua, null]
- [box/blackhole.test.lua, null]
- [box/misc.test.lua, null]
- [box/tree_pk.test.lua, null]
- [box/transaction.test.lua, null]
$ (cd test && ./test-run.py --reproduce ../r.yml)
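For trying other test orders, a list in the same shape as r.yml can be generated rather than typed. This is a minimal sketch: `make_reproduce_list` is a hypothetical helper, and it only copies the entry format visible in r.yml above (test path plus `null` as the second element); it does not know what other values that second element may take.

```shell
# Hypothetical helper: print a --reproduce list in the r.yml shape shown above.
# The second element is always null, as in every entry of the original file.
make_reproduce_list() {
    for t in "$@"; do
        printf -- '- [%s, null]\n' "$t"
    done
}
```

Usage: `make_reproduce_list box/iproto_stress.test.lua box/role.test.lua > r.yml`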
Caught the following failure:
[001] box/iproto_stress.test.lua [ fail ]
[001]
[001] Test failed! Result content mismatch:
[001] --- box/iproto_stress.result Wed Apr 3 06:25:04 2019
[001] +++ box/iproto_stress.reject Thu Jul 2 23:30:20 2020
[001] @@ -74,11 +74,11 @@
[001] ...
[001] test_run:wait_cond(function() return n_workers == 0 end, 60)
[001] ---
[001] -- true
[001] +- false
[001] ...
[001] n_workers -- 0
[001] ---
[001] -- 0
[001] +- 100
[001] ...
[001] n_errors -- 0
[001] ---
[001]
[001] Last 15 lines of Tarantool Log file [Instance "box"][/Users/a.turenko/tarantool/test/var/001_box/box.log]:
[001] 2020-07-02 23:29:20.049 [74501] main/9053/lua utils.c:1005 E> LuajitError: [string "function worker(i) n_workers = n_workers ..."]:1: attempt to index field 'test' (a nil value)
[001] 2020-07-02 23:29:20.051 [74501] main/9055/lua utils.c:1005 E> LuajitError: [string "function worker(i) n_workers = n_workers ..."]:1: attempt to index field 'test' (a nil value)
[001] 2020-07-02 23:29:20.054 [74501] main/9039/lua utils.c:1005 E> LuajitError: [string "function worker(i) n_workers = n_workers ..."]:1: attempt to index field 'test' (a nil value)
[001] 2020-07-02 23:29:20.054 [74501] main/9041/lua utils.c:1005 E> LuajitError: [string "function worker(i) n_workers = n_workers ..."]:1: attempt to index field 'test' (a nil value)
[001] 2020-07-02 23:29:20.054 [74501] main/9049/lua utils.c:1005 E> LuajitError: [string "function worker(i) n_workers = n_workers ..."]:1: attempt to index field 'test' (a nil value)
[001] 2020-07-02 23:29:20.056 [74501] main/9047/lua utils.c:1005 E> LuajitError: [string "function worker(i) n_workers = n_workers ..."]:1: attempt to index field 'test' (a nil value)
[001] 2020-07-02 23:29:20.057 [74501] main/9043/lua utils.c:1005 E> LuajitError: [string "function worker(i) n_workers = n_workers ..."]:1: attempt to index field 'test' (a nil value)
[001] 2020-07-02 23:29:20.057 [74501] main/9045/lua utils.c:1005 E> LuajitError: [string "function worker(i) n_workers = n_workers ..."]:1: attempt to index field 'test' (a nil value)
[001] 2020-07-02 23:29:20.205 [74501] main/8957/lua utils.c:1005 E> LuajitError: [string "function worker(i) n_workers = n_workers ..."]:1: attempt to index field 'test' (a nil value)
[001] 2020-07-02 23:29:20.206 [74501] main/8955/lua utils.c:1005 E> LuajitError: [string "function worker(i) n_workers = n_workers ..."]:1: attempt to index field 'test' (a nil value)
[001] 2020-07-02 23:29:20.206 [74501] main/8953/lua utils.c:1005 E> LuajitError: [string "function worker(i) n_workers = n_workers ..."]:1: attempt to index field 'test' (a nil value)
[001] 2020-07-02 23:29:20.207 [74501] main/8949/lua utils.c:1005 E> LuajitError: [string "function worker(i) n_workers = n_workers ..."]:1: attempt to index field 'test' (a nil value)
[001] 2020-07-02 23:29:20.209 [74501] main/8947/lua utils.c:1005 E> LuajitError: [string "function worker(i) n_workers = n_workers ..."]:1: attempt to index field 'test' (a nil value)
[001] 2020-07-02 23:29:20.209 [74501] main/8951/lua utils.c:1005 E> LuajitError: [string "function worker(i) n_workers = n_workers ..."]:1: attempt to index field 'test' (a nil value)
[001] 2020-07-02 23:30:20.098 [74501] main/159/console/unix/: I> set 'net_msg_max' configuration option to 768
[Main process] Got failed test; gently terminate all workers...
[001] Worker "001_box" got failed test; stopping the server...
However, when I removed the pretest_clean suite.ini option (by mistake), I got various miscompares on box/access_misc.test.lua and box/role.test.lua. I also once caught a segfault in mp_tuple_assert(), like in tarantool/tarantool-qa#235.
However, I once caught the following miscompare with pretest_clean enabled:
[001] Test failed! Result content mismatch:
[001] --- box/role.result Thu Aug 30 01:10:07 2018
[001] +++ box/role.reject Thu Jul 2 23:44:02 2020
[001] @@ -605,27 +605,35 @@
[001] ...
[001] box.schema.role.drop("role1")
[001] ---
[001] +- error: Unsupported role privilege 'execute'
[001] ...
[001] box.schema.role.drop("role2")
[001] ---
[001] +- error: Unsupported role privilege 'execute'
[001] ...
<...>
So it seems that running the box/ test suite on Mac OS is quite unstable now. Maybe all these failures are due to LuaJIT in GC64 mode, as in tarantool/tarantool-qa#235?
Tarantool version: Master, 2.2, maybe older.
Reproduce:
Sometimes it passes, sometimes box/iproto_stress hangs. Sometimes box/role fails.
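Because the failure is intermittent, one way to catch it is to rerun the same command until it first fails. This is a minimal sketch under stated assumptions: `run_until_fail` is a hypothetical helper, not part of test-run, and it treats any non-zero exit status of the command as a failure.

```shell
# Hypothetical helper: rerun a command up to N times and stop at the first failure.
# Prints the 1-based attempt number that failed and returns 1;
# returns 0 without printing anything if every attempt passes.
run_until_fail() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        if ! "$@"; then
            echo "$i"
            return 1
        fi
        i=$((i + 1))
    done
    return 0
}
```

Usage: `(cd test && run_until_fail 100 timeout 600 ./test-run.py --reproduce ../r.yml)`. Wrapping the command in GNU coreutils `timeout` turns a hang into a non-zero exit; the 600 second limit is an arbitrary choice, and `timeout` is not installed by default on Mac OS.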