tarantool / tarantool-qa

QA related issues of Tarantool

test: flaky fail app/fio.test.lua on temporary path creation #124

Open avtikhon opened 3 years ago

avtikhon commented 3 years ago

Tarantool 2.9.0-57-gea0b126ff

Issue found on the same runner host, but in runs at different times:

  1. https://github.com/tarantool/tarantool/runs/2730001831 Runner name: 'ghacts-2-8-90-3-1' failed 17 hours ago in 15m 23s

  2. https://github.com/tarantool/tarantool/runs/2731235651 Runner name: 'ghacts-2-8-90-3-1' failed 15 hours ago in 23m 38s

as:

[013] app/fio.test.lua                                                [ fail ]
[013] 
[013] Test failed! Result content mismatch:
[013] --- app/fio.result    Wed Jun  2 18:31:11 2021
[013] +++ /tmp/tnt/rejects/app/fio.reject   Wed Jun  2 20:07:24 2021
[013] @@ -1478,7 +1478,8 @@
[013]  ...
[013]  fio.mkdir(tmpdir)
[013]  ---
[013] -- true
[013] +- false
[013] +- 'fio: File exists'
[013]  ...
[013]  os.setenv('TMPDIR', tmpdir)
[013]  ---
[013] 

Reproducer:

cd test
cat >a.rep <<EOF
- [app/fio.test.lua, null]
- [app/fio.test.lua, null]
EOF
./test-run.py --reproduce a.rep

The root cause turned out to be a sub-directory in the 'vardir' path that has no permissions for anyone:

ls -la /tmp/tnt/020_replication/
total 32
drwxrwxr-x   3 ubuntu ubuntu  4096 Jun  3 09:12 .
drwxrwxr-x 452 ubuntu ubuntu 20480 Jun  3 11:44 ..
-rw-rw-r--   1 ubuntu ubuntu  2375 Jun  3 01:23 fast_replica.lua
d---------   5 ubuntu ubuntu  4096 Apr 13 10:50 master

That is why test-run could not remove it when testing started. An issue to avoid this already exists: tarantool/test-run#206
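
A minimal sketch of a cleanup that copes with such a directory, assuming the leftover tree belongs to the current user. The helper name clean_vardir is hypothetical and this is not what test-run does; it only illustrates that owner permissions have to be restored before a recursive removal can succeed.

```shell
#!/bin/sh
# Hypothetical helper, not part of test-run: remove a vardir sub-tree even
# if some directories inside it have lost all permission bits.
clean_vardir() {
    dir="$1"
    # Nothing to do if the directory is already gone.
    [ -d "$dir" ] || return 0
    # u+rwX gives the owner read/write on everything and the search bit (x)
    # on directories, so that rm can descend into them afterwards.
    chmod -R u+rwX "$dir" && rm -rf "$dir"
}

# Example (path taken from the listing above):
# clean_vardir /tmp/tnt/020_replication
```

Without the chmod step, a plain rm -rf run by a non-root user fails on a directory such as 'master' above, and the stale content survives into the next run.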

Totktonada commented 3 years ago

NB: fio.tempdir() should be free of races of this kind.
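
For illustration only, here is a shell analogue of the difference (not the fio API itself). Creating a directory at a fixed path fails if something is already there, while mktemp -d picks a fresh unique name each time. The variable names and the fio_demo_fixed path are made up for this sketch.

```shell
#!/bin/sh
# Fixed path: the second creation fails because the directory already
# exists, the same kind of failure as 'fio: File exists' in the log above.
fixed="${TMPDIR:-/tmp}/fio_demo_fixed.$$"
mkdir "$fixed"
mkdir "$fixed" 2>/dev/null
second_rc=$?        # non-zero: the path was already taken

# Unique path: mktemp -d creates a new directory with a unique name on
# every call, so a stale leftover cannot collide with it.
unique1=$(mktemp -d)
unique2=$(mktemp -d)

# Clean up everything created by this sketch.
rmdir "$fixed" "$unique1" "$unique2"
```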

Totktonada commented 3 years ago

Found that the root cause of the issue was sub-directory in 'vardir' path w/o any permissions to anyone

<...>
d---------   5 ubuntu ubuntu  4096 Apr 13 10:50 master

Could this be related to https://github.com/tarantool/tarantool/issues/1211?

Buristan commented 11 months ago

This issue strikes again: https://github.com/tarantool/tarantool/actions/runs/6940678006/job/18880062327?pr=9386#step:6:11613

Also, I can't reproduce the issue locally with the given script:

test/test-run.py --reproduce a.rep
Started test/test-run.py --reproduce a.rep
Running in parallel with 8 workers

Timeout options:
-------------------
SERVER_START_TIMEOUT:     90
REPLICATION_SYNC_TIMEOUT: 100
TEST_TIMEOUT:             110
NO_OUTPUT_TIMEOUT:        120

Cannot read "a.rep" passed as --reproduce argument
cat a.rep
- [app/fio.test.lua, null]
- [app/fio.test.lua, null]