tarantool / queue

Create task queues, add and take jobs, monitor failed tasks
Other
234 stars 52 forks source link

Flaky/broken test vinyl-040-utubettl.t #194

Closed NickVolynkin closed 1 year ago

NickVolynkin commented 1 year ago

Commit https://github.com/tarantool/tarantool/commit/588170a77c9aba3d0cf898d0134286c6ffd46339 has changed behavior of vinyl. This might have broken the test's expectations. The test almost always fails in the integration job: https://github.com/tarantool/tarantool/actions/runs/3264906596/jobs/5369677748

12:   ---
12:   filename: /home/runner/work/tarantool/tarantool/t/040-utubettl.t
12:   trace:
12:   - line: 245
12:     source: '@builtin/tap.lua'
12:     filename: builtin/tap.lua
12:     what: Lua
12:     namewhat: method
12:     name: test
12:     src: builtin/tap.lua
12:   - line: 0
12:     source: '@/home/runner/work/tarantool/tarantool/t/040-utubettl.t'
12:     filename: /home/runner/work/tarantool/tarantool/t/040-utubettl.t
12:     what: main
12:     namewhat: 
12:     src: /home/runner/work/tarantool/tarantool/t/040-utubettl.t
12:   planned: 3
12:   failed: 1
12:   line: 0
12:   ...
...

Total Test time (real) =  71.78 sec

The following tests FAILED:
     12 - vinyl-040-utubettl.t (Failed)
locker commented 1 year ago

What was changed by commit https://github.com/tarantool/tarantool/commit/588170a77c9aba3d0cf898d0134286c6ffd46339: a read (get, select) outside a transaction now never returns unconfirmed data (committed, but not yet written to WAL). The behavior or a read inside a transaction depends on the transaction isolation level.

oleg-jukovec commented 1 year ago

a read (get, select) outside a transaction now never returns unconfirmed data (committed, but not yet written to WAL).

So is it expected that the update yield now:

https://github.com/tarantool/queue/blob/master/queue/abstract/driver/utubettl.lua#L138-L141

?

locker commented 1 year ago

The new Vinyl behavior is consistent with memtx mvcc. Enabling mvcc in the test breaks the test for memtx as well. (Although the probability of failure seems lower.)

``` vlad@esperanza:~/src/tarantool/queue$ git diff diff --git a/t/040-utubettl.t b/t/040-utubettl.t index 730d911c396a..0d052f930895 100755 --- a/t/040-utubettl.t +++ b/t/040-utubettl.t @@ -12,7 +12,7 @@ local queue_state = require('queue.abstract.queue_state') local qc = require('queue.compat') local tnt = require('t.tnt') -tnt.cfg{} +tnt.cfg{memtx_use_mvcc_engine = true} local engine = os.getenv('ENGINE') or 'memtx' vlad@esperanza:~/src/tarantool/queue$ ctest -V -R memtx-040-utubettl UpdateCTestConfiguration from :/home/vlad/src/tarantool/queue/DartConfiguration.tcl UpdateCTestConfiguration from :/home/vlad/src/tarantool/queue/DartConfiguration.tcl Test project /home/vlad/src/tarantool/queue Constructing a list of tests Done constructing a list of tests Updating test list for fixtures Added 0 tests to meet fixture requirements Checking test dependency graph... Checking test dependency graph end test 11 Start 11: memtx-040-utubettl.t 11: Test command: /home/vlad/bin/tarantool "/home/vlad/src/tarantool/queue/t/040-utubettl.t" 11: Environment variables: 11: ENGINE=memtx 11: LUA_PATH=/home/vlad/src/tarantool/queue/?.lua;/home/vlad/src/tarantool/queue/?/init.lua; 11: Test timeout computed to be: 10000000 11: TAP version 13 11: 1..17 11: ok - box started 11: ok - queue is loaded 11: ok - test tube created 11: ok - tube.name 11: ok - tube.type 11: # Utubettl statistics 11: 1..13 11: ok - tasks.taken 11: ok - tasks.buried 11: ok - tasks.ready 11: ok - tasks.done 11: ok - tasks.delayed 11: ok - tasks.total 11: ok - calls.delete 11: ok - calls.ack 11: ok - calls.take 11: ok - calls.kick 11: ok - calls.bury 11: ok - calls.put 11: ok - calls.release 11: # Utubettl statistics: end 11: ok - Utubettl statistics 11: # Easy put/take/ack 11: 1..12 11: ok - task was put 11: ok - task was put 11: ok - task was taken 11: ok - task status 11: ok - task.data 11: ok - second task was not taken (the same tube) 11: ok - task was acked 11: ok - task status 11: ok - task.data 11: ok - task2 was taken 11: ok - task.data 11: ok - task.status 11: # Easy put/take/ack: end 11: ok - Easy put/take/ack 11: # ttr put/take 11: 1..3 11: ok - put ttr task 11: ok - take this task 11: not ok - Ready state returned after one second 11: --- 11: filename: /home/vlad/src/tarantool/queue/t/040-utubettl.t 11: trace: 11: - line: 89 11: source: '@/home/vlad/src/tarantool/queue/t/040-utubettl.t' 11: filename: /home/vlad/src/tarantool/queue/t/040-utubettl.t 11: what: Lua 11: namewhat: local 11: name: fun 11: src: /home/vlad/src/tarantool/queue/t/040-utubettl.t 11: - line: 218 11: source: '@builtin/tap.lua' 11: filename: builtin/tap.lua 11: what: Lua 11: namewhat: method 11: name: test 11: src: builtin/tap.lua 11: - line: 0 11: source: '@/home/vlad/src/tarantool/queue/t/040-utubettl.t' 11: filename: /home/vlad/src/tarantool/queue/t/040-utubettl.t 11: what: main 11: namewhat: 11: src: /home/vlad/src/tarantool/queue/t/040-utubettl.t 11: line: 0 11: expected: r 11: got: t 11: ... 11: # ttr put/take: end 11: not ok - failed subtests 11: --- 11: filename: /home/vlad/src/tarantool/queue/t/040-utubettl.t 11: trace: 11: - line: 245 11: source: '@builtin/tap.lua' 11: filename: builtin/tap.lua 11: what: Lua 11: namewhat: method 11: name: test 11: src: builtin/tap.lua 11: - line: 0 11: source: '@/home/vlad/src/tarantool/queue/t/040-utubettl.t' 11: filename: /home/vlad/src/tarantool/queue/t/040-utubettl.t 11: what: main 11: namewhat: 11: src: /home/vlad/src/tarantool/queue/t/040-utubettl.t 11: planned: 3 11: failed: 1 11: line: 0 11: ... 11: # ack in utube 11: 1..8 11: ok - task was put 11: ok - task was put 11: ok - task was taken 11: ok - task.data 11: ok - state was not changed 11: ok - second task was taken 11: ok - task.data 11: ok - state was changed 11: # ack in utube: end 11: ok - ack in utube 11: # bury in utube 11: 1..8 11: ok - task was put 11: ok - task was put 11: ok - task was taken 11: ok - task.data 11: ok - state was not changed 11: ok - second task was taken 11: ok - task.data 11: ok - state was changed 11: # bury in utube: end 11: ok - bury in utube 11: # instant bury 11: 1..1 11: ok - task is buried 11: # instant bury: end 11: ok - instant bury 11: # release in utube 11: 1..8 11: ok - task was put 11: ok - task was put 11: ok - task was taken 7 11: ok - task.data 11: ok - state was not changed 11: ok - first task was taken again 11: ok - task.data 11: ok - state was changed 11: # release in utube: end 11: ok - release in utube 11: # release[delay] in utube 11: 1..8 11: ok - task was put 11: ok - task was put 11: ok - task was taken 9 11: ok - task.data 11: ok - state was not changed 11: ok - second task was taken 11: ok - task.data 11: ok - state was changed 11: # release[delay] in utube: end 11: ok - release[delay] in utube 11: # if_not_exists test 11: 1..2 11: ok - if_not_exists if tube exists 11: ok - if_not_exists if tube doesn't exists 11: # if_not_exists test: end 11: ok - if_not_exists test 11: # read_only test 11: 1..7 11: ok - queue state changed to waiting 11: ok - check that background fiber is canceled 11: ok - check that background fiber object is cleaned 11: ok - check that task wasn't moved to ready state 11: ok - queue state changed to running 11: ok - check that background fiber started 11: ok - check that we can take task 11: # read_only test: end 11: ok - read_only test 11: # ttl after delay test 11: 1..2 11: ok - check TTL after release 11: ok - check TTR after release 11: # ttl after delay test: end 11: ok - ttl after delay test 11: # Get tasks by state test 11: 1..2 11: ok - Check tasks count in a ready state 11: ok - Check tasks count in a taken state 11: # Get tasks by state test: end 11: ok - Get tasks by state test 11: # failed subtest: 1 1/1 Test #11: memtx-040-utubettl.t .............***Failed 3.67 sec 0% tests passed, 1 tests failed out of 1 Total Test time (real) = 3.67 sec The following tests FAILED: 11 - memtx-040-utubettl.t (Failed) Errors while running CTest ```
locker commented 1 year ago

a read (get, select) outside a transaction now never returns unconfirmed data (committed, but not yet written to WAL).

So is it expected that the update yield now:

https://github.com/tarantool/queue/blob/master/queue/abstract/driver/utubettl.lua#L138-L141

?

Update in Vinyl may yield anyway. It could yield before the aforementioned commit.

locker commented 1 year ago

Switching single-statement reads to the 'read-committed' isolation level (may be done only at compile time) eliminates the issue.

Patch for Tarantool ```diff diff --git a/src/box/vy_tx.c b/src/box/vy_tx.c index 7f2af76888d4..79e143c61d33 100644 --- a/src/box/vy_tx.c +++ b/src/box/vy_tx.c @@ -341,7 +341,7 @@ vy_tx_create(struct vy_tx_manager *xm, struct vy_tx *tx) tx->write_set_version = 0; tx->write_size = 0; tx->xm = xm; - tx->isolation = TXN_ISOLATION_READ_CONFIRMED; + tx->isolation = TXN_ISOLATION_READ_COMMITTED; tx->state = VINYL_TX_READY; tx->is_applier_session = false; tx->read_view = (struct vy_read_view *)xm->p_global_read_view; ```