canonical / dqlite

Embeddable, replicated and fault-tolerant SQL engine.
https://dqlite.io

Assertion failure in transport__write #427

Closed by cole-miller 2 years ago

cole-miller commented 2 years ago

This is from Jepsen again, see the CI run. It's this assertion:

https://github.com/canonical/dqlite/blob/4fde1a89b4e230a1546c067c681eb67aff6bf846/src/lib/transport.c#L178

MathieuBordere commented 2 years ago

Still being hit in https://github.com/canonical/jepsen.dqlite/actions/runs/3384785679/jobs/5622162507, so this requires more careful investigation, unless GitHub is somehow caching the dqlite repo checkout, which I don't think is the case.

MathieuBordere commented 2 years ago

(gdb) bt
...
#9  0x00007fb780cd7729 in __assert_fail_base (fmt=0x7fb780e6d588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x7fb780ecf0af "t->write_cb == NULL", 
    file=0x7fb780ecf020 "src/lib/transport.c", line=178, function=<optimized out>) at assert.c:92
#10 0x00007fb780ce8fd6 in __GI___assert_fail (assertion=assertion@entry=0x7fb780ecf0af "t->write_cb == NULL", file=file@entry=0x7fb780ecf020 "src/lib/transport.c", 
    line=line@entry=178, function=function@entry=0x7fb780ecf0e0 <__PRETTY_FUNCTION__.9778> "transport__write") at assert.c:101
#11 0x00007fb780ec2824 in transport__write (t=<optimized out>, buf=<optimized out>, cb=<optimized out>) at src/lib/transport.c:178
#12 0x00007fb780eb6f8e in gateway_handle_cb (req=<optimized out>, status=<optimized out>, type=0) at src/conn.c:86
#13 0x00007fb780eb9fc4 in failure (req=0x7fb744003008, code=<optimized out>, message=<optimized out>) at src/gateway.c:182
#14 0x00007fb780ebbd2b in handle_exec_sql_next (req=0x7fb744003008, done=done@entry=false) at src/gateway.c:662
#15 0x00007fb780ebbf6a in execSqlBarrierCb (barrier=0x7fb744002f70, status=0) at src/gateway.c:704
#16 0x00007fb780ec096f in raftBarrierCb (req=0x7fb744002f80, status=<optimized out>) at src/leader.c:430
#17 0x00007fb780ad681d in applyBarrier (index=388, r=0xcc8960) at src/replication.c:1368
#18 replicationApply (r=r@entry=0xcc8960) at src/replication.c:1605
#19 0x00007fb780ad8057 in replicationUpdate (r=r@entry=0xcc8960, server=0x7fb74404b380, result=result@entry=0x7fb744037b18) at src/replication.c:756
#20 0x00007fb780ad3852 in recvAppendEntriesResult (r=r@entry=0xcc8960, id=1430860652164524365, address=<optimized out>, result=0x7fb744037b18)
    at src/recv_append_entries_result.c:64
#21 0x00007fb780ad2acb in recvMessage (message=0x7fb744037b00, r=0xcc8960) at src/recv.c:45
#22 recvCb (io=<optimized out>, message=0x7fb744037b00) at src/recv.c:108
#23 0x00007fb780ae434c in uvFireRecvCb (s=s@entry=0x7fb744037aa0) at src/uv_recv.c:190
#24 0x00007fb780ae4868 in uvServerReadCb (stream=<optimized out>, nread=<optimized out>, buf=<optimized out>) at src/uv_recv.c:294
...

Relevant part of the trace:

LIBDQLITE 1667470734508422514 handle_transfer:1216 handle transfer
LIBRAFT   1667470734508432814 src/client.c:407 transfer to 1430860652164524365
LIBRAFT   1667470734508437714 src/uv_send.c:218 connection available -> write message
LIBRAFT   1667470734509053617 src/recv_append_entries_result.c:24 self:3297041220608546238 from:1430860652164524365@10.2.1.13:8081 last_log_index:388 rejected:0 term:1
LIBRAFT   1667470734509063117 src/replication.c:1662 new commit index 388
LIBDQLITE 1667470734509066817 raftBarrierCb:414 raft barrier cb status 0
LIBDQLITE 1667470734509069717 execSqlBarrierCb:687 exec sql barrier cb status:0
LIBDQLITE 1667470734509072317 handle_exec_sql_next:607 handle exec sql next
LIBDQLITE 1667470734509132017 leader__exec:389 leader exec
LIBDQLITE 1667470734509137617 leader__barrier:435 leader barrier
LIBDQLITE 1667470734509140717 leader__barrier:447 raft barrier failed 7
LIBDQLITE 1667470734509143317 handle_exec_sql_cb:588 handle exec sql cb status 7
app: src/lib/transport.c:178: transport__write: Assertion `t->write_cb == NULL' failed.
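
For readers unfamiliar with the invariant behind this assertion: `t->write_cb == NULL` means that no earlier write is still waiting for its completion callback when `transport__write` is entered. The backtrace above shows the failure path (`failure` -> `gateway_handle_cb` -> `transport__write`) reaching the assertion, which is consistent with a second write being started while a first one was still pending. Below is a minimal sketch of that "one write in flight" rule. Every name in it (`sketch_transport`, `sketch_write`, `sketch_write_done`, `SKETCH_BUSY`) is hypothetical and is not taken from dqlite, and it returns an error code where the real code asserts, purely so the violation can be observed without aborting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a transport; NOT dqlite's real struct. */
typedef void (*write_cb_fn)(void *data, int status);

struct sketch_transport {
    write_cb_fn write_cb; /* Non-NULL while a write is in flight. */
    void *data;           /* Opaque pointer handed to the callback. */
};

enum { SKETCH_OK = 0, SKETCH_BUSY = 1 };

/* Start a write. The assertion in the issue checks write_cb == NULL at this
 * point; this sketch reports the violation to the caller instead. */
static int sketch_write(struct sketch_transport *t, write_cb_fn cb)
{
    if (t->write_cb != NULL) {
        return SKETCH_BUSY; /* A previous write has not completed yet. */
    }
    t->write_cb = cb;
    return SKETCH_OK;
}

/* Complete the in-flight write. The slot is cleared before the callback
 * fires, so the callback itself may legally start the next write. */
static void sketch_write_done(struct sketch_transport *t, int status)
{
    write_cb_fn cb = t->write_cb;
    assert(cb != NULL);
    t->write_cb = NULL;
    cb(t->data, status);
}

/* Example callback: counts how many writes have completed. */
static void count_cb(void *data, int status)
{
    (void)status;
    *(int *)data += 1;
}
```

With this sketch, calling `sketch_write` twice in a row without an intervening `sketch_write_done` yields `SKETCH_BUSY` on the second call, which is the situation the real assertion turns into an abort.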