uwplse / verdi-raft

An implementation of the Raft distributed consensus protocol, verified in Coq using the Verdi framework
BSD 2-Clause "Simplified" License

Server crashes when trying to produce large packets because of buffer overflow #49

Open palmskog opened 7 years ago

palmskog commented 7 years ago

From @pfons on April 13, 2016 0:11

We found a bug that causes a stack overflow on the leader when a lagging follower tries to recover. The overflow seems to occur within the recursive function “restore_from_log” (Shim.ml), while a very large packet is being constructed and before the leader actually tries to send it.
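
For illustration, here is a minimal sketch of the general failure mode, not the actual code from Shim.ml. The names `copy_entries_naive` and `copy_entries` are hypothetical, and the sketch assumes the overflow comes from a recursive function whose recursive call is not in tail position, so that stack depth grows with the number of entries. Rewriting the function with an accumulator keeps stack usage constant:

```ocaml
(* Hypothetical illustration only; this is NOT the code from Shim.ml. *)

(* Not tail-recursive: the cons happens after the recursive call returns,
   so every entry in the list costs one stack frame. *)
let rec copy_entries_naive (entries : int list) : int list =
  match entries with
  | [] -> []
  | e :: rest -> e :: copy_entries_naive rest

(* Tail-recursive: build the result in reverse in an accumulator,
   then reverse it once at the end. Stack depth stays constant. *)
let copy_entries (entries : int list) : int list =
  let rec go acc = function
    | [] -> List.rev acc
    | e :: rest -> go (e :: acc) rest
  in
  go [] entries

let () =
  (* Both versions agree on a small input. *)
  assert (copy_entries_naive [1; 2; 3] = copy_entries [1; 2; 3]);
  (* 521_881 is the number of entries the leader tried to send in our test. *)
  let n = 521_881 in
  let entries = List.init n (fun i -> i) in
  let copied = copy_entries entries in
  assert (List.length copied = n);
  assert (List.hd copied = 0)
```

Whether the naive version actually overflows at a given size depends on the OCaml version and the stack limit in effect, so the sketch only runs the tail-recursive version on the large input.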

This problem can be reproduced through the following process: a) start 3 servers; b) execute one client request; c) stop a follower server; d) execute many client requests (in our tests, at least 521,932 requests); e) restart the server that was stopped.

Here’s a sample output produced by the leader when it crashes:

   [Term 1] Sending 50 entries to 2 (currently have 521932 entries), commitIndex=521882
   [Term 1] Sending 521881 entries to 3 (currently have 521932 entries), commitIndex=521882
   [Term 1] Received AppendEntriesReply 50 entries true, commitIndex 521883
   Fatal error: exception Stack overflow

Copied from original issue: uwplse/verdi#37