There is a design error in the Mencius message-processing logic.
```go
select {
case propose := <-r.ProposeChan:
	// got a Propose from a client
	dlog.Printf("Proposal with id %d\n", propose.CommandId)
	r.handlePropose(propose)
	break
case skipS := <-r.skipChan:
	skip := skipS.(*menciusproto.Skip)
	// got a Skip from another replica
	dlog.Printf("Skip for instances %d-%d\n", skip.StartInstance, skip.EndInstance)
	r.handleSkip(skip)
case prepareS := <-r.prepareChan:
	prepare := prepareS.(*menciusproto.Prepare)
	// got a Prepare message
	dlog.Printf("Received Prepare from replica %d, for instance %d\n", prepare.LeaderId, prepare.Instance)
	r.handlePrepare(prepare)
	break
case acceptS := <-r.acceptChan:
	accept := acceptS.(*menciusproto.Accept)
	// got an Accept message
	dlog.Printf("Received Accept from replica %d, for instance %d\n", accept.LeaderId, accept.Instance)
	r.handleAccept(accept)
	break
case commitS := <-r.commitChan:
	commit := commitS.(*menciusproto.Commit)
	// got a Commit message
	dlog.Printf("Received Commit from replica %d, for instance %d\n", commit.LeaderId, commit.Instance)
	r.handleCommit(commit)
	break
case prepareReplyS := <-r.prepareReplyChan:
	prepareReply := prepareReplyS.(*menciusproto.PrepareReply)
	// got a Prepare reply
	dlog.Printf("Received PrepareReply for instance %d\n", prepareReply.Instance)
	r.handlePrepareReply(prepareReply)
	break
case acceptReplyS := <-r.acceptReplyChan:
	acceptReply := acceptReplyS.(*menciusproto.AcceptReply)
	// got an Accept reply
	dlog.Printf("Received AcceptReply for instance %d\n", acceptReply.Instance)
	r.handleAcceptReply(acceptReply)
	break
}
```
In Mencius, each node should have FIFO channels, which this implementation gets right. However, when a message arrives from a node, it is pushed to a channel specific to its message type, and the receiver then processes those channels in a non-FIFO manner. The following example shows how this design can break safety.
Assume there are 3 nodes: A, B and C. Node A first sends an Accept message and later sends a Propose message. B receives both messages in the order A sent them. However, on receipt, B pushes them onto two separate channels, and another goroutine drains those channels with the `select` statement shown above. When more than one case is ready, Go's `select` picks one pseudo-randomly, so arrival order is not preserved across channels.
The protocol is violated if B processes the Propose message first, which this design allows. This matters in Mencius because each node derives piggybacked information from the messages it receives, so messages must be processed in exactly the order the sender sent them.
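The reordering can be reproduced outside the replica with a minimal sketch. The channel names and string payloads below are hypothetical stand-ins, not the repository's actual types; the only thing it relies on is the documented behaviour of Go's `select` when several cases are ready.

```go
package main

import "fmt"

// countReordered simulates the scenario above: the sender emits an Accept and
// then a Propose, the receiver files them into two per-type channels (names
// are hypothetical), and a select picks which one to handle first.
// It returns how many trials handled the Propose before the Accept.
func countReordered(trials int) int {
	reordered := 0
	for i := 0; i < trials; i++ {
		acceptChan := make(chan string, 1)
		proposeChan := make(chan string, 1)

		// Arrival order is FIFO: Accept first, then Propose.
		acceptChan <- "Accept"
		proposeChan <- "Propose"

		// Both cases are ready, so select chooses one pseudo-randomly.
		select {
		case <-acceptChan:
			// Accept handled first: matches the sender's order.
		case <-proposeChan:
			// Propose handled first: the sender's order is lost.
			reordered++
		}
	}
	return reordered
}

func main() {
	fmt.Printf("Propose handled before Accept in %d of 1000 trials\n", countReordered(1000))
}
```

Any non-zero count shows that per-type channels drained by `select` do not preserve the order in which messages arrived.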
A fix would be to use a single channel for all replica message types, so that messages are processed in the order they arrive.
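One possible shape for that fix, as a sketch only: all replica messages go through one channel and are dispatched with a type switch. The `Accept`, `Prepare` and `Skip` structs and the `processInOrder` function are hypothetical stand-ins for the `menciusproto` types and the replica's handlers, not the repository's code.

```go
package main

import "fmt"

// Hypothetical stand-ins for the menciusproto message structs.
type Accept struct{ Instance int32 }
type Prepare struct{ Instance int32 }
type Skip struct{ StartInstance, EndInstance int32 }

// processInOrder drains a single shared channel and dispatches on the
// concrete message type. Because there is only one channel, messages are
// handled in exactly the order they were enqueued.
func processInOrder(msgs <-chan interface{}) []string {
	var handled []string
	for m := range msgs {
		switch v := m.(type) {
		case *Accept:
			handled = append(handled, fmt.Sprintf("Accept(%d)", v.Instance))
		case *Prepare:
			handled = append(handled, fmt.Sprintf("Prepare(%d)", v.Instance))
		case *Skip:
			handled = append(handled, fmt.Sprintf("Skip(%d-%d)", v.StartInstance, v.EndInstance))
		default:
			handled = append(handled, "unknown")
		}
	}
	return handled
}

func main() {
	ch := make(chan interface{}, 3)
	ch <- &Accept{Instance: 1}
	ch <- &Skip{StartInstance: 2, EndInstance: 4}
	ch <- &Prepare{Instance: 5}
	close(ch)
	fmt.Println(processInOrder(ch))
}
```

In the real replica, the type switch would call the existing handlers (`r.handleAccept`, `r.handleSkip`, and so on) instead of recording strings. Client proposals could stay on their own channel, since the ordering requirement described here concerns messages between replicas.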
Thanks