ktoso / akka-raft

A toy project implementing RAFT on top of Akka Cluster (not prod ready)
http://blog.project13.pl
Apache License 2.0
280 stars 42 forks source link

Should only delete entries if they conflict with a new one #66

Closed colin-scott closed 9 years ago

colin-scott commented 9 years ago

Found through fuzz testing:

Suppose a follower with an empty log receives an AppendEntries containing 2 entries. The follower appends these to its log.

Then the follower subsequently receives an AppendEntries containing only the first of the previous 2 entries. [This message was delayed].

Currently, the follower will inadvertently delete the 2nd entry from its log.

This is not just a performance issue: it can cause raft to violate the "Leader Completeness" safety property: the leader was under the impression that the follower had 2 entries in its log, and may have decided to commit both entries. If another leader then overtakes it, it will not necessarily have both committed entries in its log.

ktoso commented 9 years ago

Great catch as always, thanks @colin-scott! I might actually want to fix this right away, should be easily testable in isolation.

ktoso commented 9 years ago

This is actually a bit weird, so allow me to re-cap the exact case the fuzz test found. Do you mean that the same Leader sends those messages (same Term)?

I don't think there is re-delivery of [1] implemented, and at-worst the [1,2] could be re-delivered...

Or are we talking about 2 different leaders sending those 2 messages? Then it seems like the "longest log wins" should be applied, no? I'll dig into it (need to finally get back to hakking on this project, so that's a good reason to get me remember the impl and algorithm ;))

colin-scott commented 9 years ago

You're absolutely right that this would only be triggered if you moved to UDP instead of TCP -- TCP guarentees in-order delivery, and this bug depends on a delayed message falling behind a later message between the same sender and receiver.

colin-scott commented 9 years ago

It's possible that you could trigger this with 2 different leaders? Not sure. But my fuzz test only involved a single leader sending the AppendEntries messages.

*Is it Akka that guarantees in-order delivery, or is it TCP?

colin-scott commented 9 years ago

Interesting, akka does appear to guarentee in-order delivery:

http://doc.akka.io/docs/akka/snapshot/general/message-delivery-reliability.html#Discussion__Message_Ordering

I find that to be an interesting design choice. If you ever support UDP akka-remoting, that guarentee comes with a non-trivial performance cost.

colin-scott commented 9 years ago

Ah, and in fact, there is a discussion on that same page about possibly removing that guarentee in the future:

http://doc.akka.io/docs/akka/snapshot/general/message-delivery-reliability.html#How_does_Local_Ordering_relate_to_Network_Ordering

colin-scott commented 9 years ago

Well, my fuzzer does support an in-order delivery constraint, so I guess I'll switch to that for the time being.

ktoso commented 9 years ago

Correct - currently remoting is over TCP, and the ordering guarantee of "direct point to point communication" holds there. We are not currently looking into supporting UDP directly (however are very tempted by https://github.com/real-logic/Aeron which does works over UDP however is viewed as a log as well, and that log is ordered).

For reporting problems in akka-raft let's use the in-order delivery mode then, which also means we can close this specific ticket right? :)

ktoso commented 9 years ago

Hi @colin-scott, we talked about the docs and current implementation in the 2.3 and 2.4 series of Akka. The doc is slightly out of date - we'll update it soon https://github.com/akka/akka/issues/18209

The guarantee holds mostly because of TCP, however since there can be re-established connections Akka does actually have to to a bit more than just rely on TCP (if a new connection comes in, while the old one is still not closed (i.e. "stale")) we ignore messages from the stale connection. Yes, it wouldn't hold for UDP.

colin-scott commented 9 years ago

Yeah, feel free to close this issue

ktoso commented 9 years ago

Yup, done :)