brando90 / RoadRunner

High-performance, fault-tolerant, persistent key-value store

Deciding current epoch number #19

Open pcattori opened 10 years ago

pcattori commented 10 years ago

How should our servers decide their current epoch number?

pcattori commented 10 years ago

If the RoadRunner server needs information from the Paxos log while it executes parts of that log, we should add an API method that lets it contact its MultiPaxos peer and update its epoch number.

Maybe epoch numbers should be piggy-backed.

brando90 commented 10 years ago

When a leader is considered dead, that's ideally when the epoch number should increase. However, how does it know what the "old" leader's epoch number was? Does it even matter if it's not increased enough? It should, because otherwise, if it's lower, the acceptors wouldn't accept his prepare_epoch...

pcattori commented 10 years ago

@brando90 Did you mean leader instead of server? Also, if it's lower, the acceptors WON'T accept his prepares (batched inside the prepare_epoch).

Remember that prepare_epochs themselves are not accepted or rejected. They are replied to with a map of responses (each response being an accept or a reject, with the pertinent information).
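A minimal sketch of what such a reply map could look like on the acceptor side. This is an illustration only, not our actual code: `PrepareResponse`, `handle_prepare_epoch`, and the `promised` dict are hypothetical names, and the sketch leaves out the accepted value that a real prepare reply would also carry.

```python
from dataclasses import dataclass


@dataclass
class PrepareResponse:
    ok: bool           # True = accept (promise), False = reject
    highest_seen: int  # highest proposal number this acceptor has seen for the slot


def handle_prepare_epoch(promised, epoch, seq):
    """Answer a prepare_epoch(epoch, seq) with one response per sequence >= seq.

    `promised` (hypothetical) maps a sequence number to the highest proposal
    number this acceptor has promised for that sequence. It is updated in place.
    """
    replies = {}
    for s in sorted(promised):
        if s < seq:
            continue  # the prepare_epoch only covers sequences >= seq
        if epoch > promised[s]:
            promised[s] = epoch  # promise not to take anything lower
            replies[s] = PrepareResponse(ok=True, highest_seen=epoch)
        else:
            # Reject, and tell the proposer what it has to beat.
            replies[s] = PrepareResponse(ok=False, highest_seen=promised[s])
    return replies
```

So the prepare_epoch as a whole has no single accept/reject outcome: each sequence in the map gets its own.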

brando90 commented 10 years ago

@pcattori huh? I thought that we were sending "nacks" to inform the server that he isn't the leader. So we are sending rejects.

But what about my initial question: WHEN are we increasing the epoch number, and how do we make sure future acceptors will "recognize" the new leader? How are we guaranteeing that increasing the epoch number actually does anything?

Like, what if a really old leader becomes the new leader again? Say he was a leader at e=29 and then, for whatever reason, missed epochs 30 to 400. He is chosen to be the new leader at epoch 401, but he only knows of epoch 29. He has to be informed of the correct epoch for him to become the correct/effective leader. Right? How are we enforcing that? I guess the question is, how did he find out he was supposed to be the new leader at round 401?

pcattori commented 10 years ago

@brando90 For me, acks/nacks are not part of a reply. But the way we have been discussing it, ack/nack = ok/reject (in the code, ok = true / ok = false).

Whenever a proposer receives a reject from an acceptor, it should also receive the highest proposal number seen by that acceptor (as in basic paxos from 3a). To extend this, for a prepare_epoch(e, seq), the proposer will receive the highest proposal numbers seen by the acceptors at every sequence >= seq. Simply take the max of these, increment it, and set that as the new epoch number. :)
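A rough sketch of that rule on the proposer side, applied to the stale-leader example (a leader that only knows e=29 while the acceptors have seen up to 400). The function name `next_epoch` and the reply shape, a dict of `seq -> (ok, highest_seen)` per acceptor, are assumptions made for illustration, not what the code has today.

```python
def next_epoch(current_epoch, replies_per_acceptor):
    """Pick the epoch to retry with after a prepare_epoch.

    `replies_per_acceptor` (hypothetical shape) is a list with one dict per
    acceptor, mapping seq -> (ok, highest_seen). If nothing was rejected,
    the current epoch is kept.
    """
    rejected = [
        highest_seen
        for replies in replies_per_acceptor
        for ok, highest_seen in replies.values()
        if not ok
    ]
    if not rejected:
        return current_epoch
    # Take the max over every reject and go one past it.
    return max(max(rejected), current_epoch) + 1


# Stale leader at epoch 29; acceptors have seen proposals up to 400.
replies = [
    {7: (False, 400)},
    {7: (False, 398), 8: (True, 29)},
]
print(next_epoch(29, replies))  # 401
```

The stale leader does not need to know about epochs 30 to 400 in advance: the rejects it gets back carry enough information to jump straight to 401.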