Seagate / halon

High availability solution
Apache License 2.0
1 stars 0 forks source link

replicated-log: Ensure abandoned requests do not appear later. #171

Closed 1468ca0b-2a64-4fb4-8e52-ea5806644b4c closed 5 years ago

1468ca0b-2a64-4fb4-8e52-ea5806644b4c commented 9 years ago

Created by: facundominguez

Review on Reviewable

1468ca0b-2a64-4fb4-8e52-ea5806644b4c commented 9 years ago

Created by: facundominguez

Merged as https://github.com/tweag/halon/commit/8ed7268

1468ca0b-2a64-4fb4-8e52-ea5806644b4c commented 9 years ago

Created by: facundominguez

Comments from the review on Reviewable.io


replicated-log/src/Control/Distributed/Log/Internal.hs, line 895 [r4] (raw file): Will do. I don't have very good reasons to leave it as it is.


1468ca0b-2a64-4fb4-8e52-ea5806644b4c commented 9 years ago

Created by: mboes

Comments from the review on Reviewable.io


consensus-paxos/src/Control/Distributed/Process/Consensus/BasicPaxos.hs, line 46 [r4] (raw file): Should include reference to fix, so that we know when to remove this code.


replicated-log/src/Control/Distributed/Log/Internal.hs, line 101 [r4] (raw file): same here.


replicated-log/src/Control/Distributed/Log/Internal.hs, line 134 [r4] (raw file): Would prefer to inline these functions, or if we're really destructuring a triple all the time, then that triple ought to be a record.


replicated-log/src/Control/Distributed/Log/Internal.hs, line 455 [r4] (raw file): typo: "guarantees"


replicated-log/src/Control/Distributed/Log/Internal.hs, line 519 [r4] (raw file): Needs a comment.


replicated-log/src/Control/Distributed/Log/Internal.hs, line 624 [r4] (raw file): Move to top-level, to avoid bloating this epic function further.


replicated-log/src/Control/Distributed/Log/Internal.hs, line 895 [r4] (raw file): Why is the nullipotence indicated separately from the requests? I would instead just pass rs, which includes hints, and compute below whether they all are nullipotent.


1468ca0b-2a64-4fb4-8e52-ea5806644b4c commented 9 years ago

Created by: facundominguez

For the sake of testing I'm putting here in a single branch all the fixes that are scattered on the three opened pull requests related to replicated-log.

The major updates related to this PR are the introduction of epochs, and the move of the batcher to make it a sub-process of the replica.

Moving the batcher was necessary because ambassadors are being updated with the ProcessIds of replicas, not batchers, yet we could not figure out the pid of the batcher from the pid of the replica. Rather than implementing a protocol to find the batcher, it looked simpler to move the batcher out of the way.

I'm not very convinced of the new shape of the batcher. It works well enough but it is not pretty. I dare not spend more time working on this in isolation though.

1468ca0b-2a64-4fb4-8e52-ea5806644b4c commented 9 years ago

Created by: facundominguez

The later proposal poses an additional challenge since it depends on d-p delivering messages in-order when sent by different ephemeral processes.

1468ca0b-2a64-4fb4-8e52-ea5806644b4c commented 9 years ago

Created by: mboes

Hm, good point. Currently we don't need that only because we're cheating: all requests are synchronous. There are two alternatives to generalize this to async requests:

I have a slight preference for the latter.

1468ca0b-2a64-4fb4-8e52-ea5806644b4c commented 9 years ago

Created by: facundominguez

Since we are using ephemeral processes to make requests, sequence numbers are not needed. If we stop doing this, the client will need an alternative mechanism to distinguish whether an ack belongs to an abandoned or a newer request.

1468ca0b-2a64-4fb4-8e52-ea5806644b4c commented 9 years ago

Created by: mboes

I tend to agree - as noted in #165. I was looking at this earlier today - it strikes me that epochs are a simple straightforward mechanism for avoiding all our problems at once. Can we, in sum total, avoid per client sequence counters?

1468ca0b-2a64-4fb4-8e52-ea5806644b4c commented 9 years ago

Created by: facundominguez

I'm currently working on implementing epochs in requests. They seem necessary after all.