ffakenz closed this issue 2 weeks ago
> Our current network interface is limiting us: [...] It broadcasts a message to the entire Hydra cluster, including the sender node. This is problematic because the sender assumes that delivery is guaranteed to the rest of the parties.

Indeed, the `broadcast` is operating under these assumptions and semantics, as also outlined in the specification (e.g. see https://hydra.family/head-protocol/unstable/assets/files/hydra-spec-09c867d2d94685906cbc7a74873f9de5.pdf#subsection.6.1).
> This proposal suggests shifting all responsibilities to the L2 protocol, resulting in a simpler networking stack that solely handles broadcasting messages without guarantees.

This statement conflicts with:

> Have the networking layer handle retry-broadcasting and consumption based on delivered in-memory messages.
I'm not entirely sure what the proposal means, so let me rephrase (using the terms of this book):
You are proposing to lower the requirements on `broadcast` so that it is no longer a "reliable broadcast" primitive, but merely has "best-effort broadcast" semantics. Consequently, the off-chain protocol will need to change to accommodate potential non-delivery in fail-recovery scenarios.
Is this what this is about?
I was wrong.
The protocol is not a 2PC: it has more than 2 phases.
The following aspects of the protocol prevent us from going in this direction:

- It is expected that anyone can propose a new `ReqTx` at any time, but it is the leader who gets to decide which `ReqTx`s have been seen and which have not, so the snapshot in flight for the current round will only contain txs seen by the leader.
- This makes the protocol halt, as peers will never sign a snapshot containing a tx they have not seen and validated (missing `ReqTx`).
These points force us to rethink where to go. A good alternative when 2PC is not good enough is the SAGA pattern. This involves making the leader responsible for opening and closing a new round with `Ack` or `Abort`, in cooperation with the rest. That makes the leader a temporary stateful coordinator, having to deal with retries and timeouts, ensuring all peers cooperate during every phase of the round until it is completed.
This reminds me of the original Hydra paper, where more communication rounds were involved.
From the above, I started exploring the following approach:
- The evolution of the `HeadState` by the worker nodes will be done in `Rounds`.
- `Rounds` are coordinated by the worker node elected as `Round Leader`.
- Other nodes act as `Followers`, and they cooperate with the leader.
- `Followers` vote on proposals and the `Round Leader` collects the votes.
- When a proposal is accepted by all worker nodes, the `Round Leader` continues with the next phase in the `Round`.
- `Followers` can always reject proposals.
- `Round Leaders` will retry and use timeouts to guarantee complete gathering of votes.
A node behaves differently depending on its role:
- round leader
  - collects votes in-memory
  - keeps track of the phase of the round
  - on failure, it restarts the round from step 1
  - manages retry and timeout mechanisms using UDP
- round follower
  - round votes are persisted
  - votes are sent back over UDP on every phase step confirmation
1. leader: broadcast `StartRound`
2. leader: retry unicast `StartRound` on timeout for a missing peer
3. leader: collect "votes" for each round "proposal"
4. peers: broadcast locally stored `ReqTx{tx}` (proposal)
5. peers: keep round proposals persistent
6. peers: on `ReqTx` proposal received => unicast `ValidTx{tx_id}` | `InvalidTx{tx_id}` to the leader (votes)
7. peers: keep round votes persistent
8. peers: prune previously persisted round votes ???
9. leader: on `ReqTx` proposal accepted by ALL => broadcast `ReqSn{list-tx_id}` (proposal)
10. leader: retry unicast `ReqSn` on timeout for a missing peer
11. leader: collect "votes" for the round SN "proposal"
12. peers: on `ReqSn` proposal received => unicast `AckSn{sig}` | `AbortSn{sig}` to the leader (votes)
13. peers: keep round votes persistent
14. leader: on `ReqSn` proposal accepted by ALL => broadcast `CloseRound{multi-sig, next-leader-id}` (proposal)
15. leader: retry unicast `CloseRound` on timeout for a missing peer
16. peers: on `CloseRound` proposal received => unicast `Ok` | `Invalid` to the leader (votes)
17. peers: keep round votes persistent
18. peers: are unblocked and can start accepting and storing new `ReqTx` proposals locally

[note]: in case of disagreement the head must be manually unstuck or closed, same as with a peer never coming back online.
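The round walk-through above can be condensed into a small simulation. This is a hedged sketch: the `Peer` and `run_phase` names and the in-process "transport" are invented for illustration; only the phase names and the retry-until-ALL-have-voted rule come from the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """A follower that votes on proposals; may miss messages while offline."""
    name: str
    online: bool = True
    votes: list = field(default_factory=list)  # followers persist their votes

    def on_proposal(self, proposal):
        if not self.online:
            return None  # simulates a lost UDP datagram
        vote = ("Ack", proposal)
        self.votes.append(vote)
        return vote


def run_phase(peers, proposal, max_retries=3):
    """Leader broadcasts a proposal, then retries unicast to peers that have
    not voted yet; the phase only completes once ALL peers have voted."""
    pending = {p.name for p in peers}
    for _ in range(max_retries):
        for p in peers:
            if p.name in pending:
                vote = p.on_proposal(proposal)  # (re)send on timeout
                if vote is not None:
                    pending.discard(p.name)
        if not pending:
            return True  # proposal accepted by ALL -> next phase
    return False  # round fails; leader restarts from step 1


peers = [Peer("alice"), Peer("bob")]
ok = all(run_phase(peers, phase)
         for phase in ["StartRound", "ReqSn", "CloseRound"])
print(ok)  # True: every phase gathered votes from all followers
```

One offline peer makes `run_phase` return `False` after exhausting its retries, which mirrors the note above: the leader cannot force progress, the round must be restarted or the head manually unstuck.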
Reference: https://en.wikipedia.org/wiki/Two-phase_commit_protocol
Current Situation
Our current network interface is limiting us:

- It broadcasts a message to the entire Hydra cluster, including the sender node.
- This is problematic because the sender assumes that delivery is guaranteed to the rest of the parties. But what if some peers were offline at that time?

Let's call the set of peers that are offline during a message delivery the set of *unreliable-parties*. The current implementation causes the set of *unreliable-parties* to lag behind and prevents them from reaching an agreement with the rest.

Opportunity
There are several protocols focused on achieving consensus in distributed systems using a best-effort Reliable/Atomic Broadcast:
All of them provide networking capabilities to maintain eventual consistency in distributed state machines (like Hydra).
One of the most important capabilities is allowing the event log to evolve in the presence of *unreliable-parties* as long as the majority of nodes agree. Unfortunately, this is a disadvantage for us because our L2 protocol requires all peers to sign the next snapshot to continue operating.
Note that we could arguably adapt the application logic to prevent the state-machine logs from evolving whenever the event log does. This would lead to pushing complexity to the application layer, thus not making any of these an out-of-the-box solution for us.
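The mismatch can be made concrete with a simple quorum predicate (a sketch; the function names are illustrative, not from any real implementation): majority-based consensus protocols make progress when most nodes agree, while the Hydra L2 needs every peer's signature.

```python
def majority_agrees(votes, n_parties):
    """Progress rule of typical consensus protocols: > n/2 votes suffice."""
    return len(votes) > n_parties // 2


def all_agree(votes, n_parties):
    """Hydra L2 rule: every peer must sign the next snapshot."""
    return len(votes) == n_parties


# With 5 parties and one offline (4 signatures collected):
print(majority_agrees({"a", "b", "c", "d"}, 5))  # True: a majority protocol proceeds
print(all_agree({"a", "b", "c", "d"}, 5))        # False: the Hydra head stalls
```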
Luckily, there is still one protocol left which, compared to the others, enforces that all participants in a transaction either commit or abort together: Two-Phase Commit (2PC). This opportunity seems to be a fit for our use case, but it forces us to be careful about:
In a "normal execution" (i.e., when no failure occurs, which is typically the most frequent situation), the protocol consists of two phases:
Proposal
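The normal execution of textbook 2PC can be sketched as follows (class and function names are illustrative, not Hydra code): a unanimous "yes" in the proposal phase commits, and any "no" aborts for everyone.

```python
class Participant:
    """Votes in phase 1 and applies the coordinator's decision in phase 2."""

    def __init__(self, will_vote_yes):
        self.will_vote_yes = will_vote_yes
        self.outcome = None

    def prepare(self):
        # Phase 1 (proposal/voting): vote yes/no on the proposed transaction.
        return self.will_vote_yes

    def finish(self, decision):
        # Phase 2 (commit/abort): the same decision is applied by everyone.
        self.outcome = decision


def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(votes) else "abort"
    for p in participants:
        p.finish(decision)
    return decision


print(two_phase_commit([Participant(True), Participant(True)]))   # commit
print(two_phase_commit([Participant(True), Participant(False)]))  # abort
```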
There seems to be a correlation between the 2PC phases and our L2 protocol, doesn't there?
The idea is basically to capitalize on the fact that our L2 protocol is already a 2PC protocol!
This proposal suggests shifting all responsibilities to the L2 protocol, resulting in a simpler networking stack that solely handles broadcasting messages without guarantees.
How (TBD):
`SnapshotConfirmed` events, and no intermediate events.