We discussed potential designs for validator set speculation (a solution to issue #16). Our potential designs are essentially variations of adding extra phase(s) to the "phased protocol" to disseminate `CommitQC` to a quorum of the old validator set and the new validator set before proceeding to produce blocks using the new validator set.

A couple of questions were brought up about the potential design:

1. Do we need to broadcast `DecideQC` all-to-all, or could we get by without an all-to-all broadcast if we do not forget the Resigning Validator Set? Alternatively, can we replace the all-to-all broadcast with another phase (called "finalize")?
2. Does `DecideQC` need to contain votes from the resigning validator set (in addition to votes from the new validator set) for safety and liveness? Or can we make do with a `DecideQC` containing votes only from the new validator set?

Finally, we agreed that in HotStuff-rs v0.4, any block should be able to change the validator set (i.e., we don't have to wait for an "epoch" to change the validator set). This brings challenges to view synchronization.
Present revisions to View Synchronization Design.
We revisited the question of whether the Decide QC needs to contain a quorum of votes from both the resigning validator set and the incoming validator set, and narrowed down the requirements.
In light of these requirements, we considered a new alternative to having "double-sized" DecideQCs: Commit QCs will contain votes from only the resigning validator set, and Decide QCs will contain votes from only the incoming validator set, but blocks following validator-set-changing blocks will contain a "Merged QC", which includes both a Commit QC and a Decide QC.

This design has a couple of benefits over requiring "double-sized" QCs; both arise from the fact that the resigning validator set does not have to participate in the Decide phase.
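As a concrete illustration of this alternative, here is a minimal sketch of what the justify variants could look like; the names are ours, not actual HotStuff-rs types:

```rust
/// Placeholder for a quorum certificate (view, block hash, signatures, ...).
struct QuorumCertificate { /* ... */ }

/// Hypothetical justify carried by a block under the Merged QC idea.
enum Justify {
    /// The ordinary case: a single QC.
    Quorum(QuorumCertificate),
    /// Carried by a block that follows a validator-set-changing block.
    Merged {
        /// Votes from only the resigning validator set.
        commit_qc: QuorumCertificate,
        /// Votes from only the incoming validator set.
        decide_qc: QuorumCertificate,
    },
}
```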
We identified a liveness problem with the Merged QC design: consider the scenario where a quorum of replicas in `vs'` receive `MergedQC`, but only one replica in `vs` receives both `CommitQC` and `MergedQC`. This replica in `vs` will immediately "resign" (i.e., stop nudging `CommitQC` and voting `Decide`). If `f` replicas in `vs` are Byzantine, this leaves `2f < 2f+1` replicas in `vs` that have not yet resigned, which is not enough to produce a new Commit QC. Therefore, in order for the `2f` replicas in `vs` to sync to the head of the blockchain (which is now being grown by `vs'`), they need to sync with a replica in `vs'`. However, because the sync server is only selected from inside the committed validator set, this is not possible.
The Double-Decide QC design avoids this problem by making it such that a `DecideQC` guarantees that a quorum of replicas in `vs` have committed `vs'`. So even the at most `f` replicas in `vs` that fail to receive `CommitQC` (and therefore fail to commit `vs'`) can still reliably select a sync server from inside `vs` that is aware of `CommitQC`.
We identified ways we could modify the Merged QC design to sidestep the liveness problem, e.g., keeping track of a Locked Validator Set and selecting sync servers from both `cvs` and `lvs`, but the current thinking is that this is more complex than just going with the Double-Decide QC design.
We discussed adding a `highest_qc_or_tc` field to the `EpochTimeout` type. This helps guarantee liveness in the case that more than `f` but less than `2f+1` replicas are in an epoch `e + 1`, and likewise more than `f` but less than `2f+1` replicas are "left behind" in epoch `e`, by creating a means by which the replicas in epoch `e + 1` can bring the left-behind replicas up to epoch `e + 1`.
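As a rough illustration (field names are assumptions, not the actual HotStuff-rs definitions), the change could look like this:

```rust
/// Placeholder quorum and timeout certificates.
struct QuorumCertificate { /* ... */ }
struct TimeoutCertificate { /* ... */ }

/// Whichever certificate the sender holds with the highest view.
enum QcOrTc {
    Qc(QuorumCertificate),
    Tc(TimeoutCertificate),
}

/// Hypothetical shape of the EpochTimeout message with the proposed field.
struct EpochTimeout {
    epoch: u64,
    view: u64,
    /// Lets replicas still in epoch e learn about progress in epoch e + 1
    /// and fast-forward, instead of staying left behind.
    highest_qc_or_tc: QcOrTc,
}
```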
@karolinagrzeszkiewicz will update the ParallelChain Protocol v0.6 changelog to reflect the current design.
We discussed a liveness issue with applying the "three consecutive views" commit rule to the phased consensus mechanism for validator-set-updating blocks. In short, if a replica rejects a `commitQC` for a block because it does not satisfy the commit rule, it will be unable to broadcast an acceptable nudge or proposal moving forward, since it holds a `precommitQC` as its `highest_qc`, and its view has advanced beyond `precommitQC.view + 1`. Even if a conflicting block is proposed, it will be rejected because of the `locked_view`.
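For concreteness, here is a minimal sketch of the "three consecutive views" check described above (the function and field names are our own, not the actual HotStuff-rs code):

```rust
/// Placeholder QC carrying only the field needed here.
struct Qc {
    view: u64,
}

/// A commitQC satisfies the commit rule only if the prepare, precommit, and
/// commit QCs for the block were formed in three consecutive views. A replica
/// whose current view has already advanced past precommit_qc.view + 1 can
/// never see this predicate hold for the precommitQC it holds, which is the
/// stuck state described above.
fn satisfies_commit_rule(prepare_qc: &Qc, precommit_qc: &Qc, commit_qc: &Qc) -> bool {
    precommit_qc.view == prepare_qc.view + 1 && commit_qc.view == precommit_qc.view + 1
}
```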
We discussed the following solutions to the above-mentioned issue, among them using `highest_block_justify` as the proposed block's `justify`. We chose 3, since 1 may complicate view sync (views in which a validator-set-updating block is proposed require more time), and 2 fails to guarantee immediacy.
We also discussed how traditional HotStuff, whether pipelined or not, does not guarantee immediacy, presumably because traditional SMR concerns state updates that have to do with the application, rather than with the protocol (like validator set updates). We think that solution 3, which effectively blocks on a certain block height until either (1) a non-validator-set-updating block obtains a `genericQC`, or (2) a validator-set-updating block obtains a `prepareQC`, `precommitQC`, and a `commitQC`, may offer a safe and live alternative to the traditional flow of the HotStuff consensus protocol.
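A rough sketch of the blocking condition under solution 3, under our own assumption (for illustration only) that the replica tracks which QC phases the block at the blocked height has obtained:

```rust
/// Phases a block can have been certified in (illustrative subset).
#[derive(PartialEq)]
enum Phase {
    Generic,
    Prepare,
    Precommit,
    Commit,
}

/// Returns true once the protocol may stop blocking, given `obtained_qcs`,
/// the QC phases collected so far for the block at the blocked height.
fn may_proceed(is_vs_updating: bool, obtained_qcs: &[Phase]) -> bool {
    if is_vs_updating {
        // Case (2): a validator-set-updating block needs prepare, precommit,
        // and commit QCs before the protocol moves on.
        obtained_qcs.contains(&Phase::Prepare)
            && obtained_qcs.contains(&Phase::Precommit)
            && obtained_qcs.contains(&Phase::Commit)
    } else {
        // Case (1): a non-validator-set-updating block just needs a genericQC.
        obtained_qcs.contains(&Phase::Generic)
    }
}
```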
I will develop the idea behind solution 3 further, thoroughly analyze its safety and liveness features, and work on adapting it to our Validator Set Speculation Design.
- Merge `locked_vsu_block` and `highest_view_in_which_locked_vsu_block_was_proposed` into a single state variable: `locked_prepare_qc`.
- Merge `locked_prepare_qc` and `locked_view` into a single state variable: `locked_qc`.
- Since `on_receive_proposal` in the draft depends on whether the block is validator-set-updating or not, validation of the block needs to happen before the block is subjected to safety-checking logic in the new version of the protocol.
- Add a `highest_commit_qc` field to the `AdvanceView` message type.

This was an overview meeting where we collected ourselves and summarized everything we've agreed to about the design of HotStuff-rs v0.4. These points are summarized in the three sections below.
- … `Nudge(CommitQC)`.
- … `TimeoutVote`.
- … `TimeoutVote`s to form `TimeoutCertificate`s and broadcast `AdvanceView(TimeoutCertificate)`.
- … a. `TimeoutVote`. b. `AdvanceView`.
- … `TimeoutCertificate`.

TODO: a hotstuff-rs design that decouples its separate components as much as is possible and reasonable. The three components are the HotStuff implementation, the Pacemaker, and the Sync Trigger. The decoupling can be trait-based or thread-based.
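As one way to picture the trait-based option (all trait and method names here are our own placeholders, not actual hotstuff-rs items), each component could expose a narrow interface that an outer event loop wires together:

```rust
/// The core HotStuff state machine: consumes protocol messages.
trait HotStuffProtocol {
    fn on_receive_proposal(&mut self, proposal: &[u8]);
}

/// The Pacemaker: drives view synchronization.
trait Pacemaker {
    /// Called once per event-loop iteration; returns the current view.
    fn tick(&mut self) -> u64;
}

/// The Sync Trigger: decides when to fall back to block sync.
trait SyncTrigger {
    fn should_sync(&self) -> bool;
}
```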
- We preferred `execute` in the "/alice" crate because, unlike in the other implementation sketch, the `execute` function only calls the Pacemaker's `tick` once every iteration.
- `recv` should only be called in `execute`, and not in any of the protocol-specific methods. The protocol-specific methods should only `send`. (agreed)
- `on_receive_timeout_vote` and `on_receive_advance_view` should have return values. The Algorithm thread will find out about the changes to the Pacemaker struct by calling some kind of `query` method. (agreed)
- A `view_sync_mode` state variable in the Algorithm struct.
- Trigger block sync upon seeing `AdvertiseBlock` with a Block containing a correct QC "from the future", instead of triggering block sync upon seeing an `AdvanceView` message containing a correct QC from the future.
- Add a `tick` method to the `BlockSyncClient` struct that checks "progress".

We will implement HotStuff-rs v0.4 in a new branch (called "hotstuff_rs_v04"). Development will be split into at least 4 milestones, each milestone corresponding to a PR into the "hotstuff_rs_v04" branch. These milestones will be tracked using the "HotStuff-rs v0.4" GitHub Project.
At the end of development, we will write a user-readable changelog for the v0.4 release.
We discussed a design sketch for the new block sync. We identified 3 components of the block sync client design.
The key idea that links components 1 and 2 is that an `AdvertiseBlock` message broadcasted periodically by the sync server serves as a commitment to providing blocks (at least) up to the advertised block, and a sync server can be punished for breaking that commitment by being blacklisted.
In short, we would like to punish servers for sending incorrect blocks and for not sending enough blocks. Ideally, we would like to avoid punishing them for not responding within a given amount of time since this can be due to asynchrony or a benign fault. This is why the design above sends a sync request to all peers, and syncs with the first server that responds, avoiding the problem altogether.
The following issues, questions, and suggestions were raised (and should be addressed by the updated design):
- Should `highest_qc.block`, the highest block, or the highest committed block be advertised by a sync server? Newest block height and `highest_qc.block.height` are not monotonically increasing because of branch switches. Hence, if an honest server advertises a block, but then the quorum switches to a lower branch and the server provides the blocks in the lower branch via `BlockSyncResponse`, the server will break its commitment and be incorrectly penalized.
- `AdvertiseBlock` should be split into `AdvertiseBlock` and `AdvertiseQC`, the latter containing the `highest_qc` required for the sync trigger. This is to avoid overloading `AdvertiseBlock`.
- `available_servers` and `black_list`: we need to make sure that their sizes are bounded. The associated risks include overflow and Byzantine peers overpopulating `available_servers`.
- `sync_request_limit` may differ across sync servers and clients. Hence, in the current design sketch a sync server can be incorrectly punished when it doesn't send as many blocks as it promised, when in fact its `sync_request_limit` is simply lower than the client's.
- `block_sync_timeout`: there should not be a limit on how long the syncing process can take, since new validators or listeners may need to sync millions of blocks. Instead, `block_sync_timeout` should mean the maximum time the client waits for the server's response.

We discussed the updated design for new block sync, where the `newest_block_height` field of the `AdvertiseBlock` message is replaced with `highest_committed_block_height` (which is monotonically increasing), and selecting a new server for every sync request is replaced with selecting the first peer that responds to the first sync request as the sync peer for all subsequent requests. The commitment is also redefined: by sending `highest_committed_block_height` in `AdvertiseBlock`, a sync server commits to providing at least `advertise_block.highest_committed_block_height - self.highest_committed_block_height` blocks.
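To illustrate the updated advertisement, here is a sketch under our own naming assumptions (not the actual message definition):

```rust
/// Hypothetical shape of the updated AdvertiseBlock message.
struct AdvertiseBlock {
    /// Monotonically increasing, unlike the newest block height, so an honest
    /// server cannot be made to break its commitment by a branch switch.
    highest_committed_block_height: u64,
    /// Identifies the server, so that a broken commitment is attributable.
    signature: Vec<u8>,
}

/// The number of blocks the advertising server commits to providing.
fn committed_blocks(advertise: &AdvertiseBlock, my_highest_committed: u64) -> u64 {
    advertise
        .highest_committed_block_height
        .saturating_sub(my_highest_committed)
}
```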
Regarding blacklisting, we concluded that blacklisting servers that do not respond in time may be desirable, since slow servers are not the best sync peers either (not just Byzantine servers).
However, we also found that blacklisting may be of little use in protecting against an attacker that can keep adopting new identities (sybil attack), and currently there is nothing in the protocol that prevents them from doing that.
We also discussed the choice between selecting the available sync server that responds first vs. selecting a sync server at random. We found that both approaches expose a significant attack surface:

- Sync server that responds first: attackers that respond with very few or fake blocks have a higher chance of being selected, since their response message will reach the client faster.
- Random sync server: for a successful attack, the attacker can control a majority of `available_servers` by regularly sending `AdvertiseBlock` from tons of fake addresses (sybil attack), and hence increase its chance of being selected.
To address this, we considered the following ideas, of which 1 is clearly the safest:

1. An `App` API for obtaining identities of all participants in the network that have a stake (or a pool) in the protocol, and selecting available sync servers exclusively from those.

The above concerns bring up a broader question of who should be able to act as a sync server. Is it fair to let only past/current/candidate validators act as sync servers? Or should we let any reliable participant in the network act as a sync server?
We agreed on the following block sync trigger mechanism:

- Block sync is triggered upon receiving a `qc` for an unknown block with `qc.view >= cur_view`, via an `AdvertiseQC{qc}` message from some sync server.
- The received `qc` is adopted as the replica's `highest_qc` (the same rules for `highest_qc` update apply as in the HotStuff protocol).

We talked about PR #31, which introduces the new code organisation for hotstuff-rs 0.4. We reached the following conclusions regarding changes to the PR:
- `Collector::collect` and the `Collector` trait shall be removed, since it doesn't really do anything useful.
- `Certificate::quorum` shall be a method of `ValidatorSet` instead (`ValidatorSet::quorum`), and the `Certificate` trait shall be removed too.
- `TimeoutCertificate` and `QuorumCertificate` should be moved to `pacemaker::types` and `hotstuff::types` respectively.
- Implement `TryFrom` and `AsRef` for these newtypes.

We also agreed that `block_tree` shall be passed to `HotStuff`, `BlockSyncClient` and `Pacemaker` methods (as it is now), rather than being stored as a field of these structs. This is because the only smart pointers that could be stored in those fields are thread-safe pointers like `Arc`, because spawning requires the moved data types to implement `Send`, and `Arc` is expensive and not shared between different threads in our case.
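A small sketch of the agreed pattern (the signatures are our own illustration, not the exact HotStuff-rs API):

```rust
/// Placeholder types standing in for the real ones.
struct BlockTree { /* ... */ }
struct Proposal { /* ... */ }

/// No block-tree field: the struct stays Send without wrapping state in Arc.
struct HotStuff { /* protocol state only */ }

impl HotStuff {
    /// The block tree is borrowed per call instead of owned by the struct.
    fn on_receive_proposal(&mut self, _proposal: Proposal, _block_tree: &mut BlockTree) {
        // ... validate the proposal and extend the block tree here
    }
}
```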
In this meeting we finalized the high-level design of the new block sync mechanism. Concretely, we decided that:

- `AdvertiseBlock` and `AdvertiseQC` messages shall consist of: (1) the committed validator set, and (2) new validators associated with any speculative block in the block tree.
- `AdvertiseBlock` messages shall be stored in `available_sync_servers` (until some expiry date from sending the message); see the sketch below. This way we only consider the peers that notify us about their availability to be available. As discussed earlier, blacklisted peers should be periodically removed from this set.
- A sync peer shall be selected from `available_sync_servers`, instead of syncing with the first peer to respond.
- `AdvertiseBlock` vs `AdvertiseQC`: we need both, the former for notifying about server availability and commitment to providing blocks up to a given height, and the latter for the block sync trigger.
- A `start_with_sync` API for the `Replica` in addition to the `start` method, or a sync function which, just like `initialize`, shall be called before `start`.

Unrelated to sync, but Alice raised a point that `ReplicaSpec::initialize` should be a method of `KVStore` rather than `ReplicaSpec`, since it initializes the key value store.
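As referenced above, a minimal sketch of the `available_sync_servers` bookkeeping (the expiry window and types are assumptions for illustration):

```rust
use std::collections::{HashMap, HashSet};
use std::time::{Duration, Instant};

/// Assumed expiry window; the real value would be a configuration parameter.
const ADVERTISE_EXPIRY: Duration = Duration::from_secs(60);

type PeerId = [u8; 32];

struct AvailableSyncServers {
    last_advertised: HashMap<PeerId, Instant>,
    blacklist: HashSet<PeerId>,
}

impl AvailableSyncServers {
    /// Record an AdvertiseBlock sender as available (unless blacklisted).
    fn on_advertise_block(&mut self, peer: PeerId) {
        if !self.blacklist.contains(&peer) {
            self.last_advertised.insert(peer, Instant::now());
        }
    }

    /// Drop peers whose last advertisement has expired.
    fn prune_expired(&mut self) {
        self.last_advertised
            .retain(|_, sent_at| sent_at.elapsed() < ADVERTISE_EXPIRY);
    }
}
```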
I will also update the design in the HackMD doc to reflect the above changes.
Changelog: HackMD (Replication Section).
Related issue: parallelchain-io/parallelchain-protocol#10