Open AlexHentschel opened 3 months ago
This tech debt is now creating additional complexity and need to resort to formal arguments to cover additional edge cases, as we want to optimize the consensus code for Cruise Control (👉 PR #6379).
The following is a foundational theorem to prove the safety of HotStuff. Beyond HotStuff, we utilize this Theorem in various places outside of consensus to simplify our implementation (e.g. queuing pending blocks for execution, verification, sealing, etc).
Theorem: For each view, there can be at most 1 certified block.
The overall safety argument for the current implementation (already quite involved with [1], [2], [3] ) now becomes even longer as we need to add Cruise Control into the consideration (PR #6379)
Copied issue from proprietary repo: https://github.com/dapperlabs/flow-go/issues/6900
Context
HotStuff's
EventHandler
uses theBlockProducer.MakeBlockProposal
to generate block proposals. At the moment, theBlockProducer
has its own independent logic for signing the proposals it produces. This is problematic for the following reasons:Consensus safety is guaranteed by the
SafetyRules
component. Specifically, this component should have the sole authority to vote. Otherwise, it is very hard to verify that the implementation satisfies all consensus rules.Note that the leader's signature on its own block proposal is technically also a vote (a vote from the proposer for its own block, which the protocol requires to be part of the block header (field
ProposerSigData
). The leader's signature must be structurally identical to a vote, because we also include it in the aggregation process to produce a QC.We had a case, where a node crashed at an unlucky time where it had already produced a block proposal and stored it in the data base, but HotStuff was not aware of it as the node crashed just before the
EventHandler
could persist its updated state.At the moment, subtle implementation details prevent a node under such circumstances from accidentally equivocating (double-proposing, which is a slashable protocol violation). However, these implementation details are not explicitly intended to prevent equivocation, nor is that aspect documented. Furthermore, this is a subtle interplay of various components that is very very hard for an engineer to notice.
ToDo
With a little more work, we can utilize the
EventHandler
'sSafetyRules
to generate the block signature. There is a bit of subtly around protectingSafetyRules
from potentially concurrent access, because we are handing it to external code.Suggestion
SafetyRules
could expose the methodSafetyRules.ProduceVote()
Thereby, we have all consensus compliance checks localized in
SafetyRules
remove method
Signer.CreateProposal
as this is no longer neededEventHandler
providesSafetyRules
as an argument to theBlockProducer.MakeBlockProposal( .. )
call.BlockProducer
is still part of the HoStuff package. Therefore, I think it is fine for it to adhere to the single-threading design of the HotStuff event loop.However, the
module.Builder
lives in a different package, for which reason I think we should try to decouple the logic as much as possible. Specifically, I think we should assume that within themodule.Builder
,SafetyRules
's signing functionality might be potentially accessed concurrently (most general assumption).We can guarantee concurrency safety, by having the
BlockProducer
wrap theSafetyRules
. The wrapper could contain an atomicint
that guarantees proper synchronization like this:BuildOn
call in theBlockProducer
: