hyperledger-labs / SmartBFT

Implementation of the SmartBFT consensus library (https://arxiv.org/abs/2107.06922)
Apache License 2.0
97 stars 27 forks source link

Quorum calculation #83

Closed pfi79 closed 5 years ago

pfi79 commented 5 years ago
  1. there is no check for n <4, for n = 3 q (quorum) returns 2
  2. the quorum is calculated by the formula (n + f + 1) / 2, and not by the formula 2f + 1; see Table (q1 - quorum by the formula (n + f + 1) / 2, q2 -> 2f + 1, q3 -> n-f, q4 -> (n + f) / 2 + 1)
n f q1 q2 q3 q4
3 0 2 - - -
4 1 3 3 3 3
5 1 4 3 4 4
6 1 4 3 5 4
7 2 5 5 5 5
8 2 6 5 6 6
9 2 6 5 7 6
10 3 7 7 7 7
  1. "From Byzantine Consensus to BFT State Machine Replication: A Latency-Optimal Transformation". Algorithm 3 There is a comparison with: f (line 15), 2f (line 19), n - f (line 28, 35)
C0rWin commented 5 years ago

Thanks for your comments.

NOTE for N=3f+1 substituting into (N + f + 1)/2 gives you exactly 2f+1.

pfi79 commented 5 years ago

Thanks.

  1. "For any arbitrary value of N, assuming f Byzantine failures and quorum of size q you would like to make sure that intersection will contain at least one non-faulty node.". Consequently, faulty nodes may participate in the quorum. Then the number of non-faulty nodes in the quorum is >= (n + f + 1) / 2 - f = (n - f + 1) / 2. You expect (n + f + 1) / 2 messages only from the non-faulty nodes. You send only the correct messages, but this does not guarantee that only the correct messages will be received. And the sending node may consider itself non-faulty, but not as such.

  2. Yes you are right, I meant it to him. What document or algorithm will you implement State Machine Replication?

C0rWin commented 5 years ago

I'm not sure, what are you trying to imply. Given f number of faulty nodes, non-faulty is indeed n-f. I do not really care about number of non-faulty nodes in the quorum, all I care that looking into intersection I will find at least one honest. And you are not correct, (n + f + 1)/2 is not only from correct nodes, it's from (n + f + 1)/2 arbitrary nodes. But since quorum size satisfies inequality 2q >= n + f + 1, I'm guaranteed to find at least one honest node, intersecting pre-prepare and prepare phases quorums, therefore if someone tries to lie me during either of these stages algorithm will detect it.

For state machine replication we are using variation of PBFT.

pfi79 commented 5 years ago

Thank

  1. Here is the code "SmartBFT-Go / consensus / blob / master / internal / bft / view.go"

    for collectedDigests < v.Quorum-1 {
        select {
        case <-v.abortChan:
            return
        case vote := <-v.prepares.votes:
            prepare := vote.GetPrepare()
            if prepare.Digest != expectedDigest {
                // TODO is it ok that the vote is already registered?
                v.lock.RLock()
                seq := v.ProposalSequence
                v.lock.RUnlock()
                v.Logger.Warnf("Got wrong digest at processPrepares for prepare with seq %d, expecting %v but got %v, we are in seq %d", prepare.Seq, expectedDigest, prepare.Digest, seq)
                continue
            }
            collectedDigests++
        }
    }

    and

    for len(signatures) < v.Quorum-1 {
        select {
        case <-v.abortChan:
            return nil
        case vote := <-v.commits.votes:
            commit := vote.GetCommit()
            if commit.Digest != expectedDigest {
                v.Logger.Warnf("Got wrong digest at processCommits for seq %d", commit.Seq)
                continue
            }
    
            err := v.Verifier.VerifyConsenterSig(types.Signature{
                Id:    commit.Signature.Signer,
                Value: commit.Signature.Value,
                Msg:   commit.Signature.Msg,
            }, *proposal)
            if err != nil {
                v.Logger.Warnf("Couldn't verify %d's signature: %v", commit.Signature.Signer, err)
                continue
            }
    
            signatures[commit.Signature.Signer] = types.Signature{
                Id:    commit.Signature.Signer,
                Value: commit.Signature.Value,
                Msg:   commit.Signature.Msg,
            }
        }
    }

    In these examples, the received message counter is incremented only if it is the correct message.

  2. You can find out which variation of PBFT you are using for state machine replication. I thought it was "smart-bft".

C0rWin commented 5 years ago

In these examples, the received message counter is incremented only if it is the correct message.

Right, if no required amount of signatures is collected we will hit timeout. Are you pointing to exact problem or suggesting an optimization? What are the question you are asking here?

I thought it was "smart-bft".

Indeed we use BFT-Smart, while note that BFT-Smart refers to BFT algorithm as a black box, specifically it uses VP-Consensus, without stating exact implementation of the consensus algorithm, but rather suggesting options.

Here is the citation from the paper:

The general architecture of a replica is described in Figure 2. MOD-SMART is built on top of a reliable and authen- ticated point-to-point communication substrate and a VP- Consensus implementation.

where

VP-Consensus implementation offers the following interface:

  • VP-Propose(i, l, γ, v): proposes a value v in consensus instance i, with initial leader l and predicate γ;
  • VP-Decide(i,v,Γ): triggered when value v with proof Γ is decided in consensus instance i;
  • VP-Timeout(i, l): used to trigger a timeout in the con- sensus instance i, and appoint a new leader process l.

I would strongly recommend to thoroughly read the referenced paper to understand all the details.

pfi79 commented 5 years ago

Thank,

Are you pointing to exact problem or suggesting an optimization? What are the question you are asking here?

I believe that there is a contradiction between quorum theory (it's from (n + f + 1)/2 arbitrary nodes) and practice (if no required amount of signatures is collected we will hit timeout. - i.e. correct messages).

The general architecture of a replica is described in Figure 2. MOD-SMART ....

this quote from this document - "From Byzantine Consensus to BFT State Machine Replication: A Latency-Optimal Transformation".

Sorry, small addition: BFT-Smart refers to BFT algorithm as a gray box

The result is a transformation that adds at most one communication step to implement total order broadcast, thus matching the number of communication steps of PBFT at the cost of using such gray-box consensus abstraction.

C0rWin commented 5 years ago

I believe that there is a contradiction between quorum theory (it's from (n + f + 1)/2 arbitrary nodes) and practice (if no required amount of signatures is collected we will hit timeout. - i.e. correct messages).

Fortunately for us there is no contradiction. Given n=3f+1, substituting into formula (3f + 1 + f + 1)/2 we will reach desired 2f+1. Now, why 2f+1 is not working for general case, consider example of n=9. We would to let q be of the minimal possible size such that intersection between pre-prepare and prepare stages will have at least 1 honest node, assuming f failures. Obviously q < 5 won't work since at worst case there won't be any intersection. Let's take a look at q=5 as suggested by 2f+1 formula, there must be at least one node in intersection, but for n=9 we have f=2 which means q=5 is not enough, hence 2f+1 won't work. Now, as suggested by (n + f + 1)/2 let take q=6, hence we will get 3 nodes in intersection, which guaranties us to have 1 honest node for sure.

As for timeouts, there is no point to count negative answers, since they might be provided by adversaries, hence at any rate you need to wait for q responses or wait for timeout to hit. At any rate there is no contradiction.

Sorry, small addition: BFT-Smart refers to BFT algorithm as a gray box

That's fine, but that's wasn't my point, BFT Smart is in a sense meta algorithm which operates with BFT algorithm primitive that satisfies given API.

C0rWin commented 5 years ago

@pfi79 did I answer your concerns?