stacks-network / stacks-core

The Stacks blockchain implementation
https://docs.stacks.co
GNU General Public License v3.0

Multi-miner, Signer block rejection #5132

Closed saralab closed 1 week ago

saralab commented 4 weeks ago

Miner 2 proposes block #156

All miners see burn block #134

Miner 0 (correctly) builds from block 155 and proposes block #156 to the signers

This repeats forever

Discussion:

Miner doesn’t properly track whether a block rejection already came from a signer

saralab commented 4 weeks ago

@jferrant Will draft a proposal and review with the team.

jcnelson commented 4 weeks ago

I think we flubbed the logic where signers learn whether or not a block proposal was accepted or rejected by the network. I think the acceptance process needs to look like this:

As before, a miner will continuously try to build atop one of the highest definitively accepted blocks, and continue to do so in the face of timeouts and rejections. The miner's p2p and relayer threads work in the background to sync with the signers' nodes to ensure that the miner has the same blocks (and same highest definitively accepted block) as the signers.

jcnelson commented 4 weeks ago

A block is instead "tentatively" (or "locally") accepted or rejected, until it becomes "definitively" (or "globally") accepted or rejected. A signer may treat multiple blocks at the same height as tentatively accepted or rejected.

This is something that I think needs some more elaboration. In the example above, signer 0 tentatively rejects the miner's first block because it saw burn block 134 before seeing the proposal. Signers 1 and 2 tentatively accept, but because they don't receive a threshold of signatures (never mind signer 0's rejection), their decisions aren't definitive.

Once all miners see burn block 134, the miner retries building the block. Signers 1 and 2 would now also tentatively accept the block. Once signer 0's acceptance is received, all signers definitively accept.


Here's an interesting question -- can we ensure that at most one block will be definitively accepted at a given block height? Because, what I've written above is insufficient -- it's possible that multiple tentatively accepted blocks at the same height can become definitively accepted.

We can change this to an "at most one" criterion if we're willing to add an extra round of communication:

In essence, this is a flavor of two-phase commit. We'd need the miner to instruct signers to commit to their validated blocks.

How might we achieve this? I think the answer is straightforward -- we just put the miner's round-2 signature into the block header.

EDIT: This assumes that the miner itself won't equivocate. I'll address that below.

jcnelson commented 4 weeks ago

What happens if the miner equivocates and signs and broadcasts two or more blocks at the same height in round 2? I think what must happen here at a minimum is signers that witness the equivocation will cease signing blocks from that miner for the rest of its tenure. Then, the next miner picks one of them and builds atop it.

This is straightforward to implement -- because the node will gladly process two Nakamoto blocks at the same height, it's easy to check to see if they were signed by the same miner. The node itself would track whether or not a tenure has two or more blocks signed at the same height, and report it to the signer.

If we're feeling adventurous, we could also slash the offending miner's coinbase. But, the above can ship after Nakamoto, since it's ultimately a signer policy choice to refuse to sign.

EDIT: If we're feeling less adventurous, but still want to make the miner suffer for its equivocation after Nakamoto ships, we can have the signer refuse to sign blocks originating from tenure block-commits coming from that equivocating miner, which forces the miner to re-register a VRF key and re-submit multiple block-commits before they can mine again.

jferrant commented 4 weeks ago

What happens if the miner equivocates and signs and broadcasts two or more blocks at the same height in round 2? I think what must happen here at a minimum is signers that witness the equivocation will cease signing blocks from that miner for the rest of its tenure. Then, the next miner picks one of them and builds atop it.

This is straightforward to implement -- because the node will gladly process two Nakamoto blocks at the same height, it's easy to check to see if they were signed by the same miner. The node itself would track whether or not a tenure has two or more blocks signed at the same height, and report it to the signer.

One question I have is, should the node even process two Nakamoto blocks at the same height? Is it better for the chainstate to reject it outright and not even process it or is there a valid reason why we would ever want to process two blocks at the same height?

jcnelson commented 4 weeks ago

One question I have is, should the node even process two Nakamoto blocks at the same height? Is it better for the chainstate to reject it outright and not even process it or is there a valid reason why we would ever want to process two blocks at the same height?

The Nakamoto chainstate DB must tolerate Nakamoto blocks at the same height because they can arise from Bitcoin forks.

jferrant commented 4 weeks ago

One question I have is, should the node even process two Nakamoto blocks at the same height? Is it better for the chainstate to reject it outright and not even process it or is there a valid reason why we would ever want to process two blocks at the same height?

The Nakamoto chainstate DB must tolerate Nakamoto blocks at the same height because they can arise from Bitcoin forks.

Ah, of course. So additional logic would be required on the node's side to ensure a miner was not punished for simply responding to a Bitcoin fork, i.e. the burnchain consensus hash for the proposed block would have to be identical between the two proposed blocks signed by the same miner, yeah? EDIT: but this does make me wonder... in the case of a Bitcoin fork, how would signers handle this? I.e. what if they signed a block built on a Bitcoin fork? The subsequent Stacks block: what should it look like, and how should the signers handle it? I am sure we have some handling in place, but I don't think I have ever actually thought it through, and I am wondering how/if these changes would affect it.

obycode commented 4 weeks ago

Couldn't round 2 just be the miner mining the next block? It's already implicitly selecting the canonical block by mining the next block and proposing it, isn't it?

jcnelson commented 4 weeks ago

Yes, that's correct -- round 2 is a logically distinct round in the protocol, but in practice it can be (and is) piggybacked onto the next block's round 1 by way of building a block that acknowledges it as its parent.

Part of the point I'm trying to make is that we currently do not process (logical) round 2 correctly. Signers do not treat a miner's proposal for block N+1 as a "commit block N" message; instead, they eagerly and unconditionally commit blocks for which they observe a threshold of signatures and for which they have not yet witnessed a conflicting block N within a locally-determined timeout. Logically speaking, we need to make it so signers wait to accept block N until after they see a valid proposal for block N+1, since the proposal provides a signer-verifiable proof that the miner has acknowledged at least 70% of the signing power (i.e. a miner-signed header with a parent_block_id derived from a valid signature set). In our implementation, this would be achieved by a signer fork-choice rule -- even if signers' nodes eagerly and unconditionally process multiple blocks at height N regardless of whether or not they conflict, the signer would not treat a block at height N as part of any fork until it witnesses a valid proposal for block N+1.
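To make the fork-choice rule above concrete, here is a minimal Rust sketch of a signer-side view in which a block at height N is stored once it has a threshold of signatures, but only counts as part of a fork after a proposal for height N+1 names it as its parent. The `Header` and `ForkView` types and their fields are hypothetical and are not taken from stacks-core. The sketch also assumes the proposal has already been validated (miner signature, signature set), which it does not attempt to check.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified block header: only the fields this sketch needs.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Header {
    pub block_id: u64,
    pub parent_block_id: u64,
    pub height: u64,
}

/// Hypothetical signer-side view of blocks that reached the signature threshold.
#[derive(Default)]
pub struct ForkView {
    /// Blocks with a threshold of signatures, keyed by block ID.
    signed: HashMap<u64, Header>,
    /// Height -> ID of the block confirmed at that height.
    confirmed: HashMap<u64, u64>,
}

impl ForkView {
    /// Store a block that has a threshold of signatures. It stays
    /// unconfirmed until a proposal builds on it.
    pub fn add_signed_block(&mut self, header: Header) {
        self.signed.insert(header.block_id, header);
    }

    /// Treat an (already validated) proposal for height N+1 as a
    /// "commit block N" message. Returns false if the parent is unknown,
    /// the heights do not line up, or a different block is already
    /// confirmed at the parent's height.
    pub fn observe_proposal(&mut self, proposal: &Header) -> bool {
        let (parent_id, parent_height) = match self.signed.get(&proposal.parent_block_id) {
            Some(p) => (p.block_id, p.height),
            None => return false,
        };
        if parent_height + 1 != proposal.height {
            return false;
        }
        match self.confirmed.get(&parent_height).copied() {
            Some(existing) if existing != parent_id => false,
            _ => {
                self.confirmed.insert(parent_height, parent_id);
                true
            }
        }
    }

    /// A block is part of a fork only once a proposal has confirmed it.
    pub fn is_confirmed(&self, block_id: u64) -> bool {
        self.signed
            .get(&block_id)
            .map_or(false, |h| self.confirmed.get(&h.height) == Some(&block_id))
    }
}
```

In this sketch, two signed siblings at the same height both stay unconfirmed until one proposal arrives; a later proposal that builds on the other sibling is refused.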

jcnelson commented 4 weeks ago

Also, per a separate conversation with @jferrant, it's worth mentioning that the "at most one block at height N" rule only applies within a single Bitcoin fork. This is what SIP-021 calls for.

While one day it could be possible to have at-most-one block semantics globally, that would require dealing with the case where a tenure-change happens to land in a Bitcoin block that gets orphaned (which SIP-021 does not require us to do).

obycode commented 4 weeks ago

There is a case that we recently fixed in which the behavior would need to be changed:

  1. Miner proposes block N
  2. Signers sign block N, reaching the acceptance threshold
  3. A communication problem causes miner to time out waiting for signatures
  4. Miner proposes block N'

The current solution to this problem is that the signers can broadcast block N as soon as they see that it has reached the acceptance threshold. The signers then reject the proposed N'. The miner eventually receives the signed block N via the network and then proposes block N+1.

With this proposal to solve this new problem, the signers would no longer broadcast block N but would instead accept block N'.

If this situation happens at the tenure boundary, then the next miner would have the option to build from N or N'.

jcnelson commented 4 weeks ago

With this proposal to solve this new problem, the signers would no longer broadcast block N but would instead accept block N'.

Signers should continue to store and broadcast both N and N' if they have reached the signature threshold. However, signers do not believe that either N or N' is the chain tip until they see a valid proposal for N+1. That is, N and N' are "unconfirmed blocks." The proposal for N+1 confirms either N or N', and the other blocks that were not confirmed at height N will be treated as unconfirmed forever. If a miner submits two conflicting proposals for N+1 -- one that confirms N and one that confirms N' -- then signers that observe both proposals declare that the miner is malicious and refuse to sign any more blocks from it.

For example, here is a valid chain history under these rules. B[i] is a block, and i is the order in which it was produced.

N     B[0]
      |
      |-----.
      V     V
N+1   B[1]  B[2]
      |
      |
      V
N+2   B[3]
      |
      |-----.-----.-----.
      V     V     V     V
N+3   B[4]  B[5]  B[6]  B[7]
                  |
                  |
                  V
N+4               B[8]

The canonical chain is B[8] - B[6] - B[3] - B[1] - B[0]. The miner and signers are allowed to create sibling blocks, but once a sibling at height N is confirmed by a valid proposal, then no other blocks at that height can be built upon.

By contrast, here is an invalid history:

N     B[0]
       |
       |-----.
       V     V
N+1   B[1]  B[2]
       |     |
       |     X (invalid -- B[1] is confirmed)
       V     |
N+2   B[3]  B[5] (never processed)
       |
       |
       V
N+3   B[4]

Once the miner submits the proposal for B[5], the signers not only reject it, but also refuse to sign anything else the miner submits.

As before, the miner can produce as many blocks at height N as it needs to in order to build a block that has 70% signing power. But once the miner moves on, they cannot go back.
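The two diagrams above can be checked mechanically. The following Rust sketch is only an illustration: the `Block` tuple layout and both function names are made up for this example. It encodes each block as (index, parent index, height offset from N), checks that at most one block per height is ever built upon, and walks parent links from a tip to recover the canonical chain.

```rust
use std::collections::HashMap;

/// Hypothetical encoding: (block index i of B[i], parent index, height offset from N).
pub type Block = (u32, Option<u32>, u32);

/// Returns true if at most one block at each height has children,
/// i.e. once a sibling is confirmed, no other sibling is built upon.
pub fn is_valid_history(blocks: &[Block]) -> bool {
    let height_of: HashMap<u32, u32> = blocks.iter().map(|&(id, _, h)| (id, h)).collect();
    // Height -> the one block at that height that has been built upon.
    let mut built_upon: HashMap<u32, u32> = HashMap::new();
    for &(_, parent, _) in blocks {
        let parent_id = match parent {
            Some(p) => p,
            None => continue, // B[0] has no parent in this sketch
        };
        let parent_height = match height_of.get(&parent_id) {
            Some(h) => *h,
            None => return false, // unknown parent
        };
        let confirmed = *built_upon.entry(parent_height).or_insert(parent_id);
        if confirmed != parent_id {
            return false; // a second sibling at this height was built upon
        }
    }
    true
}

/// Walks parent links from `tip` back to the first block.
pub fn canonical_chain(blocks: &[Block], tip: u32) -> Vec<u32> {
    let parent_of: HashMap<u32, Option<u32>> =
        blocks.iter().map(|&(id, p, _)| (id, p)).collect();
    let mut chain = vec![tip];
    let mut cur = tip;
    while let Some(&Some(p)) = parent_of.get(&cur) {
        chain.push(p);
        cur = p;
    }
    chain
}
```

Feeding in the first diagram yields a valid history whose chain from B[8] is B[8], B[6], B[3], B[1], B[0]; the second diagram is rejected because both B[1] and B[2] end up with children.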


We can get to a place where we get at most one block produced at height N, but for now, it would suffice that we have at most one block accepted at height N. Most of the time, there won't be siblings.

jcnelson commented 4 weeks ago

Ah, of course. So additional logic would be required on the node's side to ensure a miner was not punished for simply responding to a Bitcoin fork, i.e. the burnchain consensus hash for the proposed block would have to be identical between the two proposed blocks signed by the same miner, yeah? EDIT: but this does make me wonder... in the case of a Bitcoin fork, how would signers handle this? I.e. what if they signed a block built on a Bitcoin fork? The subsequent Stacks block: what should it look like, and how should the signers handle it? I am sure we have some handling in place, but I don't think I have ever actually thought it through, and I am wondering how/if these changes would affect it.

I don't think the node needs to be involved in punishment at all, unless we intend to slash their coinbase (I don't think this is necessary for Nakamoto; forcing the miner to rotate their Bitcoin keys is usually harsher). I think this is a decision that each signer makes locally based on whether or not they observed the miner equivocate. In my diagram above, the signers who see the proposal for B[5] after B[3] has been accepted would decide to punish the miner by refusing to sign any more blocks from it.

The node already tracks each Stacks fork atop each Bitcoin fork, so the signer can detect miner equivocation for block N+1 simply by asking the node for the list of processed block headers at height N+1. If they all have the same parent, then there's no equivocation. Otherwise, there is equivocation, and block N+1 should be rejected and the miner punished.
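A minimal Rust sketch of that check, assuming the node can return the processed headers at height N+1. `ProcessedHeader`, its fields, and `miner_equivocated` are hypothetical names; no such endpoint or type is implied to exist in the Stacks node today. Filtering by consensus hash is an added assumption that follows the earlier point about not punishing a miner for reacting to a Bitcoin fork.

```rust
use std::collections::HashSet;

/// Hypothetical summary of a processed block header at height N+1.
pub struct ProcessedHeader {
    /// Identifies the tenure (and so the burnchain view) the block belongs to.
    pub consensus_hash: [u8; 20],
    /// ID of the block at height N that this block builds on.
    pub parent_block_id: [u8; 32],
}

/// Returns true if, within the given tenure, the headers at height N+1
/// name more than one distinct parent.
pub fn miner_equivocated(headers: &[ProcessedHeader], tenure: &[u8; 20]) -> bool {
    let parents: HashSet<&[u8; 32]> = headers
        .iter()
        // Blocks from another tenure or Bitcoin fork are not evidence.
        .filter(|h| &h.consensus_hash == tenure)
        .map(|h| &h.parent_block_id)
        .collect();
    parents.len() > 1
}
```

Sibling blocks that share a parent do not trip the check, which matches the rule that a miner may retry at the same height as often as it needs to.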

I think there needs to be an API endpoint for the above in the Stacks node, but I think that most of the work to make this all happen is changing the signer behavior.

kantai commented 3 weeks ago

I think that there are two separate issues here:

  1. The tolerances and timings for rejecting submissions should be updated to minimize the likelihood of this occurring -- this is ultimately a scenario that the network should try to avoid.
  2. As Jude discusses here, the signer logic for how it treats locally signed blocks needs to be updated with a "tentatively accepted" state and a "rejected" state (possibly a "tentatively rejected" state as well, but I think that's probably unnecessary).

(1) is theoretically easier to solve, but solving it doesn't mean (2) doesn't need to be solved.

Anyway, I think the strategy for (2) could be fairly straightforward.

Basically, the signer db tracks proposals in one of four states:

Proposed -- the proposal was received from the miner, has passed the initial set of checks and is waiting for a response from the stacks-node proposal evaluation endpoint. I think this is basically unchanged from the current implementation.

Rejected -- if the stacks-node has locally rejected the proposal, or (set-size) - (threshold) + 1 signers have rejected the proposal.

Tentatively accepted -- the stacks-node has locally accepted the proposal, and the signer has broadcast a signature.

Globally Accepted -- (threshold) signers have accepted the proposal.

Tentatively accepted transitions to Globally Accepted or Rejected if and only if the signer receives enough proposal responses from other signers to perform the transition (I think this is slightly different than Jude's proposal above, which transitions to rejected when the tenure changes: I'll discuss why in a moment).
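The two thresholds are complementary: once (set-size) - (threshold) + 1 signers have rejected, fewer than (threshold) signers remain, so acceptance is no longer reachable. Here is a small Rust sketch of that tally. It assumes every signer carries equal weight, which is a simplification, and `GlobalDecision` and `global_decision` are hypothetical names.

```rust
/// Hypothetical outcome of tallying signer responses for one proposal.
#[derive(Debug, PartialEq, Eq)]
pub enum GlobalDecision {
    Accepted,
    Rejected,
    Undecided,
}

/// Tally responses, assuming one vote per signer (real signers are weighted).
pub fn global_decision(accepts: u32, rejects: u32, set_size: u32, threshold: u32) -> GlobalDecision {
    // Smallest number of rejections that makes `threshold` accepts impossible.
    let reject_threshold = set_size.saturating_sub(threshold) + 1;
    if accepts >= threshold {
        GlobalDecision::Accepted
    } else if rejects >= reject_threshold {
        GlobalDecision::Rejected
    } else {
        GlobalDecision::Undecided
    }
}
```

For example, with 10 equally weighted signers and a threshold of 7, four rejections settle the outcome, while six accepts plus three rejects leave the proposal undecided.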

These states are really important to the signer when it is evaluating subsequent proposals. I think the rules should be something like:

  1. If a block proposal is in the same tenure as a prior proposal, its height must be greater than the highest tentatively accepted block known to the signer. Until the tentatively accepted block is rejected by the signer set, the signer will not accept a sibling in the same tenure.
  2. If a block proposal is in a new tenure, its height must be greater than the highest globally accepted block.

Otherwise, I don't think the signer needs more complex logic. This guarantees the signer set never approves a sibling in the same tenure: a sibling would only ever be approved once a prior proposal is actually rejected (and an honest signer only ever responds ACCEPT or REJECT once for a proposal). It does mean that a given tenure could "stall" if there's not agreement in the signer set, but I think this is what should happen anyways. Siblings could occur across tenures, but that was already the case.

jcnelson commented 3 weeks ago

On the miner / node side of things, the following would need to change:

jferrant commented 3 weeks ago

Tentatively accepted transitions to Globally Accepted or Rejected if and only if the signer receives enough proposal responses from other signers to perform the transition (I think this is slightly different than Jude's proposal above, which transitions to rejected when the tenure changes: I'll discuss why in a moment).

Just to confirm, is it also possible for Rejected to transition to Globally Accepted? I assume a stacks-node could have an outdated view and reject a block whereas all other signers approve it, yeah? If this is the case, I would introduce a TentativelyRejected state to distinguish between the threshold signature rejection and the node marking it invalid.

kantai commented 3 weeks ago

Tentatively accepted transitions to Globally Accepted or Rejected if and only if the signer receives enough proposal responses from other signers to perform the transition (I think this is slightly different than Jude's proposal above, which transitions to rejected when the tenure changes: I'll discuss why in a moment).

Just to confirm, is it also possible for Rejected to transition to Globally Accepted? I assume a stacks-node could have an outdated view and reject a block whereas all other signers approve it, yeah? If this is the case, I would introduce a TentativelyRejected state to distinguish between the threshold signature rejection and the node marking it invalid.

This might help with debuggability, and it's probably safer to do this to future-proof the signer's logic, but I don't think it's strictly necessary. Because the checks that the signer is performing are all based on block height (and it performs "greater than" checks), the signer will just move on if the rest of the signer set ends up accepting the proposal.

jferrant commented 3 weeks ago

This might help with debuggability, and it's probably safer to do this to future-proof the signer's logic, but I don't think it's strictly necessary. Because the checks that the signer is performing are all based on block height (and it performs "greater than" checks), the signer will just move on if the rest of the signer set ends up accepting the proposal.

Ah this is true...see I was thinking that once a block proposal is marked as GloballyRejected it should NEVER transition to GloballyAccepted as this would indicate some sort of bug or some malicious behaviour as signers should never respond with different answers to a repeat block (however a LocallyRejected block could very much transition to a GloballyAccepted block). However, this could cause a stall so perhaps better to just allow this to potentially happen?

jferrant commented 3 weeks ago

TLDR for @saralab: Signers must continue to process block proposals and submit their acceptance and rejection signatures accordingly. However, the signer must be updated to recognize the difference between its local view and the global view of the network. A signer may only mark a block definitively accepted or rejected when it observes that a global decision has been made, specifically that the threshold number of rejections or signatures has been reached. To prevent forks within a tenure, the signer set will never approve a sibling block within the same tenure by ensuring the block proposal builds atop the highest accepted block: a sibling would only ever be approved once a prior proposal is actually rejected. It does mean that a given tenure could "stall" if there is no agreement in the signer set and a miner’s tenure effectively ends as it can never propose a valid block. However, at the tenure boundary, the signer can utilize the last globally accepted block of the parent tenure to determine whether the proposed block is valid, preventing the stall from propagating into the next tenure. Therefore, siblings could occur across tenures, but this is expected and acceptable behaviour.

jferrant commented 3 weeks ago

On the signer side of things (Stolen from @kantai Primarily :P )

The signer would add the following block states to SignerDB:

Proposed -- the proposal was received from the miner, has passed the initial set of checks and is waiting for a response from the stacks-node proposal evaluation endpoint.

LocallyAccepted -- the stacks-node has locally accepted the proposal, and broadcasted a signature, but does not yet have a (threshold) number of signatures confirming the block.

GloballyAccepted -- (threshold) signers have accepted the proposal.

LocallyRejected -- the stacks-node has locally rejected the proposal, or the proposal has failed the signer's initial set of checks.

GloballyRejected -- (set-size) - (threshold) + 1 signers have rejected the proposal.

LocallyAccepted and LocallyRejected can both transition to GloballyAccepted or GloballyRejected if and only if the signer receives enough proposal responses from other signers to perform the transition. Once a block is marked as GloballyAccepted or GloballyRejected, no further transitions may occur.
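The allowed transitions can be written down as a small state machine. This Rust sketch is illustrative only: `BlockState` mirrors the state names listed above, but `can_transition_to` and the exact transition table are assumptions. In particular, the sketch lets Proposed move only to a local state, because nothing above says whether Proposed may jump straight to a global state.

```rust
/// Block states as named in the list above (hypothetical Rust rendering).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BlockState {
    Proposed,
    LocallyAccepted,
    LocallyRejected,
    GloballyAccepted,
    GloballyRejected,
}

impl BlockState {
    /// Global states are final: no further transitions may occur.
    pub fn is_terminal(self) -> bool {
        matches!(self, BlockState::GloballyAccepted | BlockState::GloballyRejected)
    }

    /// Whether moving from `self` to `next` is allowed in this sketch.
    pub fn can_transition_to(self, next: BlockState) -> bool {
        use BlockState::*;
        match (self, next) {
            // The local decision follows the stacks-node validation result.
            (Proposed, LocallyAccepted) | (Proposed, LocallyRejected) => true,
            // Either local state may be overridden by the global outcome,
            // including LocallyRejected -> GloballyAccepted.
            (LocallyAccepted, GloballyAccepted)
            | (LocallyAccepted, GloballyRejected)
            | (LocallyRejected, GloballyAccepted)
            | (LocallyRejected, GloballyRejected) => true,
            // Everything else, including any move out of a global state.
            _ => false,
        }
    }
}
```

A signer whose node had an outdated view can therefore still follow the rest of the signer set, while a GloballyRejected block can never later become GloballyAccepted.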

Prior to querying the block validation endpoint a signer will evaluate a block with the following rules:

  1. If a block proposal is in the same tenure as a prior proposal, its height must be greater than the highest *Accepted block (LocallyAccepted or GloballyAccepted) known to the signer in that tenure.
  2. If a block proposal is in a new tenure, its height must be greater than the highest GloballyAccepted block in its parent's tenure.

NOTE: This change relies on the block proposal endpoint changes @jcnelson is handling.

kantai commented 3 weeks ago

My only comment on the above is that this:

If a block proposal is in the same tenure as a prior proposal, its height must be greater than the highest block Accepted block known to the signer. If a block proposal is in a new tenure, it’s height must be greater than the highest GloballyAccepted block.

Should be:

  1. If a block proposal is in the same tenure as a prior proposal, its height must be greater than the highest Accepted block known to the signer in that tenure.
  2. If a block proposal is in a new tenure, its height must be greater than the highest GloballyAccepted block in its parent tenure.
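As a rough illustration of the corrected rules, here is a Rust sketch of the height check a signer could run before querying the block validation endpoint. `TenureContext` and `passes_height_check` are hypothetical names, and the sketch assumes that a proposal passes when no accepted block is known yet for the relevant tenure.

```rust
/// Hypothetical context the signer looks up for an incoming proposal.
pub enum TenureContext {
    /// Same tenure as a prior proposal: height of the highest
    /// LocallyAccepted or GloballyAccepted block in that tenure, if any.
    SameTenure { highest_accepted: Option<u64> },
    /// New tenure: height of the highest GloballyAccepted block
    /// in the parent tenure, if any.
    NewTenure { highest_globally_accepted_in_parent: Option<u64> },
}

/// Returns true if the proposal's height is strictly greater than the
/// height floor implied by its tenure context.
pub fn passes_height_check(proposal_height: u64, ctx: &TenureContext) -> bool {
    let floor = match ctx {
        TenureContext::SameTenure { highest_accepted } => *highest_accepted,
        TenureContext::NewTenure {
            highest_globally_accepted_in_parent,
        } => *highest_globally_accepted_in_parent,
    };
    // Assumption: with no accepted block on record, nothing blocks the proposal.
    floor.map_or(true, |f| proposal_height > f)
}
```

With block 155 accepted in the current tenure, a proposal at height 156 passes and a sibling at height 155 does not. A new tenure is only measured against the parent tenure's GloballyAccepted blocks, so a stalled, merely locally accepted block does not hold it back.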