chainflip-io / chainflip-backend

The Chainflip backend repo, including the Chainflip Node and CFE.
50 stars 14 forks source link

[RC 1.6.4] #5244

Closed dandanlen closed 1 month ago

dandanlen commented 1 month ago

Replaces #5243

albert-llimos commented 1 month ago

(referring to the change in the election consensus not requiring unanimity) While my initial reaction was thinking that we should do this, I'm not so convinced anymore. I think the reason Alastair implemented it with all_same is because the nonce mechanism is critical to the system and we really want to be totally sure that a nonce has changed before updating in the State Chain. Updating it with a wrong or old value would lead to an invalid transaction, which is extremely problematic in, for instance, a rotation transaction. The downside of course is that one node could then keep reporting the wrong node and prevent the node from being updated. While I don't like this downside, when it comes to nonces I'd always prefer to be on the side of safety and compromise "liveliness" over signing invalid transactions. For example, the current situation is a lot easier to recover from than if we had some invalid transactions, especially if we were to hit a rotation.

dandanlen commented 1 month ago

@albert-llimos

If we require unanimity then a single validator can prevent any nonce from being updated...

albert-llimos commented 1 month ago

If we require unanimity then a single validator can prevent any nonce from being updated...

I acknowledged that in my comment. My point was that there was most certainly a reason why Alastair implement it that way (safety) and it's worth considering if there is some edge case that he thought about in which we would want unanimity. If we can't think of one or we just don't think we require more safety, then that's fine.

dandanlen commented 1 month ago

If we require unanimity then a single validator can prevent any nonce from being updated...

I acknowledged that in my comment. My point was that there was most certainly a reason why Alastair implement it that way (safety) and it's worth considering if there is some edge case that he thought about in which we would want unanimity. If we can't think of one or we just don't think we require more safety, then that's fine.

Maybe Alastair made a mistake? I don't think it adds more safety - it introduces a DoS vector.

I think the concern is that the validators might agree on some 'old' nonce, but requiring unanimity doesn't fix that: once we accept that a majority of validators might be wrong, we have to accept that they might all be wrong. I think the ultimate solution would be something like requiring them to witness (slot, nonce) and to require both a super-majority consensus on the nonce and that the consensus slot number increases.

codecov[bot] commented 1 month ago

Codecov Report

Attention: Patch coverage is 0% with 9 lines in your changes missing coverage. Please review.

Project coverage is 70%. Comparing base (35b2bdb) to head (d836f68). Report is 3 commits behind head on release/1.6.

Files with missing lines Patch % Lines
...llets/cf-elections/src/electoral_systems/change.rs 0% 9 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## release/1.6 #5244 +/- ## =========================================== - Coverage 70% 70% -0% =========================================== Files 481 481 Lines 86261 86269 +8 Branches 86261 86269 +8 =========================================== - Hits 60427 60424 -3 - Misses 22545 22553 +8 - Partials 3289 3292 +3 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

albert-llimos commented 1 month ago

I think the concern is that the validators might agree on some 'old' nonce, but requiring unanimity doesn't fix that: once we accept that a majority of validators might be wrong, we have to accept that they might all be wrong. I think the ultimate solution would be something like requiring them to witness (slot, nonce) and to require both a super-majority consensus on the nonce and that the consensus slot number increases.

Makes sense. That was probably the concern, and I'd say it makes sense because in Solana it's not uncommon for nodes to fall behind, which is what your ultimate solution would address. But anyway yes, you're right, this PR change is good.