ssvlabs / ssv

Secret-Shared-Validator(SSV) for ethereum staking
https://ssv.network
GNU General Public License v3.0

discussion: ways to improve Ethereum block proposer duty flow #1829

Open iurii-ssv opened 3 days ago

iurii-ssv commented 3 days ago

I'll post the findings from recent Discord discussion(s) here so we have them documented to revisit later (because it seems important).

The problem

1) It seems relays are blocking any proposals past the 4s mark of the current slot. Would it be possible to add a bypass mechanism after 4s, so that a non-MEV block is submitted (instead of a complete miss)?

2) Iurii mentioned that rounds 1 and 2 of our proposal duty split the 4s unevenly / sub-optimally. It might make sense to review this mechanism and its timings.

3) another point that's not 100% clear to me is why we start the proposer duty exactly at the start of the targeted slot. I understand the currently written code works this way, but is there some fundamental limitation (perhaps DVT-related) that forces us to do it like that? (answering myself: probably not)

from this article - https://www.blocknative.com/blog/anatomy-of-a-slot - it seems the start of targeted slot is the time where most blocks already get proposed (so they have enough time to spread through Ethereum network)

4s since slot start (call it "soft limit") also seems to be quite late/risky (otherwise the chart below would look different, I think). We might want to limit it to 2.5s instead, for example - and that would make round 2 unnecessary, btw

Before t = 0 there are safe proposals that ensure propagation happens and so you don't miss out on the slot. There are also a lot of actors that wait until very late, 2.5 seconds in, who are taking their chances with propagation delays. So that's the general distribution of block availability over time relative to that slot boundary. You can call it the t = 0 or t=12.

[image: chart of block availability over time relative to the slot boundary]
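A minimal sketch of the bypass asked about in point 1, assuming the cutoff is a fixed duration measured from slot start. The names `BlockSource`, `relayCutoff` and `chooseSource` are made up for illustration and are not taken from the SSV codebase; the 4s value is just the mark mentioned above.

```go
package main

import (
	"fmt"
	"time"
)

// BlockSource says which kind of block an operator should try to propose.
// Hypothetical type, not part of the SSV codebase.
type BlockSource int

const (
	SourceRelay BlockSource = iota // MEV block obtained through a relay
	SourceLocal                    // non-MEV block built by the local Beacon node
)

func (s BlockSource) String() string {
	if s == SourceRelay {
		return "relay (MEV)"
	}
	return "local (non-MEV)"
}

// relayCutoff is the 4s mark after which relays seem to reject proposals
// (assumption taken from the discussion above).
const relayCutoff = 4 * time.Second

// chooseSource returns SourceRelay while elapsed (time since slot start) is
// below cutoff, and SourceLocal from the cutoff onwards, so that a late duty
// still has a block to propose instead of missing the slot.
func chooseSource(elapsed, cutoff time.Duration) BlockSource {
	if elapsed < cutoff {
		return SourceRelay
	}
	return SourceLocal
}

func main() {
	for _, elapsed := range []time.Duration{time.Second, 3900 * time.Millisecond, 4 * time.Second} {
		fmt.Printf("%v into the slot: %v\n", elapsed, chooseSource(elapsed, relayCutoff))
	}
}
```

Passing the cutoff as a parameter keeps it easy to try a tighter limit (2.5s, for example) without touching the decision logic.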

Potential solution(s)

There is a trade-off between picking/broadcasting the proposed block sooner/later:

I'll record another round of thoughts I have here on these 1-3 findings from above, I think we can do something like this:

Will this work, or am I missing something ? If something like this could work, I guess it's not easy to implement straight away but we can slowly progress towards it. It seems like an important problem to solve. Regarding additional resource (cpu/mem/...) consumption, sure it will cost some - but I think we won't have to query Beacon node for 2 blocks at the same time much (if at all), and it seems like block proposals are quite rare to matter in that respect anyway ?

regarding external factors (beacon node, relay):

@Iurii I think what might be a problem is that an operator doesn't know if others successfully submitted. If they have then it will lead to slashing. And I think there's no reliable way to find this out unless we add another consensus layer on whether they submitted

are we talking about the slashable Ethereum offense known as double signing ?

I believe for block proposal (for attestations it's similar but somewhat different) it means Ethereum can punish a validator that has signed 2 different blocks (headers) for the same slot, if both of these blocks were observed by somebody (I think it might be called Watch-tower or fisher maybe), and that somebody created a proof of that and sent it out to Ethereum nodes to verify

so, if that's how Ethereum slashing for block production works, then in the approach I outlined above we never actually sign 2 different blocks, only 1 block will ever be signed. Of course we'll probably need to add/adjust some logic so that an SSV node only ever signs 1 block in the post-consensus phase, even though it might have 2 blocks at hand after finishing 2 QBFT consensuses prior to that. That way the post-consensus quorum needed to reconstruct the validator signature can be reached for at most 1 of the 2 blocks every SSV node prepared for the target slot.
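The "only ever sign 1 block per slot" rule could look roughly like the guard below. This is a sketch under assumptions: `ProposalGuard` and its methods are hypothetical names, the state lives in memory only (a real node would need to persist it across restarts), and it says nothing about how the SSV node actually does slashing protection today.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrConflictingBlock is returned when a different block was already approved
// for signing in the same slot.
var ErrConflictingBlock = errors.New("a different block was already signed for this slot")

// ProposalGuard remembers which block root this node agreed to sign per slot.
// Hypothetical type, in-memory only.
type ProposalGuard struct {
	mu     sync.Mutex
	signed map[uint64][32]byte
}

func NewProposalGuard() *ProposalGuard {
	return &ProposalGuard{signed: make(map[uint64][32]byte)}
}

// Allow records root as the block chosen for slot. It returns nil if nothing
// was recorded for the slot yet or if the same root is requested again, and
// ErrConflictingBlock if another root was already recorded.
func (g *ProposalGuard) Allow(slot uint64, root [32]byte) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	if prev, ok := g.signed[slot]; ok {
		if prev == root {
			return nil
		}
		return ErrConflictingBlock
	}
	g.signed[slot] = root
	return nil
}

func main() {
	guard := NewProposalGuard()
	backup, profitable := [32]byte{1}, [32]byte{2}

	fmt.Println("profitable:", guard.Allow(100, profitable))
	fmt.Println("backup, same slot:", guard.Allow(100, backup))
	fmt.Println("backup, next slot:", guard.Allow(101, backup))
}
```

With a check like this in front of the partial signing step, a node that ends up holding both a backup and a profitable block can only contribute its share to one of them.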

and as for who submits the signed block to the Ethereum network and how, I believe there aren't any issues with broadcasting such a block from multiple different Beacon nodes at the same time (or at different times). In fact it is probably better if we do multiple such broadcasts, because then the block will reach all Ethereum nodes sooner (so Ethereum validators are able to attest to it).
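To illustrate the "broadcast from multiple Beacon nodes" idea, here is a small sketch that submits the same signed block to several nodes concurrently and counts how many accepted it. The `Submitter` interface, `broadcastAll` and `fakeNode` are assumptions made up for this example; a real client call would also take a context with a timeout.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Submitter is a hypothetical stand-in for a Beacon node client that can
// publish an already signed block.
type Submitter interface {
	SubmitBlock(signedBlock []byte) error
}

// broadcastAll sends the same signed block to every node concurrently.
// It returns how many nodes accepted it, and an error only if none did.
func broadcastAll(nodes []Submitter, signedBlock []byte) (int, error) {
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		accepted int
		firstErr error
	)
	for _, node := range nodes {
		wg.Add(1)
		go func(n Submitter) {
			defer wg.Done()
			err := n.SubmitBlock(signedBlock)
			mu.Lock()
			defer mu.Unlock()
			if err == nil {
				accepted++
			} else if firstErr == nil {
				firstErr = err
			}
		}(node)
	}
	wg.Wait()
	if accepted == 0 {
		if firstErr == nil {
			firstErr = errors.New("no Beacon nodes configured")
		}
		return 0, firstErr
	}
	return accepted, nil
}

// fakeNode simulates a Beacon node that either accepts or rejects the block.
type fakeNode struct{ fail bool }

func (f fakeNode) SubmitBlock([]byte) error {
	if f.fail {
		return errors.New("node unavailable")
	}
	return nil
}

func main() {
	nodes := []Submitter{fakeNode{}, fakeNode{fail: true}, fakeNode{}}
	accepted, err := broadcastAll(nodes, []byte("signed block"))
	fmt.Println("accepted by", accepted, "nodes, err:", err)
}
```

Since every node receives the identical signed block, a partial failure is harmless here: one successful submission is enough for the block to start propagating.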

iurii-ssv commented 2 days ago

Another note: the profitable-block QBFT (call it the 2nd QBFT, or QBFT 2) should probably have exactly 1 round (not 2 or more), because otherwise we are not as profitable as we could be.

Alternative solution (pushing the approach from above to its limits)

To take a step back and explain why we do 2 QBFT consensuses: ideally, when proposing Ethereum blocks we want to have these 2 properties:

1) "proposing operator" rotation (important for preventing any particular operator from conducting funny business, like selling his privilege "to convince validator cluster to propose certain block(s), or kinds of blocks" through a side-channel). Note, this is different from QBFT Instance leader rotation, which exists only to ensure QBFT liveness; "proposing operator" rotation relies on it, kind of deriving this desired property from it.

2) every operator wants to know whether he needs to sign the profitable block or the backup block to successfully finish the duty. This is why we do the 2nd QBFT (on the profitable block in the algo described above): upon commit-quorum every operator knows what the other operators (honest ones, to be precise) are going to do and can broadcast the correct block of the 2 for BLS-signing.
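My reading of property 2, as a tiny sketch: each operator picks the block to sign purely from which QBFT instances reached commit-quorum, so all honest operators land on the same choice. `Choice` and `chooseBlock` are hypothetical names, and the rule itself is an assumption about the approach, not existing SSV behaviour.

```go
package main

import "fmt"

// Choice is what an operator signs in the post-consensus phase.
type Choice int

const (
	SignNothing    Choice = iota // neither QBFT instance decided in time
	SignBackup                   // only the backup-block QBFT (QBFT 1) decided
	SignProfitable               // the profitable-block QBFT (QBFT 2) decided
)

// chooseBlock prefers the profitable block whenever its QBFT instance reached
// commit-quorum, and falls back to the backup block otherwise.
func chooseBlock(backupDecided, profitableDecided bool) Choice {
	switch {
	case profitableDecided:
		return SignProfitable
	case backupDecided:
		return SignBackup
	default:
		return SignNothing
	}
}

func main() {
	fmt.Println(chooseBlock(true, true) == SignProfitable)
	fmt.Println(chooseBlock(true, false) == SignBackup)
	fmt.Println(chooseBlock(false, false) == SignNothing)
}
```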

property 1) is nice to have, but perhaps can be somewhat relaxed if we want to get the most out of MEV; mentioning just for completeness (for further considerations we might have on it), stuff described below doesn't compromise on it

property 2) however is a binary thing (either we have it, or we don't), and not having it means operators will miss proposing the Ethereum block if they couldn't correctly guess & agree on which of backup / profitable to propose. But (again, to get the most out of MEV) we might consider an approach where we do QBFT 1 consensus on the backup block first (like in the approach above), then throw the backup block away, take a gamble and propose the profitable block every time without doing a full QBFT 2 consensus on it, just hoping that the profitable block will be spread out to enough operators for them to sign with their validator share. This seems to make sense to do for 2 reasons:

we don't really need backup block at all (hence we throw it away) - what we really need is to "check networking conditions right before we are about to propose profitable block" (and select the leader who is best suited to do that - the leader proven by QBFT 1 consensus).

This approach relies on a couple of hypotheses that need to be verified against real-world cluster data (preferably production data, which maybe we can gather by implementing a "dry-run" version of this approach on prod cluster(s)). Even though this approach seems riskier (compared to the approach with a backup block there will most likely be more missed blocks), it might yield a higher expected reward to Ethereum validator(s) over time (because of MEV).
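The "higher expected reward" claim can be written down as a simple comparison. The model below is an assumption for illustration (a missed block pays 0, and the backup block is only tried when the profitable one fails), and the inputs in `main` are placeholders that say nothing about real clusters; the real probabilities are exactly what the dry-run data would have to provide.

```go
package main

import "fmt"

// withBackup is the expected reward of the approach with a backup block: the
// profitable block gets proposed with probability pProfitable; if it doesn't,
// the backup block is proposed with probability pBackup.
func withBackup(pProfitable, pBackup, rProfitable, rBackup float64) float64 {
	return pProfitable*rProfitable + (1-pProfitable)*pBackup*rBackup
}

// gamble is the expected reward when only the profitable block is attempted
// and a failure means a missed slot (reward 0).
func gamble(pProfitable, rProfitable float64) float64 {
	return pProfitable * rProfitable
}

func main() {
	// Placeholder inputs, for illustration only (not measured data).
	a := withBackup(0.5, 1.0, 4, 2)
	b := gamble(0.75, 8)
	fmt.Println("with backup:", a, "gamble:", b, "gamble is better:", b > a)
}
```

The gamble only pays off if skipping QBFT 2 buys enough extra MEV (a larger `rProfitable`) or enough extra success probability to outweigh the lost backup reward.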