Open ralexstokes opened 2 years ago
It's a little tricky, because caching and re-sending messages seems like a good job for mev-boost. But if the VC can't get feedback from mev-boost about the state of registrations, then you have to trust that mev-boost received your messages when you sent them and didn't lose them.
In Lighthouse now, our VC caches and resends every epoch. If we also add that functionality to mev-boost, I don't really see a need to remove it from our VC, because it doesn't add much load so long as you don't have to re-sign messages. You also don't have to worry about mev-boost losing your registrations (for example, on a random reboot you don't notice).
Maybe the way to go would be to add an endpoint to the builder API to check registration statuses: if all's good, don't send; otherwise, re-send. Validators could check as frequently or infrequently as they'd like. It'd also probably improve mev-boost <> relay communication.
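To make the check-then-resend idea concrete, here is a rough sketch. No such status endpoint exists in the builder API today, so `fetch_status` is a hypothetical stand-in for it: it is assumed to take pubkeys and return the timestamp of the latest registration known for each one, omitting pubkeys it has never seen. Timestamps are plain integers here for simplicity.

```python
from typing import Callable, Dict, Iterable, List


def registrations_to_resend(
    cached: Dict[str, dict],
    fetch_status: Callable[[Iterable[str]], Dict[str, int]],
) -> List[dict]:
    """Return the cached signed registrations that are missing or stale upstream.

    `cached` maps pubkey -> signed registration (already signed, so nothing
    is re-signed here). `fetch_status` is a HYPOTHETICAL wrapper around a
    status endpoint that does not exist in the builder API today.
    """
    known = fetch_status(cached.keys())
    stale = []
    for pubkey, signed in cached.items():
        # -1 means "unknown upstream", which always triggers a resend.
        if known.get(pubkey, -1) < signed["message"]["timestamp"]:
            stale.append(signed)
    return stale
```

If everything is up to date the function returns an empty list and the VC sends nothing; otherwise it re-sends only the registrations that were lost or are out of date.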
Something to think about: if we spec out once-per-epoch registrations, will the relay be flooded with hundreds of thousands of registrations in the first slot of every epoch?
Yep, that's correct, and we are also just mapping out the infra to handle this. I don't think there's a good way around this, since we want relays coming online later to catch up with the latest registrations.
The service receiving the validator registrations needs to be able to handle large bursts of registrations, and validate them without impacting the other systems. The service will need a list of validators in memory to do a first validity check of the messages before verifying the signatures.
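As a rough illustration of that two-stage check (not the actual relay code), the sketch below rejects registrations from unknown pubkeys with a cheap set lookup before spending any time on signature verification. The `known_pubkeys` set and the `verify_signature` callback are assumptions made for the example; a real service would plug in its own validator list and BLS verification.

```python
from typing import Callable, Iterable, List, Set


def filter_registrations(
    registrations: Iterable[dict],
    known_pubkeys: Set[str],
    verify_signature: Callable[[dict], bool],
) -> List[dict]:
    """Keep registrations from known validators that also carry a valid signature.

    `verify_signature` is a caller-supplied placeholder for the expensive
    signature check; it only runs for pubkeys that pass the lookup.
    """
    accepted = []
    for reg in registrations:
        pubkey = reg["message"]["pubkey"]
        if pubkey not in known_pubkeys:
            continue  # cheap in-memory reject, no signature work done
        if verify_signature(reg):
            accepted.append(reg)
    return accepted
```

Doing the lookup first means a burst of junk registrations costs one set membership test each rather than one signature verification each.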
We're going to load test with 1M-10M requests.
We could add some randomized jitter to when exactly in the epoch the registration is rebroadcast (e.g. delay it by a random fractional number of slots between 0 and SLOTS_PER_EPOCH) to avoid hotspots like this.
At the same time, it is likely in the best interest of a relay to be able to handle a high load.
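A minimal sketch of the jitter idea, assuming mainnet timing (32 slots per epoch, 12 seconds per slot). The function names are made up for illustration and the uniform distribution is just one reasonable choice.

```python
import random

SLOTS_PER_EPOCH = 32   # mainnet value
SECONDS_PER_SLOT = 12  # mainnet value


def rebroadcast_offset_seconds(rng: random.Random) -> float:
    """Pick a random delay from the start of the epoch, in seconds.

    The delay is a uniformly random fractional number of slots in
    [0, SLOTS_PER_EPOCH], converted to seconds.
    """
    fractional_slots = rng.uniform(0, SLOTS_PER_EPOCH)
    return fractional_slots * SECONDS_PER_SLOT


def rebroadcast_time(epoch_start: float, rng: random.Random) -> float:
    """Return the wall-clock time at which to rebroadcast in this epoch."""
    return epoch_start + rebroadcast_offset_seconds(rng)
```

Each validator client would draw its own offset, so registrations spread across the whole epoch instead of landing in the first slot.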
Relays need to be designed in a way that can handle the load. Some variance might be nice but couldn't be relied upon anyway.
The spec currently says validators should submit their registrations once per epoch.
However, it seems like the mev-boost component has the responsibility to keep the registrations it learns about and upstream them to connected relays, in which case the validator client may be posting data every ~6.4 minutes (once per epoch) for no additional gain.
I'm opening this issue to seek feedback on expectations around this process. If we can reduce the frequency, I'm happy to change the spec to reflect that.
Based on my current understanding, the validator client only needs to call the registration API when it boots or when any of its registration data changes (which in practice may be synonymous with boot, if clients only allow config changes during validator client process boot).
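For discussion, a small sketch of what "register on boot or on change" could look like inside a validator client. The `RegistrationTracker` class and its methods are hypothetical, not taken from any client; state is kept in memory only, so a fresh process (boot) always re-registers.

```python
import hashlib
import json
from typing import Dict


def registration_fingerprint(fee_recipient: str, gas_limit: int, pubkey: str) -> str:
    """Hash the fields whose change should trigger a new registration."""
    payload = json.dumps(
        {"fee_recipient": fee_recipient, "gas_limit": gas_limit, "pubkey": pubkey},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


class RegistrationTracker:
    """Hypothetical helper: remembers what was last sent for each pubkey."""

    def __init__(self) -> None:
        self._sent: Dict[str, str] = {}  # empty at boot, so everything is sent once

    def should_send(self, fee_recipient: str, gas_limit: int, pubkey: str) -> bool:
        fingerprint = registration_fingerprint(fee_recipient, gas_limit, pubkey)
        return self._sent.get(pubkey) != fingerprint

    def mark_sent(self, fee_recipient: str, gas_limit: int, pubkey: str) -> None:
        self._sent[pubkey] = registration_fingerprint(fee_recipient, gas_limit, pubkey)
```

This only works if something downstream (mev-boost or the relays) reliably retains registrations, which is exactly the expectation this issue is asking about.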