shutter-network / rolling-shutter

Rolling Shutter is an MEV protection system to be plugged into rollups.
https://twitter.com/project_shutter/
26 stars 7 forks source link

Fix keyper's `handleOnChainKeyperSetChanges` not using current blocknumber #392

Closed ezdac closed 10 months ago

ezdac commented 10 months ago

The test change-keyper-set did only pass by accident because of the parameterization and the specific timeframe in which the keyper-set change is concluded compared to the program state. The on-chain keyper set change in the test e.g. has the following parameters:

This makes the check in handleOnChainKeyperSetChanges pass:

https://github.com/shutter-network/rolling-shutter/blob/34e728e6101dc8ffd87c54165c61793d3124541a/rolling-shutter/keyper/keyper.go#L238-L246

However if other values are chosen, the program will go into a livelock, because the last-seen block that makes the check pass, will never be updated again if no new batch-config becomes active:

https://github.com/shutter-network/rolling-shutter/blob/34e728e6101dc8ffd87c54165c61793d3124541a/rolling-shutter/keyper/keyper.go#L184-L204

This creates a circular dependency of all keypers not being able to send out newly seen batch-configs and starting the DKG for them, and thus no batch config becoming active anytime soon, updating the last seen block. This results in a livelock, where all new keyper-sets on l1 are ignored by the keypers.


2023/08/17 12:09:14.744049 INF [       keyper.go:166] handle on chain changes l1-block-number=207
2023/08/17 12:09:14.744846 INF [       keyper.go:257] not yet submitting config dkg-start-delta=10 keyper-set={"ActivationBlockNumber":66,"KeyperConfigIndex":1,"Keypers":["0x7D125eA18a7C56e8F5ab9834e8916E73A4Ae32F0","0x72318CFf1eC66D2F9f931090f2f9094930f60723","0x98B853d770b2F467B8e8AA6c06015Db12e973e94"],"Threshold":2} last-block-seen=20

One can see that the last block seen is stalling and thus way behind the current l1 block of 207 as well as behind the activation block number of the batch-config / keyper set change. The config has not been sent out yet, and thus not voted upon and activated etc.

The fix for this issue is to not use the last seen block communicated to the tendermint chain in order to trigger sending out new batch configs, but use the current l1 block number locally seen by the keyper.

ezdac commented 10 months ago

The relevant test is still failing, I am still investigating wether this is unrelated, or if the proposed fix is not conclusive.