Open tbjump opened 2 years ago
Thanks for putting together this table!
The downside is that during a hypothetical emergency where 13 or more guardian keys have been compromised, it would not be possible to rotate to a new guardian set immediately.
If 13 or more guardian keys have been compromised, they can be used to override whatever safety mechanism we put in here, so I don't think any contract-level decision should be made under the assumption that a supermajority is compromised.
Therefore, benefit (1) can also be accomplished operationally by taking the guardians that are to be removed from the guardian set offline some time before implementing the guardian set upgrade.
This is true, but depending on the frequency of guardian set upgrades, might not be desirable.
If 13 or more guardian keys have been compromised, they can be used to override whatever safety mechanism we put in here, so I don't think any contract-level decision should be made under the assumption that a supermajority is compromised.
In my experience doing incident response there is often a time delay between an adversary acquiring a key and them actually using it, giving incident responders a window of opportunity. There are also frequent scenarios where there is some indication that a key may have been compromised, but no certainty about it. In both cases it's valuable to have this distinction.
Guardian Set expiration is treated differently on some chains and is also sometimes treated differently on the core bridge vs. Portal Token Bridge, a prominent xApp.
Expected Behavior
The behavior should be the same across all chains.
Details
This table shows the differences:
Discussion: What should the behavior be?
There are two benefits to not expiring a guardian set immediately. (1) For normal messages, VAAs created during the guardian set change can still be verified for some time after the new guardian set becomes effective. (2) For governance messages, if there is an error in the new guardian set, there is some time to recover from it.
The downside is that during a hypothetical emergency where 13 or more guardian keys have been compromised, it would not be possible to rotate to a new guardian set immediately.
In practice, I don't expect more than 6 guardians to change at a time, except in an emergency situation. Therefore, benefit (1) can also be accomplished operationally by taking the guardians that are to be removed from the guardian set offline some time before implementing the guardian set upgrade. VAAs generated during that time will still be valid with the new guardian set. Benefit (2) is negligible since governance actions are heavily tested.
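As a rough check of the arithmetic behind the "no more than 6" figure, here is a minimal Python sketch. It assumes a set of 19 guardians and a quorum rule of floor(2n/3) + 1, which yields the 13 mentioned above; both numbers are assumptions for illustration and not taken from the contracts. The guardian names and helper functions are hypothetical. The sketch only counts signers and ignores how a real VAA encodes the guardian set index and signature indices.

```python
def quorum(num_guardians: int) -> int:
    """Smallest signer count strictly above two thirds (assumed quorum rule)."""
    return (num_guardians * 2) // 3 + 1


def still_valid_after_rotation(signers: list[str], new_set: list[str]) -> bool:
    """True if the signers that remain in the new set still reach its quorum."""
    return len(set(signers) & set(new_set)) >= quorum(len(new_set))


def rotate(old_set: list[str], replaced: int) -> list[str]:
    """Replace the first `replaced` guardians with fresh, hypothetical ones."""
    kept = old_set[replaced:]
    fresh = [f"new-{i}" for i in range(replaced)]
    return kept + fresh


old_set = [f"guardian-{i}" for i in range(19)]  # assumed set size

for replaced in (6, 7):
    new_set = rotate(old_set, replaced)
    # The outgoing guardians are offline, so only the continuing ones sign.
    signers = [g for g in old_set if g in new_set]
    print(replaced, len(signers), still_valid_after_rotation(signers, new_set))

# prints:
# 6 13 True
# 7 12 False
```

Under these assumptions, replacing up to 6 guardians leaves 13 continuing signers, which is exactly quorum in the new set. Replacing 7 or more would leave too few, so the operational workaround would no longer cover benefit (1).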
I therefore propose to remove the concept of guardian set expiration and expire all old guardian sets immediately once there is a new guardian set.
This will also make things easier for xApps, since they no longer need to weigh the different threats and decide which guardian sets to trust.
This will also make it easier to reason about the CosmWasm shutdown switch (#1598).
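To make the difference between the two policies concrete, here is a minimal Python sketch of the check a verifier might perform. It is not the code of any existing contract: `GuardianSet`, its fields, and both function names are hypothetical, and the exact comparison used on each chain may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuardianSet:
    index: int
    expiration_time: int  # Unix seconds after which an old set is rejected


def accepted_with_grace_period(gs: GuardianSet, current_index: int, now: int) -> bool:
    """Grace-period policy: an old set stays valid until its expiration time."""
    if gs.index == current_index:
        return True
    return now <= gs.expiration_time


def accepted_with_immediate_expiry(gs: GuardianSet, current_index: int) -> bool:
    """Proposed policy: only the current guardian set is ever accepted."""
    return gs.index == current_index


old_set = GuardianSet(index=2, expiration_time=1_000)
current_index = 3

print(accepted_with_grace_period(old_set, current_index, now=500))    # True
print(accepted_with_grace_period(old_set, current_index, now=1_500))  # False
print(accepted_with_immediate_expiry(old_set, current_index))         # False
```

With immediate expiry the check no longer depends on time, so there is only one rule to reason about across chains and xApps.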
Looking forward to your thoughts!