Closed AlastairHolmes closed 2 years ago
Log extract which shows the bug in action:
I'm assuming @j4m1ef0rd can work on this since #1970 makes it non-urgent?
@dandanlen We want to solve this by requesting the signing ceremony used to check the key after a keygen from the state-chain. How possible is that?
It will require a bit of refactoring - currently all multisig requests assume we only need a 2/3 threshold of signers, so we would need to pick apart the signer selection from the signing. Apart from that it just implies an extra stage in the keygen process. Neither of these should be too difficult. (famous last words)
What's the priority on this? Is it still causing issues on testnet deployments? (@morelazers)
Also does it really fix the case where we have some race condition and two ceremonies are started at the same block, and different nodes start/see the ceremonies in different orders? Do we really have to be so strict with the ceremony ranges?
The race condition you describe isn't a problem, as long as the request events are processed in ascending ceremony_id order on each node (Which is what we do, unless the SC is outputting the events in a different order?).
The race condition you describe isn't a problem, as long as the request events are processed in ascending ceremony_id order on each node (Which is what we do, unless the SC is outputting the events in a different order?).
No, SC emits the events in order of ascending ceremony_id. I was wondering mainly if it's possible for the multisig messages to be received out-of-order but I suppose not (as long as we use TCP and not QUIC for per-ceremony connections hah).
Anyway, the question remains - is the issue not fixed simply by being a bit less strict about the ceremony id lower bound? The only reason for being strict is spam prevention, it's not a requirement of the protocol.
What's the priority on this? Is it still causing issues on testnet deployments? (@morelazers)
Negative, since disabling the check we've had no issues.
FWIW I think enshrining the keygen_success -> signing_request round trip via the State Chain is OK, it's just something that we really should be able to shortcut in theory.
I was wondering mainly if it's possible for the multisig messages to be received out-of-order
It doesn't actually matter if the messages come out of order. What's important is the the requests come in order. Once we have a request for a ceremony, we don't check the ceremony id at all. (It is not uncommon for p2p messages to arrive before we have the request.)
I've logged an issue for this, tried to add as much detail as I could think of for now. It might be a bit more involved than I first thought, but some of the changes (ie. the required changes to the threshold signer pallet) will likely come in useful elsewhere.
@morelazers I'll let you slap a priority label on it.
I'm interested to know whether this
Alternatively the signing ceremony could be authorised when the keygen gets to the point where the key is known.
Is possible. It seems like it would be less involved?
If we can't solve this problem purely on the CFE, then the linked issue would become a P1, since it might be painful to change this post-launch.
I'm interested to know whether this
Alternatively the signing ceremony could be authorised when the keygen gets to the point where the key is known.
Is possible. It seems like it would be less involved?
It is possible (and less involved it seems) but it will be hack. Doing this on the SC seems to be the right solution. Note that this would also address #1980 as I described in the comment.
Of couse I'm still open to doing it on the CFE if we run into challenges in the SC implementation.
Now that we have keygen verification triggered from the SC, what does complete look like for this issue? Reinstating the ceremony id tracking that was temp removed? @j4m1ef0rd @msgmaxim @AlastairHolmes
Reinstating the ceremony id tracking that was temp removed?
Sounds good to me. I can add it back in.
what does complete look like for this issue?
I would say this can be closed as long as there is an issue to track re-adding ceremony id tracking.
Description
The reuse of ceremony ids causes messages for the ceremony with a reused id to be rejected.
Just changing the cmp to >= is not enough as other ceremonies may occur between the keygen and thr associated signing ceremony. Ideally the keygen authorisation should authorise the (key validation) signing ceremony at the same time it is authorised. (Maybe, I want to think about this a bit more).
Alternatively the signing ceremony could be authorised when the keygen gets to the point where the key is known.
These are just my very early thoughts, I want to think about this more soon.
Alternatives Considered
Additional context
Relevant people
@dandanlen @msgmaxim