Closed lumtis closed 1 year ago
The height is used here to coordinate the keygen so that every client broadcasts the transaction simultaneously. If we change the condition to >=, the client can still enter the keygen loop, but the keygen itself will fail.
Curious, though, how often are you getting the error?
OK, I think that's fine. The issue arises around 20% of the time, so maybe it's enough to just make sure the smoke test fails when the TSS is not generated.
I'm still wondering why the generation couldn't happen asynchronously. Do we need all txs in the same block?
TSS keygen is an interactive "ceremony", which is why it is often called a keygen/keysign ceremony. It's a pretty heavy MPC computation. The participants need to be online at roughly the same time, otherwise it will fail. This is why all zetaclients need to be synchronized to a certain block.
In your case, why would zetaclientd miss that exact block? Timing or slow computer?
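To make the timing argument concrete, here is a minimal sketch, not the actual zetaclientd code. It compares an exact-height trigger with a >= trigger for a healthy client and for a client that was restarting while the keygen block passed. The function names, the keygen height of 20, and the observed block heights are all hypothetical.

```go
package main

import "fmt"

// startHeight returns the first observed height at which trigger fires,
// or -1 if it never fires. observed is the list of block heights a client
// actually sees (hypothetical data, for illustration only).
func startHeight(observed []int64, keygenHeight int64, trigger func(current, target int64) bool) int64 {
	for _, h := range observed {
		if trigger(h, keygenHeight) {
			return h
		}
	}
	return -1
}

// exact models a check that fires only at the keygen height itself.
func exact(current, target int64) bool { return current == target }

// atLeast models the proposed >= check.
func atLeast(current, target int64) bool { return current >= target }

func main() {
	const keygenHeight = 20
	healthy := []int64{18, 19, 20, 21, 22}
	restarted := []int64{18, 22, 23} // container was down during blocks 19-21

	fmt.Println(startHeight(healthy, keygenHeight, exact))     // 20
	fmt.Println(startHeight(restarted, keygenHeight, exact))   // -1: never starts
	fmt.Println(startHeight(restarted, keygenHeight, atLeast)) // 22: starts two blocks late
}
```

With the exact check the restarted client never starts. With >= it does start, but two blocks after its peers, so the participants are no longer in the ceremony at the same time.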
Thanks for the explanation.
Somehow I can't reproduce the issue anymore. I will close it for now and reopen it if it occurs again.
Sometimes the smoke tests get stuck in the initialization phase because the TSS is never created: https://github.com/zeta-chain/node/blob/6b860efb1cd378328775113fb572a72271918270/contrib/localnet/orchestrator/smoketest/main.go#L137
The reason is that, for some reason, the ZetaClient container restarts and misses the block height for keygen generation.
Considered solution
It appears to me that ZetaClient does not need to be exactly at cfg.Keygen.BlockNumber to generate the TSS, only at least at this block. In this case, the solution would be to change the condition at https://github.com/zeta-chain/node/blob/6b860efb1cd378328775113fb572a72271918270/cmd/zetaclientd/keygen_tss.go#L65 to a >= comparison.
I think we should also add a maximum number of tries at https://github.com/zeta-chain/node/blob/6b860efb1cd378328775113fb572a72271918270/contrib/localnet/orchestrator/smoketest/main.go#L137 so that the smoke test stops with a failure instead of looping forever during initialization.