Closed ppca closed 1 month ago
success
success
Pusher: @ppca, Action: `pull_request`, Working Directory: (empty), Workflow: `Terraform Feature Env`
URL: https://mpc-recovery-leader-dev-560-7tk2cmmtcq-ue.a.run.app
I've checked the logs and there are no messages (`no new messages received`) without signature requests. There is no infinite loop of messages, but maybe they stay in the loops for a long time. We should choose one approach here and use it in the triple, presignature, and signature generation protocols, since the problem is the same.
The message is `could not initiate non-introduced presignature: triple might not have completed for this node yet` or `TripleIsMissing`. I think this happens when a presignature generation involves triples that this node does not have.
So for the most part, we're seeing these logs while we're waiting for triples to get generated. It mostly occurs when we don't have a stockpile of triples yet and the presignature manager is waiting for specific triples. So we should distinguish between `TripleIsMissing` and `TripleIsGenerating` kinds of errors.
The more problematic case is when a node does not have a triple and is not going to generate it (for example, if the datastore somehow did not persist triples successfully, or the node went offline at an unfortunate time), yet it keeps processing the msg that asks it to generate a presignature with that triple. Such a msg will stay in the queue forever.
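A minimal sketch of the distinction discussed above, assuming a hypothetical `TripleManager` that tracks finished and in-progress triples in two sets. Only the variant names `TripleIsGenerating` and `TripleIsMissing` come from this thread; the struct, its fields, and the `take` method are made up for illustration and are not the actual code in this PR.

```rust
use std::collections::HashSet;

/// Hypothetical identifier type for a triple.
pub type TripleId = u64;

/// The two error cases named in the discussion.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TripleError {
    /// This node is still generating the triple; retrying later may succeed.
    TripleIsGenerating(TripleId),
    /// This node neither holds the triple nor is generating it.
    TripleIsMissing(TripleId),
}

/// Hypothetical manager: `ready` holds finished triples,
/// `generating` holds triples whose protocol is still running.
#[derive(Debug, Default)]
pub struct TripleManager {
    pub ready: HashSet<TripleId>,
    pub generating: HashSet<TripleId>,
}

impl TripleManager {
    /// Takes a finished triple out of the stockpile, or reports why it cannot.
    pub fn take(&mut self, id: TripleId) -> Result<TripleId, TripleError> {
        if self.ready.remove(&id) {
            Ok(id)
        } else if self.generating.contains(&id) {
            Err(TripleError::TripleIsGenerating(id))
        } else {
            Err(TripleError::TripleIsMissing(id))
        }
    }
}
```

With this split, a caller can keep waiting on `TripleIsGenerating` while treating `TripleIsMissing` as a case that will not resolve on its own.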
`TripleIsGenerating` makes sense. We could differentiate between the two errors, add a timestamp to the messages, and skip a message if it has already timed out.
Agree we should use the same approach.
Removed requested altogether. The code now differentiates between `TripleIsGenerating` and `TripleIsMissing`, and skips presignature and signature messages that were sent more than 1 min ago.
Hmm, I realized that adding the timestamp to messages as a required field may make them non-backward-compatible. It should be an optional field instead. Will change.
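A rough sketch of what an optional timestamp could look like, so that messages from nodes that have not been updated yet are still accepted. The struct name, field names, and the choice to never expire messages without a timestamp are all assumptions for illustration; only the one-minute threshold comes from this thread, and wire encoding is left out.

```rust
use std::time::Duration;

/// One-minute threshold mentioned in the thread.
pub const PRESIGNATURE_TIMEOUT: Duration = Duration::from_secs(60);

/// Hypothetical message shape; the real message carries more fields.
#[derive(Debug, Clone)]
pub struct PresignatureMessage {
    pub id: u64,
    /// Send time in seconds since the Unix epoch.
    /// `None` stands for a sender that does not set the field yet.
    pub timestamp: Option<u64>,
}

impl PresignatureMessage {
    /// Returns true when the message carries a timestamp older than `timeout`.
    /// Assumption: a message without a timestamp is never treated as expired.
    pub fn is_expired(&self, now_secs: u64, timeout: Duration) -> bool {
        match self.timestamp {
            // saturating_sub guards against a sender clock that is ahead of ours.
            Some(sent) => now_secs.saturating_sub(sent) > timeout.as_secs(),
            None => false,
        }
    }
}
```

Once every node sets the field, the `Option` wrapper and the `None` branch could be removed.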
@ppca currently we are updating all the nodes at the same time. We'd better make it required now than deal with Optional later.
The problem is that partner testnet is not... they update within an hour, but not at the same time.
Ok, but let's delete Optional after the update.
success
success
Pusher: @ppca, Action: `pull_request`, Working Directory: (empty), Workflow: `Terraform Feature Env (Destroy)`
Currently, we push a presignature message to the leftover messages if either of its two triples is not found. This means we will likely process these messages over and over and never get rid of them.
This change times such a message out once it is older than 1 min (the presignature timeout threshold), so it is no longer retained in the queue.
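The behaviour described above can be sketched roughly as follows. This is not the code from this PR: `QueuedMessage`, `process_queue`, and the `available` set are hypothetical names, and timestamps are plain seconds to keep the example self-contained.

```rust
use std::collections::{HashSet, VecDeque};

/// Presignature timeout threshold from the description (1 min), in seconds.
pub const TIMEOUT_SECS: u64 = 60;

/// Hypothetical queued presignature message that needs two triples.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct QueuedMessage {
    pub id: u64,
    pub sent_at: u64,
    pub triples: (u64, u64),
}

/// Runs one pass over the queue and returns the ids that could be handled.
/// Timed-out messages are dropped; messages still waiting for a triple
/// stay in the queue as leftovers for the next pass.
pub fn process_queue(
    queue: &mut VecDeque<QueuedMessage>,
    available: &HashSet<u64>,
    now: u64,
) -> Vec<u64> {
    let mut handled = Vec::new();
    queue.retain(|msg| {
        if now.saturating_sub(msg.sent_at) > TIMEOUT_SECS {
            return false; // older than the threshold: do not retain
        }
        if available.contains(&msg.triples.0) && available.contains(&msg.triples.1) {
            handled.push(msg.id);
            return false; // both triples found: handled, so remove it
        }
        true // at least one triple not found yet: keep as leftover
    });
    handled
}
```

Checking the timeout first means a message whose triples never arrive is removed on the first pass after the threshold, instead of being reprocessed indefinitely.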