Currently, triples are being generated quite slowly. In my local test with 3 nodes, generating a single triple takes approximately 1 second. It looks like a lot of messages are being sent that could potentially be slimmed down, but doing so would require changes to cait-sith itself.
So in the meantime, I suggest splitting triple generation out of the cryptography and message-handling portion of the protocol loop. Since each triple generation is entirely independent of every other one, each can run as its own task in our runtime: a single triple protocol maps to a single task, with async/task cancellation semantics available via the task handle in case a protocol errors out or takes too long. Note that this approach means we would need to maintain a communication channel for each new task, which shouldn't be too costly given the performance boost we expect; see the sketch below.
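A minimal sketch of what this could look like with tokio, assuming hypothetical `TripleMessage`/`Triple` types and a `run_triple_protocol` driver standing in for the real cait-sith protocol state machine (none of these names come from the actual codebase):

```rust
use tokio::sync::mpsc;
use tokio::task::JoinHandle;
use tokio::time::{timeout, Duration};

// Hypothetical stand-ins for the real cait-sith triple protocol types.
struct TripleMessage { /* sender id, payload, ... */ }
struct Triple { /* a, b, c shares */ }

// Placeholder for driving one triple protocol to completion, consuming
// incoming messages from this task's dedicated channel.
async fn run_triple_protocol(
    mut _incoming: mpsc::Receiver<TripleMessage>,
) -> anyhow::Result<Triple> {
    // ... poke the protocol, send outgoing messages, await `_incoming` ...
    unimplemented!()
}

// One task per triple generation: the spawner keeps the Sender so the
// message-handling loop can route messages to the right protocol, and
// the JoinHandle gives us cancellation (`handle.abort()`) essentially
// for free.
fn spawn_triple_task() -> (
    mpsc::Sender<TripleMessage>,
    JoinHandle<anyhow::Result<Triple>>,
) {
    let (tx, rx) = mpsc::channel(64);
    let handle = tokio::spawn(async move {
        // Bound how long a single protocol may run before giving up.
        match timeout(Duration::from_secs(30), run_triple_protocol(rx)).await {
            Ok(result) => result,
            Err(_) => Err(anyhow::anyhow!("triple generation timed out")),
        }
    });
    (tx, handle)
}
```

With this shape, the main protocol loop would keep a map from triple/protocol ID to its `Sender` and forward incoming messages accordingly, rather than driving every protocol's cryptography itself.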
This approach should be heavily benchmarked/profiled against our current execution model, in case it doesn't deliver the returns we expect.
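For instance, a rough comparison harness (purely illustrative, reusing the hypothetical `spawn_triple_task` above) could measure concurrent throughput against the current ~1 triple/sec baseline:

```rust
use std::time::Instant;

// Spawn N triple tasks concurrently, await them all, and report
// throughput so we can compare against the current single-loop numbers.
async fn bench_concurrent_triples(n: usize) {
    let start = Instant::now();
    let handles: Vec<_> = (0..n).map(|_| spawn_triple_task()).collect();
    for (_tx, handle) in handles {
        let _ = handle.await;
    }
    let elapsed = start.elapsed();
    println!(
        "{n} triples in {elapsed:?} ({:.2} triples/sec)",
        n as f64 / elapsed.as_secs_f64()
    );
}
```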