Open drewstone opened 2 years ago
Some other difficulties I'm facing relate to the Simulation
s. They take all parties up front and I'm not sure how to analyse the architecture as is when working with an environment where the parties are in fact distributed. How does one execute these protocols within a distributed context if the StateMachine
is privately executable? Are there examples for this somewhere? It seems all methods to proceed the system are private and executed by a Simulation
in all the examples I can find.
@drewstone disclaimer: I am not a developer of this library, but I used it a bit
.message_queue()
that returns a queue of messages that a state machine wants to send. You can loop over it if it is nonempty. Once all messages are sent you would need to .handle_incoming()
received ones;.proceed()
itself, because it may cause some unpleasing latency. You can call .wants_to_proceed()
and .proceed()
methods when you are ready to face it;I guess you can also keep track of #132, it may answer some of your questions soon.
Thanks, @Sky-Nik, you impressively covered the question! @drewstone Indeed, what you need in order to carry out the protocol is methods of StateMachine trait only. Workflow looks like:
.message_queue()
is not empty, empty it by sending queued messages.wants_to_proceed()
, then call .proceed()
(and be aware that this method may take a while to complete).handle_incoming()
.is_finished()
, retrieve result with .pick_output()
However, I would recommend to work with StateMachine methods directly only if you needed an incredibly precise control on protocol execution. Otherwise, just use AsyncProtocol that takes asynchronous channels of incoming and outgoing messages and carries out the protocol (see docs examples).
Btw, we work on major revision of round_based crate. It will follow completely different (I hope, simpler) design, specifically, StateMachine trait will be removed. It will be released in a few weeks, and GG20 will be updated to use it. We'll add usage examples and update README (#132)
Thanks, @Sky-Nik @survived for the heads up,
I am working with @drewstone. I have used StaeMachine
APIs for GG20
KeyGen
and implementation follow the steps mentioned above and I am able to generate the key for our use case.
However, I am seeing some strange behavior that state's
message_queue
keeps on increasing by one message in each round.
For t=2, n=3 settings, message_queue
length is 5 at final round,
{Keygen at round=2 msgs1=[None] msgs2=[0/2] msgs3=[0/2] msgs4=[0/2] queue=[len=2]}
.
.
.
{Keygen at round=[Final] msgs1=[None] msgs2=[None] msgs3=[None] msgs4=[None] queue=[len=5]}
The items in message_queue
are the protocol messages till that round, for example after final round, queue contains [KeyGenBroadcastMessage1, KeyGenDecommitMessage1, VerifiableSS, VerifiableSS, DLogProof]
.
So whenever a party wants to send messages to other parties, queue_messages
gives previous round messages to send, and receiving party gets nx(previous_round_messages) bogus messages handling which with handle_incoming()
fails with Critical error: ReceivedOutOfOrderMessage
which I am ignoring as of now.
I am wondering if I have to do anything else to consume queue_messages
. For now, I get an into_iter()
on queue_messages
and use move
while map
to encode the messages.
Any help is much appreciated. Thanks
message_queue
keeps on increasing by one message in each round.
Hi @dev0x1, do you delete messege from the queue after you sent it? State machine only adds messages to the queue, it doesn't delete old messages. "Out of order message" error might be caused by double sending the same message
Thanks @survived for your quick response. I was just over-cautious to only use StateMachine's
public APIs. Now I am clearing the queue after sending the messages and everything works fine. Maybe, in case StaeMachine
is going to be around for some time, we should document this somewhere or we provide an API in StateMachine
to clear the queue.
You're right, it's not obvious from documentation that messages need to be deleted from the queue. I updated .message_queue
method documentation
I hit a snag updating the webassembly demo to gg2020 using the state machine directly so I wrote a test case to interact with the state machine without any networking. I thought I would add it here in case anyone wants to see a working example using the state machine without AsyncProtocol
or Simulation
.
@survived, is it worth adding this test to the repository as an example for people that need to interact with the state machine(s) directly?
Also, (and maybe this should be a different issue), I noticed that during key generation that wants_to_proceed()
isn't matching my expectations. If you uncomment this line:
https://github.com/LavaMoat/gg2020-test/blob/main/src/lib.rs#L54
The test will fail. Am I misunderstanding the expected behavior or is this a bug?
The test will fail. Am I misunderstanding the expected behavior or is this a bug?
.wants_to_proceed()
indicates whether StateMachine
wants to do some expensive computation. handle_incoming
actually may jump to the next round if this operation is not expensive. The only intention of having .proceed()
is to be able to handle heavy computations in different way (e.g. AsyncProtocol
calls spawn_blocking
)
You correctly observed that round 1 of DKG is not expensive and thus doesn't require calling .proceed()
for people that need to interact with the state machine(s) directly?
@tmpfs, could you describe your case, why do you need to work with it directly? Asking because next version of round-based will have state machine removed, ie the only way to run the protocol will be by providing Stream
and Sink
for receiving and sending messages. I thought it's going to be easier to work with, but maybe I'm missing some use-cases?
@survived, because in webassembly with wasm-bindgen
it makes more sense to do the networking on the Javascript side. Whilst it is possible to make network connections in js_sys
it is complicated and would be especially awkward in the multi-threaded situation with multiple workers / instantiated WASM modules.
I would really appreciate it if we could keep the StateMachine
design to facilitate this use case 🙏
@tmpfs does web assembly prevents you from using async
code? I'm not saying you have to implement delivery logic within Stream
/Sink
, you can just use UnboundedReceiver
/UnboundedSender
and feed them messages from js-code. You won't need to manually drive StateMachine, just supply these channels and everything works just fine.
For background @survived, we are not using this library where we provide the Stream/Sink. Maybe we lack experience seeing how exactly it fits for our use case but we're integrating into an existing gossip based system.
We aren't using async/await directly here either. I'm not sure removing the StateMachine will make it easier for us to use this library, but we'll see once these changes you mention land I suppose.
@tmpfs does web assembly prevents you from using
async
code? I'm not saying you have to implement delivery logic withinStream
/Sink
, you can just useUnboundedReceiver
/UnboundedSender
and feed them messages from js-code. You won't need to manually drive StateMachine, just supply these channels and everything works just fine.
@survived, whilst I know that it is possible to use futures in webassembly I don't know exactly what it will look like as I have never tried.
However, I have some concerns about this approach:
1) I would need to embed an async runtime (tokio
, async-std
etc) into the web assembly module increasing the size of the WASM file.
2) I then need to convert the futures to promises to return values to the Javascript side (maybe using wasm-bindgen-futures).
3) I think adding the polling of the async runtime to what is effectively CPU-bound synchronous code adds unnecessary overhead and complexity.
Creating the Stream
/ Sink
abstraction is a worthwhile effort but please don't sacrifice the synchronous use case. Can you build the Stream
/ Sink
API as an optional and recommended way to use the API but build it on top of StateMachine
so we can still call StateMachine
manually?
@tmpfs, unfortunately, StateMachine
implementation is difficult to maintain. Any protocol implementation (e.g. gg20) requires a lot of non-trivial boilerplate, and it's simply difficult to deliver bugfixes, analyse the code, etc. New version of library allows to keep the code clean and clear, that's the whole point. But cost of it is tying to async.
Regarding your concerns:
smol
which should address your concern.Runtime::block_on
to build a blocking APITo sum up, I believe it should work for your case just at additional cost of increased binary size and some (I believe, minor) computational overhead. In return, implementation of gg20 will get simpler, cleaner, easier to maintain, which is important security-wise.
Thanks for the explanation and rationale @survived! I wonder is this new version aspirational or is there some code that is work in progress I could take a look at?
If there is a branch I would really like to take a look and see what the shape of the new API might look like. Thanks 🙏
There's round-based2
branch in the repo, but it's poorly documented at this point, though most of api is established. You can look at round-based-ing
, lib that integrates round-based
and ING's implementation of GG18: there're two structs Keygen
and Signing
, they can carry out DKG and threshold signing. API for multi-party-ecdsa
is going to be similar.
Feel free to ask questions
I'm looking to extend an already existing gossip based protocol w/ a DKG and trying to execute the state machine for gg20's Keygen first. It seems difficult with the existing architecture to do so given how tightly coupled things seem, starting with how most methods and struct fields are private.
The scenario is as follows.
gossip_engine
that I cangossip
messages from. This is all handled synchronously and the async logic is abstracted away.The goal and assumptions
proceed
API directly (even though it is currently private).Recommendations (which may be naive but want to share nonetheless)
StateMachine
(such asKeygen
) it would be great to have a method that returns the next message to be broadcasted over the network. This way I can always call something likeget_active_message
and continue to gossip this without worrying much.Keygen
and otherStateMachine
s as well as the methods forStateMachine
such asproceed.
Let me know if there's prior art or ideas I should consider to think through this properly. Cheers