Consensys / handel

Multi-Signature Aggregation in a Large Byzantine Committees
Apache License 2.0

Libp2p - weird behaviors #82

Open nikkolasg opened 5 years ago

nikkolasg commented 5 years ago

We now have a comparative baseline simulation using libp2p in which each peer connects to a few other peers (the number is set by the "Count" parameter in a config file), subscribes to the "handel" topic, broadcasts its signature, and waits to receive enough signatures. Unfortunately, this simulation exhibits weird behaviors (~failures) of the libp2p pubsub library. These failures can be reproduced in two different ways on the fail_libp2p branch:

  1. Running the test TestGossipMeshy in simul/p2p/libp2p, which is directly inspired by the tests found in the libp2p/pubsub repo.
  2. Running the simulation in simul/ with go run main.go -config config_gossip.toml -platform localhost. This is a generalization of the test. Even with a large number of connected peers, the simulation fails most of the time.

Please note that these tests sometimes pass, but most often they don't, so repeat the experiment!

For the test, using the Neighbor connector, which makes each peer connect only to some "neighbors" in the ID space (modulo) so that all peers' connections form a circle, yields a connected graph. In contrast, using the Random connector, which connects peers at random (as in the libp2p pubsub tests), fails most of the time.

whyrusleeping commented 5 years ago

Thanks for the link @nikkolasg.

@vyzo mind taking a look here and seeing what's up?

vyzo commented 5 years ago

sure

vyzo commented 5 years ago

Is the network generated by RandomConnector actually connected? That would explain the failures.
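One way to check this offline is to build the same kind of topology without libp2p and run a reachability search over it. The snippet below is only a rough sketch: `neighborGraph`, `randomGraph` and `isConnected` are hypothetical helpers written for illustration, and they assume that RandomConnector picks `count` peers uniformly at random, which may not match what the simulation actually does.

```go
package main

import (
	"fmt"
	"math/rand"
)

// graph is an undirected adjacency list: graph[i] lists the peers linked to i.
type graph [][]int

// link adds an undirected edge, since a connection opened by either side
// can carry messages in both directions.
func (g graph) link(a, b int) {
	g[a] = append(g[a], b)
	g[b] = append(g[b], a)
}

// neighborGraph links peer i to the next count peers in ID space (modulo n),
// which forms a ring (assumed behavior of the Neighbor connector).
func neighborGraph(n, count int) graph {
	g := make(graph, n)
	for i := 0; i < n; i++ {
		for k := 1; k <= count; k++ {
			g.link(i, (i+k)%n)
		}
	}
	return g
}

// randomGraph links every peer to count other peers chosen uniformly at
// random (assumed behavior of the Random connector).
func randomGraph(n, count int, rng *rand.Rand) graph {
	g := make(graph, n)
	if n < 2 {
		return g // no other peer to connect to
	}
	for i := 0; i < n; i++ {
		for k := 0; k < count; k++ {
			j := rng.Intn(n - 1)
			if j >= i {
				j++ // shift past i so a peer never picks itself
			}
			g.link(i, j)
		}
	}
	return g
}

// isConnected runs a breadth-first search from peer 0 and reports whether
// every peer was reached.
func isConnected(g graph) bool {
	if len(g) == 0 {
		return true
	}
	seen := make([]bool, len(g))
	seen[0] = true
	queue := []int{0}
	reached := 1
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, next := range g[cur] {
			if !seen[next] {
				seen[next] = true
				reached++
				queue = append(queue, next)
			}
		}
	}
	return reached == len(g)
}

func main() {
	rng := rand.New(rand.NewSource(1))
	fmt.Println("neighbor ring connected:", isConnected(neighborGraph(100, 2)))

	// Count how often a random topology ends up split into several components.
	disconnected := 0
	for trial := 0; trial < 1000; trial++ {
		if !isConnected(randomGraph(100, 1, rng)) {
			disconnected++
		}
	}
	fmt.Printf("random (count=1): %d/1000 topologies disconnected\n", disconnected)
}
```

The ring is connected by construction, whereas a random topology with a small Count can split into several components, in which case no amount of waiting would deliver every signature to every peer.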

vyzo commented 5 years ago

I also noticed another thing: you only wait for 1s, with the heartbeat at 700ms. You should wait at least 4-5 heartbeats, as messages will propagate via gossip in the case of a mesh that is not fully connected.
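To make the arithmetic concrete, the wait can be derived from the heartbeat interval instead of being hard-coded. This is a minimal sketch: `propagationWait` is a hypothetical helper, not part of the pubsub library or of the simulation code.

```go
package main

import (
	"fmt"
	"time"
)

// propagationWait returns the time to wait for gossip to spread,
// expressed as a number of heartbeat intervals.
func propagationWait(heartbeat time.Duration, beats int) time.Duration {
	return time.Duration(beats) * heartbeat
}

func main() {
	heartbeat := 700 * time.Millisecond // value used in the test

	// Five heartbeats at 700ms come to 3.5s, well above the current 1s wait.
	fmt.Println(propagationWait(heartbeat, 5)) // prints 3.5s
}
```

Deriving the wait this way keeps it in step with the heartbeat if that setting is changed later.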