relab / gorums

Gorums simplify fault-tolerant quorum-based protocols
MIT License

feat: Create config regardless of connection failures #179

Closed aleksander-vedvik closed 8 months ago

aleksander-vedvik commented 8 months ago

A config can now be created regardless of whether the connections to its nodes have failed. Instead, the channel will try to connect to the node (only once) for each message.

The channel needs to know whether a connection has been established. If not, it will dial the node. Otherwise, it will try to create a stream to the node.

aleksander-vedvik commented 8 months ago

Have added a few tests and more documentation. There are now two ways in which the channel will try to reconnect to a node:

  1. If a connection has never been established: It will try to connect when sending a message. (see c.sendMsgs)
  2. If a connection has been successfully established: In addition to point 1, it will try to reconnect in the background using a backoff strategy. (see c.recvMsgs)

I had to adjust when locking happens in c.reconnect(), since sendMsgs and recvMsgs run in two different goroutines. Now it only locks when creating the NodeStream.

Regarding the tests: The last test is supposed to represent a server that is first offline, then online, and then offline again. Doing this requires unexported methods on the channel, so "black box testing" is not possible. As a result, I created some custom proto types in tests/mock that do not depend on gorums (to prevent cyclic imports). However, I suspect that gorums uses a custom encoding for the proto messages (placed in the init() function in each generated gorums proto file), which causes an error when unmarshalling a non-gorums proto message. I might be wrong in this assumption, but for now none of the tests actually check the response on the server side, because the server fails to unmarshal the message.

EDIT: The error is coming from encoding.go:113