EventStore / EventStore-Client-Go

Go Client for Event Store version 20 and above.
Apache License 2.0

Respect MaxDiscoverAttempts #122

Closed hayley-jean closed 2 years ago

hayley-jean commented 2 years ago

Fixed: Respect MaxDiscoverAttempts

Currently, when the client attempts to discover a cluster, discovery runs only once, regardless of the value configured for MaxDiscoverAttempts.

You can reproduce this behaviour as follows:

  1. Without starting any nodes, attempt to connect to a cluster and append an event:
$> go run main.go
[info] discovery attempt 1/10
[warn] error when reading gossip from candidate localhost:3113: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:3113: connectex: No connection could be made because the target machine actively refused it."
[warn] error when reading gossip from candidate localhost:2113: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:2113: connectex: No connection could be made because the target machine actively refused it."
[warn] error when reading gossip from candidate localhost:1113: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:1113: connectex: No connection could be made because the target machine actively refused it."
[error] unexpected exception: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:1113: connectex: No connection could be made because the target machine actively refused it."
Error when appending event: could not construct append operation. Reason: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:1113: connectex: No connection could be made because the target machine actively refused it."
$>
  2. Start only one node in the cluster, then attempt to connect to the cluster and append an event:
$> go run main.go
2022/05/19 14:46:11 [info] discovery attempt 1/10
2022/05/19 14:46:11 [debug] trying candidate 'localhost:2113'...
2022/05/19 14:46:13 [warn] error when reading gossip from candidate localhost:2113: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:2113: connectex: No connection could be made because the target machine actively refused it."
2022/05/19 14:46:13 [debug] trying candidate 'localhost:3113'...
2022/05/19 14:46:16 [warn] error when reading gossip from candidate localhost:3113: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp [::1]:3113: connectex: No connection could be made because the target machine actively refused it."
2022/05/19 14:46:16 [debug] trying candidate 'localhost:1113'...
2022/05/19 14:46:16 [warn] error when picking best candidate out of localhost:1113 gossip response: no nodes are eligable to be a candidate
2022/05/19 14:46:16 [error] unexpected exception: rpc error: code = Unavailable desc = Server Is Not Ready
Error when appending event: rpc error: code = Unavailable desc = Server Is Not Ready
$> 

This happens because the connection is assigned even when an error occurs while establishing it, so the discovery loop breaks out when the connection is checked for nil at the end of the first attempt.

This PR changes the discovery loop so that it returns as soon as a connection is successfully established, instead of breaking on the nil check.
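
To illustrate the pattern described above, here is a minimal, self-contained sketch. It is not the client's actual code: `conn`, `dial`, `discoverBuggy`, `discoverFixed`, and the `succeedOn` parameter are hypothetical names invented for this example, and `dial` simply simulates a dialer that hands back a non-nil handle even when it fails.

```go
package main

import (
	"errors"
	"fmt"
)

// conn is a hypothetical stand-in for a connection handle.
type conn struct{ addr string }

// dial is a hypothetical dialer. It always returns a non-nil handle and
// reports an error until the attempt number reaches succeedOn.
func dial(addr string, attempt, succeedOn int) (*conn, error) {
	c := &conn{addr: addr}
	if attempt < succeedOn {
		return c, errors.New("unavailable")
	}
	return c, nil
}

// discoverBuggy mimics the reported behaviour: the handle is assigned even
// on error, so the nil check ends the loop after the first attempt.
func discoverBuggy(maxAttempts, succeedOn int) (*conn, int, error) {
	var c *conn
	var err error
	attempts := 0
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		attempts = attempt
		c, err = dial("localhost:2113", attempt, succeedOn)
		if c != nil { // true even when err != nil
			break
		}
	}
	return c, attempts, err
}

// discoverFixed returns only when dialing succeeds and otherwise keeps
// retrying up to maxAttempts (assumed to be at least 1).
func discoverFixed(maxAttempts, succeedOn int) (*conn, int, error) {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		c, err := dial("localhost:2113", attempt, succeedOn)
		if err == nil {
			return c, attempt, nil
		}
		lastErr = err
	}
	return nil, maxAttempts, fmt.Errorf("discovery failed after %d attempts: %w", maxAttempts, lastErr)
}

func main() {
	_, n, err := discoverBuggy(10, 3)
	fmt.Println("buggy:", n, err) // buggy: 1 unavailable
	_, n, err = discoverFixed(10, 3)
	fmt.Println("fixed:", n, err) // fixed: 3 <nil>
}
```

In this sketch the buggy variant stops after one attempt with the error still set, while the fixed variant keeps going until the simulated dialer succeeds or the attempt limit is exhausted.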