Open scottf opened 2 years ago
Is this better managed out of band? There are lots of considerations here: upstream health checking, resolving DNS names at a specified interval, failing over, failing back, etc.
It would be fairly straightforward to implement using Envoy with priority levels:
https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/priority
IMO, for the NATS clients, simpler is better, and for some clients we could add a callback that's invoked to get the next URL for customized server selection in connect/reconnect. Longer term we've discussed a high-level service/stream API that could be much more sophisticated, with the features @caleblloyd suggested.
I am very keen on something like the callback Colin mentions.
For me the problem is that I get the initial list from elsewhere (SRV records, Consul, etc.) and people might want to move my clients to another cluster. So they update e.g. the SRV records, but there is no way to rerun a query or update a running client.
I need to be able to update the server list periodically or, less ideally, on reconnect, on very long-running clients I do not directly control.
If anything, this should be a callback that replaces the cluster gossip behaviour. That means that if you specify a callback, the expected behaviour is that cluster updates are ignored (as the authoritative server list is now the responsibility of the callback). Obtaining the list could be expensive, and in some cases could be affected by the same network outage that is requiring the services to use a different cluster.
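To make the semantics above concrete, here is a minimal sketch of a client-side server pool where a user callback, when set, becomes the authoritative source and gossip updates are ignored. All names (`ServerPool`, `on_cluster_update`, `candidates`) are hypothetical and not taken from any NATS client. It also assumes one possible answer to the outage concern: if the callback fails or returns nothing, the pool keeps the last known list.

```python
from typing import Callable, List, Optional

# Hypothetical callback type: returns the authoritative list of server URLs.
ServerListCallback = Callable[[], List[str]]


class ServerPool:
    """Illustrative only; not part of any NATS client API."""

    def __init__(self, bootstrap: List[str],
                 callback: Optional[ServerListCallback] = None) -> None:
        self._servers = list(bootstrap)
        self._callback = callback

    def on_cluster_update(self, discovered: List[str]) -> None:
        # With a callback configured, the callback owns the list,
        # so servers learned via cluster gossip are ignored.
        if self._callback is not None:
            return
        for url in discovered:
            if url not in self._servers:
                self._servers.append(url)

    def candidates(self) -> List[str]:
        # Called before each connect/reconnect attempt.
        if self._callback is None:
            return list(self._servers)
        try:
            fresh = self._callback()
        except Exception:
            # Assumption: the lookup may be hit by the same outage,
            # so fall back to the last known list instead of failing.
            fresh = []
        if fresh:
            self._servers = list(fresh)
        return list(self._servers)
```

Whether a failed lookup should fall back, retry, or abort the reconnect is a design choice each client would have to make; the fallback here is just one option.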
+1 for callback. It will allow custom implementations, including for things like specific DNS resolution.
Not a feature
@marthaCP why is this reopened?
I meant to update the title for the issue. Scott said it was still open. Maybe we should discuss at the call tomorrow (11/9/22).
Overview
OPTIONALLY: provide some mechanism for the user to override how the list of URLs used for connecting or reconnecting to servers is supplied.
The user would provide a server url list or some way to iterate the list that matches how the client currently goes through the possible list.
As examples, the Java and .NET clients refactored their own specific server list handling into an interface, a default implementation, and then provided a way in the Options for the user to provide their own implementation.
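The interface / default implementation / Options shape described above could look roughly like the following. This is only a sketch of the pattern: `ServerListProvider`, `DefaultServerListProvider`, and the `server_list_provider` option are hypothetical names, not the actual Java or .NET types, and the default here is assumed to be a simple round-robin.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List, Optional


class ServerListProvider(ABC):
    """Hypothetical interface the client calls on connect/reconnect."""

    @abstractmethod
    def next_server(self) -> Optional[str]:
        """Return the next URL to try, or None if there is none."""


class DefaultServerListProvider(ServerListProvider):
    """Assumed default behaviour: cycle through the configured URLs."""

    def __init__(self, urls: List[str]) -> None:
        self._urls = list(urls)
        self._index = 0

    def next_server(self) -> Optional[str]:
        if not self._urls:
            return None
        url = self._urls[self._index % len(self._urls)]
        self._index += 1
        return url


@dataclass
class Options:
    servers: List[str] = field(default_factory=list)
    # Optional user override; None means "use the default provider".
    server_list_provider: Optional[ServerListProvider] = None

    def provider(self) -> ServerListProvider:
        if self.server_list_provider is not None:
            return self.server_list_provider
        return DefaultServerListProvider(self.servers)
```

A user who wants SRV or Consul lookups would subclass `ServerListProvider` and pass an instance through `Options`; everyone else gets the existing behaviour unchanged.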
Parity Notes
This is not strictly required for parity. It's a nice to have, so can wait until a customer / user asks for it.
Clients and Tools
Other Tasks
Client authors please update with your progress. If you open issues in your own repositories as a result of this request, please link them to this one by pasting the issue URL in a comment or main issue description.
Original Text
Provide the ability to bootstrap the client connection with multiple lists of servers, representative of different regions. This would be useful in the case where clusters are deployed in multiple regions and clients would prefer to connect to the closest region (first list) always unless it fails on all servers in that list / server info at which time it would try from the second list of servers unless it fails all those and then would go to the third list.
For example, consider 3 regions: east, central and west.
E East Server List [a.b.x.1, a.b.x.2, a.b.x.3]
C Central Server List [a.b.y.1, a.b.y.2, a.b.y.3]
W West Server List [a.b.z.1, a.b.z.2, a.b.z.3]
The east clients would be configured with these 3 lists in the order of E, C, W but a west client would be configured in W, C, E order.
When connecting, the client would exhaust the first list before trying any in the second list.
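The ordering described in the original request can be sketched as a loop over prioritized lists. `connect_by_priority` and the `try_connect` parameter are hypothetical; `try_connect` stands in for whatever a real client does to attempt a single connection and is assumed to return True on success.

```python
from typing import Callable, Optional, Sequence


def connect_by_priority(
    tiers: Sequence[Sequence[str]],
    try_connect: Callable[[str], bool],
) -> Optional[str]:
    """Try every server in the first list before moving to the next list.

    Returns the first URL that connects, or None if all lists are exhausted.
    """
    for tier in tiers:
        for url in tier:
            if try_connect(url):
                return url
    return None


# Server lists from the example above.
EAST = ["a.b.x.1", "a.b.x.2", "a.b.x.3"]
CENTRAL = ["a.b.y.1", "a.b.y.2", "a.b.y.3"]
WEST = ["a.b.z.1", "a.b.z.2", "a.b.z.3"]

# An east client prefers E, then C, then W; a west client reverses the order.
EAST_CLIENT_TIERS = [EAST, CENTRAL, WEST]
WEST_CLIENT_TIERS = [WEST, CENTRAL, EAST]
```

This sketch does not cover failing back to the preferred region once it recovers, which is one of the considerations raised in the comments.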