hashicorp / nomad

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications. Nomad is easy to operate and scale and has native Consul and Vault integrations.
https://www.nomadproject.io/

Servers may discover and join clusters in other Consul DCs #10780

Open notnoop opened 3 years ago

notnoop commented 3 years ago

When starting a new cluster, Nomad servers can rely on Consul for service discovery so that operators do not have to set IP addresses. If Consul is federating multiple clusters, Nomad queries the local DC first; if no servers are found there, it queries the other Consul DCs in a random order. This behavior can be very surprising and can cause clusters that were meant to be isolated to join each other unexpectedly.
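To make the lookup order concrete, here is a minimal sketch of the behavior described above. It is not Nomad's actual code: `lookupFunc`, `discoverServers`, and the addresses are hypothetical stand-ins for the Consul catalog query and its results.

```go
package main

import (
	"fmt"
	"math/rand"
)

// lookupFunc is a hypothetical stand-in for a Consul catalog query: it returns
// the addresses of Nomad servers registered in the given Consul datacenter.
type lookupFunc func(dc string) []string

// discoverServers mimics the described behavior: query the local DC first,
// then the remaining DCs in random order, stopping at the first DC that
// returns any servers.
func discoverServers(localDC string, allDCs []string, lookup lookupFunc) (string, []string) {
	if addrs := lookup(localDC); len(addrs) > 0 {
		return localDC, addrs
	}

	// Collect every DC except the local one and shuffle them.
	remote := make([]string, 0, len(allDCs))
	for _, dc := range allDCs {
		if dc != localDC {
			remote = append(remote, dc)
		}
	}
	rand.Shuffle(len(remote), func(i, j int) { remote[i], remote[j] = remote[j], remote[i] })

	for _, dc := range remote {
		if addrs := lookup(dc); len(addrs) > 0 {
			return dc, addrs
		}
	}
	return "", nil
}

func main() {
	// Illustrative data: one DC already has servers, the new DC has none yet.
	registered := map[string][]string{
		"us-east-2": {"10.0.0.1:4648", "10.0.0.2:4648"},
	}
	lookup := func(dc string) []string { return registered[dc] }

	dc, addrs := discoverServers("ap-south-1", []string{"us-east-2", "ap-south-1"}, lookup)
	fmt.Println("joining servers found in", dc, addrs)
}
```

In this sketch a server starting in an empty DC falls through to the remote DC and is handed that DC's server addresses, which is the surprising join this issue is about.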

Note: a Consul datacenter typically represents a single cloud region, and maps more closely to Nomad's region concept.

Consider a user who runs a Consul and Nomad installation in Ohio (us-east-2), plans to expand to Mumbai (ap-south-1), and sets up Consul federation. When Mumbai's first Nomad server starts up, it discovers the Ohio cluster and joins it! The other Mumbai servers will in turn discover their peer and also join the Ohio cluster. The newly created Ohio-Mumbai Raft membership will have an expanded quorum and suffer long latencies. Splitting the resulting cluster in half may result in loss of quorum and service disruption.

This is extremely surprising! The logic dates back to the original PR, PR 1276, which provides no context on the choice.

The behavior is less pronounced in "real" life, though none of the following mitigations reliably prevents the issue: Nomad uses local/private network addresses to communicate, so cross-region packets get dropped; customers may have network/firewall isolation blocking Nomad admin ports; and Nomad clusters in other regions may use different TLS certs.

Next Steps

Nomad should consider changing the logic so that Nomad only queries the local DC. The change is not backward compatible, though I'd be very surprised if anyone relied on this behavior. We would welcome user info here.

Nomad users should consider specifying unique Nomad region names (e.g. instead of the default, global), or specifying a unique Consul server_service_name for each Nomad cluster.
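As a rough illustration of why a unique service name helps, the sketch below models the federated catalog as a map keyed by datacenter and service name. This is an assumption-laden toy, not Nomad or Consul code: `serviceKey`, `catalog`, `findServers`, and the names `nomad-ohio` and `nomad-mumbai` are all hypothetical.

```go
package main

import "fmt"

// serviceKey identifies a service registration by Consul DC and service name.
type serviceKey struct {
	DC      string
	Service string
}

// catalog is a hypothetical in-memory stand-in for a federated Consul catalog.
type catalog map[serviceKey][]string

// findServers walks the DCs in the given order (local DC first) and returns
// the first non-empty set of addresses registered under serviceName.
func findServers(c catalog, dcs []string, serviceName string) (string, []string) {
	for _, dc := range dcs {
		if addrs := c[serviceKey{DC: dc, Service: serviceName}]; len(addrs) > 0 {
			return dc, addrs
		}
	}
	return "", nil
}

func main() {
	dcs := []string{"ap-south-1", "us-east-2"} // local DC first

	// Both clusters share one service name: the new DC finds the old cluster.
	shared := catalog{
		serviceKey{DC: "us-east-2", Service: "nomad"}: {"10.0.0.1:4648"},
	}
	dc, addrs := findServers(shared, dcs, "nomad")
	fmt.Printf("shared name: dc=%q addrs=%v\n", dc, addrs)

	// Each cluster registers under its own name: the remote lookup comes back empty.
	unique := catalog{
		serviceKey{DC: "us-east-2", Service: "nomad-ohio"}: {"10.0.0.1:4648"},
	}
	dc, addrs = findServers(unique, dcs, "nomad-mumbai")
	fmt.Printf("unique name: dc=%q addrs=%v\n", dc, addrs)
}
```

With a per-cluster service name, the fallback query to the other DC finds nothing to join, so the new servers can only bootstrap with peers registered under their own name.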

akamensky commented 8 months ago

I'd very surprised if anyone relied on this behavior

Our systems rely on this exact behavior, which seems to break once TLS is enabled (that is a different story, but I found this issue while searching for bug reports or solutions to it).