hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

wan federated multi-cloud kubernetes clusters - Error from server: cross-datacenter failover is only supported in the default partition #21683

Open vishnuhd opened 2 months ago

vishnuhd commented 2 months ago

Overview of the Issue

I am setting up a multi-cloud Consul service mesh with WAN federation between AWS EKS and Azure AKS, following the tutorial at https://developer.hashicorp.com/terraform/tutorials/networking/multicloud-kubernetes.

Everything seems fine: the WAN members are connected, and services can reach each other across the two clouds.

Node                   Address          Status  Type    Build   Protocol  DC     Partition  Segment
consul-server-0.aws    10.0.1.38:8302   alive   server  1.16.0  2         aws    default    <all>
consul-server-0.azure  10.244.1.4:8302  alive   server  1.16.0  2         azure  default    <all>

However, when I try to deploy a ServiceResolver for datacenter failover, the request is rejected with the following error.

---
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceResolver
metadata:
  name: facts-api-backend
spec:
  connectTimeout: 5s
  failover:
    '*':
      datacenters: ['azure']

$ kubectl apply -f serviceresolver.yaml --context eks
Error from server: error when creating "serviceresolver.yaml": admission webhook "mutate-serviceresolver.consul.hashicorp.com" denied the request: serviceresolver.consul.hashicorp.com "facts-api-backend" is invalid: spec.failover[*].datacenters: Invalid value: []string{"azure"}: cross-datacenter failover is only supported in the default partition

The example services I am using are from this GitHub repo - https://github.com/jacobmammoliti/consul-multicloud-demo


Reproduction Steps

Follow the article : https://developer.hashicorp.com/terraform/tutorials/networking/multicloud-kubernetes

Deploy the example services : https://github.com/jacobmammoliti/consul-multicloud-demo/tree/main/kubernetes

Apply the service-resolver.yaml - https://github.com/jacobmammoliti/consul-multicloud-demo/blob/main/kubernetes/serviceresolver.yaml

Consul info for both Client and Server

Server info

```
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 0
    services = 0
build:
    prerelease =
    revision = 192df66a
    version = 1.16.0
    version_metadata =
consul:
    acl = disabled
    bootstrap = true
    known_datacenters = 2
    leader = true
    leader_addr = 10.0.1.38:8300
    server = true
raft:
    applied_index = 259
    commit_index = 259
    fsm_pending = 0
    last_contact = 0
    last_log_index = 259
    last_log_term = 2
    last_snapshot_index = 0
    last_snapshot_term = 0
    latest_configuration = [{Suffrage:Voter ID:6580154e-6c92-837b-adfe-f294261f2a57 Address:10.0.1.38:8300}]
    latest_configuration_index = 0
    num_peers = 0
    protocol_version = 3
    protocol_version_max = 3
    protocol_version_min = 0
    snapshot_version_max = 1
    snapshot_version_min = 0
    state = Leader
    term = 2
runtime:
    arch = amd64
    cpu_count = 2
    goroutines = 259
    max_procs = 2
    os = linux
    version = go1.20.4
serf_lan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 1
    event_time = 2
    failed = 0
    health_score = 0
    intent_queue = 1
    left = 0
    member_time = 2
    members = 1
    query_queue = 0
    query_time = 1
serf_wan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 2
    members = 2
    query_queue = 0
    query_time = 1
```

I don't understand this error and couldn't find anything related to it. Please help me check whether this is a bug or a configuration issue.

Hillkorn commented 6 hours ago

Okay, the example in the docs is misleading, but I was wondering about the targets config, which seems to offer the same thing. I tested this one and it actually works.

apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceResolver
metadata:
  name: facts-api-backend
spec:
  connectTimeout: 5s
  failover:
    '*':
      targets:
        - datacenter: 'azure'

If you have more than one datacenter to fail over to, you have to add another entry to the targets list, like - datacenter: 'azure-2'.
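
For the multi-datacenter case, a resolver along these lines might work. This is only a sketch extending the manifest above: the second datacenter name, azure-2, is hypothetical and not part of the setup in this issue, and I have not tested this variant. The assumption is that the targets are tried in the order they are listed.

```yaml
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceResolver
metadata:
  name: facts-api-backend
spec:
  connectTimeout: 5s
  failover:
    '*':
      targets:
        # First failover target: the federated Azure datacenter from this issue.
        - datacenter: 'azure'
        # Second failover target: hypothetical name, replace with your own datacenter.
        - datacenter: 'azure-2'
```

It would be applied the same way as the original manifest, with kubectl apply -f serviceresolver.yaml --context eks.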