Closed vjwilson1987 closed 1 year ago
I would appreciate any help on this. We are currently stuck on this at the POC / initial setup stage. If the issue takes a long time to solve, there is a high chance that management will drop Consul and move to an alternative. Kindly help.
OK, to update here: I was able to figure out the issue.
If you follow the current documentation https://developer.hashicorp.com/consul/docs/k8s/deployment-configurations/single-dc-multi-k8s as is, it will only create the Consul UI service as a NodePort service, but not the rest of the services (including the one for gRPC). By default, the Helm chart creates them as headless services by setting clusterIP: None. To expose those services as well, you have to add the following block of values to cluster1-values.yaml:
server:
  exposeService:
    enabled: true
    type: NodePort
    nodePort:
      http: 30010
      https: 30011
      serf: 30012
      rpc: 30013
      grpc: 30014
So the final cluster1-values.yaml will look like:
$ cat cluster1-values.yaml
global:
  datacenter: dc1
  tls:
    enabled: true
    enableAutoEncrypt: true
  acls:
    manageSystemACLs: true
  gossipEncryption:
    secretName: consul-gossip-encryption-key
    secretKey: key
syncCatalog:
  enabled: true
server:
  exposeService:
    enabled: true
    type: NodePort
    nodePort:
      http: 30010
      https: 30011
      serf: 30012
      rpc: 30013
      grpc: 30014
ui:
  service:
    type: NodePort
Then, you need to make sure you set grpcPort: 30014 under externalServers in cluster2-values.yaml, so that cluster2 connects to the correct gRPC NodePort of cluster1.
So the final cluster2-values.yaml will look like:
$ cat cluster2-values.yaml
global:
  enabled: false
  datacenter: dc1
  acls:
    manageSystemACLs: true
    bootstrapToken:
      secretName: server-consul-bootstrap-acl-token
      secretKey: token
  tls:
    enabled: true
    caCert:
      secretName: server-consul-ca-cert
      secretKey: tls.crt
externalServers:
  enabled: true
  # This should be any node IP of the first k8s cluster or the load balancer IP if using LoadBalancer service type for the UI.
  hosts: ["172.26.1.58"]
  # The node port of the UI's NodePort service or the load balancer port.
  httpsPort: 31256
  tlsServerName: server.dc1.consul
  # The GRPC port of the Consul servers.
  grpcPort: 30014
  # The address of the kube API server of this Kubernetes cluster
  k8sAuthMethodHost: https://api.kops-consul-agent.domain.com:443
connectInject:
  enabled: true
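The key point in the two values files above is that externalServers.grpcPort on cluster2 has to equal server.exposeService.nodePort.grpc on cluster1. A minimal sketch of that consistency check in Python follows. It assumes the two files have already been loaded into plain dictionaries; the function name grpc_ports_match and the trimmed-down sample dictionaries are hypothetical and only mirror the keys shown above.

```python
def grpc_ports_match(cluster1_values: dict, cluster2_values: dict) -> bool:
    """Return True if cluster2 points at the gRPC NodePort that cluster1 exposes."""
    # gRPC NodePort exposed by the Consul servers on cluster1 (None if not set).
    exposed = (
        cluster1_values.get("server", {})
        .get("exposeService", {})
        .get("nodePort", {})
        .get("grpc")
    )
    # gRPC port that the components on cluster2 will dial.
    configured = cluster2_values.get("externalServers", {}).get("grpcPort")
    return exposed is not None and exposed == configured


# Trimmed-down copies of the values from this thread (illustrative only).
cluster1 = {
    "server": {
        "exposeService": {
            "enabled": True,
            "type": "NodePort",
            "nodePort": {"grpc": 30014},
        }
    }
}
cluster2 = {
    "externalServers": {
        "enabled": True,
        "hosts": ["172.26.1.58"],
        "grpcPort": 30014,
    }
}

if __name__ == "__main__":
    print(grpc_ports_match(cluster1, cluster2))
```

In practice the dictionaries would come from parsing cluster1-values.yaml and cluster2-values.yaml, for example with yaml.safe_load from PyYAML if that package is installed.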
After redeploying with the required values on both clusters, everything started working, and Consul on cluster2 could join the Consul server running on cluster1.
(K8-Consul-Client) $ kubectl -n consul logs consul-consul-connect-injector-69d44f4cb6-hsbr9
2023-02-23T06:49:10.177Z [INFO] consul-server-connection-manager: trying to connect to a Consul server
2023-02-23T06:49:10.177Z [INFO] consul-server-connection-manager: discovered Consul servers: addresses=[172.26.1.58:30014]
2023-02-23T06:49:10.178Z [INFO] consul-server-connection-manager: current prioritized list of known Consul servers: addresses=[172.26.1.58:30014]
2023-02-23T06:49:10.490Z [INFO] consul-server-connection-manager: ACL auth method login succeeded: accessorID=8b43f8bd-bfe6-9a78-1682-65a33ce18744
2023-02-23T06:49:10.491Z [INFO] consul-server-connection-manager: connected to Consul server: address=172.26.1.58:30014
2023-02-23T06:49:10.493Z [INFO] consul-server-connection-manager: updated known Consul servers from watch stream: addresses=[100.118.198.86:30014]
2023-02-23T06:49:10.678Z INFO controller-runtime.metrics metrics server is starting to listen {"addr": "0.0.0.0:9444"}
2023-02-23T06:49:10.678Z INFO controller-runtime.webhook registering webhook {"path": "/mutate"}
2023-02-23T06:49:10.678Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-v1alpha1-servicedefaults"}
2023-02-23T06:49:10.678Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-v1alpha1-serviceresolver"}
2023-02-23T06:49:10.679Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-v1alpha1-proxydefaults"}
2023-02-23T06:49:10.774Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-v1alpha1-mesh"}
2023-02-23T06:49:10.774Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-v1alpha1-exportedservices"}
2023-02-23T06:49:10.775Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-v1alpha1-servicerouter"}
2023-02-23T06:49:10.775Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-v1alpha1-servicesplitter"}
2023-02-23T06:49:10.775Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-v1alpha1-serviceintentions"}
2023-02-23T06:49:10.775Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-v1alpha1-ingressgateway"}
2023-02-23T06:49:10.775Z INFO controller-runtime.webhook registering webhook {"path": "/mutate-v1alpha1-terminatinggateway"}
2023-02-23T06:49:10.776Z INFO starting metrics server {"path": "/metrics"}
2023-02-23T06:49:10.776Z INFO attempting to acquire leader lease consul/consul-controller-lock...
2023-02-23T06:49:10.776Z INFO controller-runtime.webhook.webhooks starting webhook server
2023-02-23T06:49:10.777Z INFO controller-runtime.certwatcher Updated current TLS certificate
2023-02-23T06:49:10.777Z INFO controller-runtime.webhook serving webhook server {"host": "", "port": 8080}
2023-02-23T06:49:10.777Z INFO controller-runtime.certwatcher Starting certificate watcher
2023-02-23T06:49:10.875Z INFO successfully acquired lease consul/consul-controller-lock
2023-02-23T06:49:10.875Z INFO controller.terminatinggateway Starting EventSource {"reconciler group": "consul.hashicorp.com", "reconciler kind": "TerminatingGateway", "source": "kind source: /, Kind="}
2023-02-23T06:49:10.875Z INFO controller.terminatinggateway Starting Controller {"reconciler group": "consul.hashicorp.com", "reconciler kind": "TerminatingGateway"}
2023-02-23T06:49:10.876Z INFO controller.exportedservices Starting EventSource {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ExportedServices", "source": "kind source: /, Kind="}
2023-02-23T06:49:10.876Z INFO controller.exportedservices Starting Controller {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ExportedServices"}
2023-02-23T06:49:10.876Z INFO controller.servicesplitter Starting EventSource {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceSplitter", "source": "kind source: /, Kind="}
2023-02-23T06:49:10.979Z INFO controller.servicesplitter Starting Controller {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceSplitter"}
2023-02-23T06:49:10.876Z INFO controller.ingressgateway Starting EventSource {"reconciler group": "consul.hashicorp.com", "reconciler kind": "IngressGateway", "source": "kind source: /, Kind="}
2023-02-23T06:49:10.979Z INFO controller.ingressgateway Starting Controller {"reconciler group": "consul.hashicorp.com", "reconciler kind": "IngressGateway"}
2023-02-23T06:49:10.876Z INFO controller.endpoints Starting EventSource {"reconciler group": "", "reconciler kind": "Endpoints", "source": "kind source: /, Kind="}
2023-02-23T06:49:10.979Z INFO controller.endpoints Starting Controller {"reconciler group": "", "reconciler kind": "Endpoints"}
2023-02-23T06:49:10.877Z INFO controller.serviceresolver Starting EventSource {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceResolver", "source": "kind source: /, Kind="}
2023-02-23T06:49:10.979Z INFO controller.serviceresolver Starting Controller {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceResolver"}
2023-02-23T06:49:10.876Z INFO controller.servicerouter Starting EventSource {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceRouter", "source": "kind source: /, Kind="}
2023-02-23T06:49:10.979Z INFO controller.servicerouter Starting Controller {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceRouter"}
2023-02-23T06:49:10.876Z INFO controller.serviceintentions Starting EventSource {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceIntentions", "source": "kind source: /, Kind="}
2023-02-23T06:49:11.075Z INFO controller.serviceintentions Starting Controller {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceIntentions"}
2023-02-23T06:49:10.876Z INFO controller.mesh Starting EventSource {"reconciler group": "consul.hashicorp.com", "reconciler kind": "Mesh", "source": "kind source: /, Kind="}
2023-02-23T06:49:11.075Z INFO controller.mesh Starting Controller {"reconciler group": "consul.hashicorp.com", "reconciler kind": "Mesh"}
2023-02-23T06:49:10.877Z INFO controller.servicedefaults Starting EventSource {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceDefaults", "source": "kind source: /, Kind="}
2023-02-23T06:49:11.075Z INFO controller.servicedefaults Starting Controller {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceDefaults"}
2023-02-23T06:49:10.878Z INFO controller.proxydefaults Starting EventSource {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ProxyDefaults", "source": "kind source: /, Kind="}
2023-02-23T06:49:11.076Z INFO controller.proxydefaults Starting Controller {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ProxyDefaults"}
2023-02-23T06:49:11.574Z INFO controller.exportedservices Starting workers {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ExportedServices", "worker count": 1}
2023-02-23T06:49:11.674Z INFO controller.ingressgateway Starting workers {"reconciler group": "consul.hashicorp.com", "reconciler kind": "IngressGateway", "worker count": 1}
2023-02-23T06:49:11.674Z INFO controller.mesh Starting workers {"reconciler group": "consul.hashicorp.com", "reconciler kind": "Mesh", "worker count": 1}
2023-02-23T06:49:11.676Z INFO controller.terminatinggateway Starting workers {"reconciler group": "consul.hashicorp.com", "reconciler kind": "TerminatingGateway", "worker count": 1}
2023-02-23T06:49:11.676Z INFO controller.serviceintentions Starting workers {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceIntentions", "worker count": 1}
2023-02-23T06:49:11.677Z INFO controller.proxydefaults Starting workers {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ProxyDefaults", "worker count": 1}
2023-02-23T06:49:11.677Z INFO controller.serviceresolver Starting workers {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceResolver", "worker count": 1}
2023-02-23T06:49:11.679Z INFO controller.endpoints Starting workers {"reconciler group": "", "reconciler kind": "Endpoints", "worker count": 1}
2023-02-23T06:49:11.679Z INFO controller.servicedefaults Starting workers {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceDefaults", "worker count": 1}
2023-02-23T06:49:11.679Z INFO controller.servicesplitter Starting workers {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceSplitter", "worker count": 1}
2023-02-23T06:49:11.679Z INFO controller.endpoints retrieved {"name": "static-client", "ns": "default"}
2023-02-23T06:49:11.774Z INFO controller.servicerouter Starting workers {"reconciler group": "consul.hashicorp.com", "reconciler kind": "ServiceRouter", "worker count": 1}
2023-02-23T06:49:12.175Z INFO controller.endpoints registering service with Consul {"name": "static-client", "id": ""}
2023-02-23T06:49:12.280Z INFO controller.endpoints registering proxy service with Consul {"name": "static-client-sidecar-proxy"}
2023-02-23T06:49:12.393Z INFO controller.endpoints retrieved {"name": "kubernetes", "ns": "default"}
2023-02-23T06:49:12.399Z INFO controller.endpoints retrieved {"name": "consul-consul-connect-injector", "ns": "consul"}
2023-02-23T06:49:12.495Z INFO controller.endpoints retrieved {"name": "consul-consul-dns", "ns": "consul"}
2023-02-23T06:49:12.501Z INFO controller.endpoints retrieved {"name": "nginx-client-cluster", "ns": "nginx"}
2023-02-23T06:49:40.455Z INFO controller.endpoints retrieved {"name": "consul-consul-connect-injector", "ns": "consul"}
2023-02-23T17:11:31.572Z INFO controller.endpoints retrieved {"name": "consul-consul-connect-injector", "ns": "consul"}
2023-02-23T17:11:31.578Z INFO controller.endpoints retrieved {"name": "consul-consul-dns", "ns": "consul"}
2023-02-23T17:11:31.584Z INFO controller.endpoints retrieved {"name": "nginx-client-cluster", "ns": "nginx"}
2023-02-23T17:11:31.589Z INFO controller.endpoints retrieved {"name": "static-client", "ns": "default"}
2023-02-23T17:11:31.589Z INFO controller.endpoints registering service with Consul {"name": "static-client", "id": ""}
2023-02-23T17:11:31.593Z INFO controller.endpoints registering proxy service with Consul {"name": "static-client-sidecar-proxy"}
2023-02-23T17:11:31.774Z INFO controller.endpoints retrieved {"name": "kubernetes", "ns": "default"}
2023-02-24T03:33:51.764Z INFO controller.endpoints retrieved {"name": "consul-consul-connect-injector", "ns": "consul"}
2023-02-24T03:33:51.771Z INFO controller.endpoints retrieved {"name": "consul-consul-dns", "ns": "consul"}
2023-02-24T03:33:51.776Z INFO controller.endpoints retrieved {"name": "nginx-client-cluster", "ns": "nginx"}
2023-02-24T03:33:51.781Z INFO controller.endpoints retrieved {"name": "static-client", "ns": "default"}
2023-02-24T03:33:51.781Z INFO controller.endpoints registering service with Consul {"name": "static-client", "id": ""}
2023-02-24T03:33:51.785Z INFO controller.endpoints registering proxy service with Consul {"name": "static-client-sidecar-proxy"}
2023-02-24T03:33:51.794Z INFO controller.endpoints retrieved {"name": "kubernetes", "ns": "default"}
2023-02-24T04:25:41.792Z INFO controller-runtime.certwatcher Updated current TLS certificate
2023-02-24T04:25:41.792Z INFO controller-runtime.certwatcher Updated current TLS certificate
2023-02-24T13:56:11.957Z INFO controller.endpoints retrieved {"name": "consul-consul-connect-injector", "ns": "consul"}
2023-02-24T13:56:11.963Z INFO controller.endpoints retrieved {"name": "consul-consul-dns", "ns": "consul"}
2023-02-24T13:56:11.968Z INFO controller.endpoints retrieved {"name": "nginx-client-cluster", "ns": "nginx"}
2023-02-24T13:56:11.974Z INFO controller.endpoints retrieved {"name": "static-client", "ns": "default"}
2023-02-24T13:56:11.974Z INFO controller.endpoints registering service with Consul {"name": "static-client", "id": ""}
2023-02-24T13:56:11.978Z INFO controller.endpoints registering proxy service with Consul {"name": "static-client-sidecar-proxy"}
2023-02-24T13:56:11.997Z INFO controller.endpoints retrieved {"name": "kubernetes", "ns": "default"}
2023-02-25T00:18:32.149Z INFO controller.endpoints retrieved {"name": "consul-consul-connect-injector", "ns": "consul"}
2023-02-25T00:18:32.156Z INFO controller.endpoints retrieved {"name": "consul-consul-dns", "ns": "consul"}
2023-02-25T00:18:32.161Z INFO controller.endpoints retrieved {"name": "nginx-client-cluster", "ns": "nginx"}
2023-02-25T00:18:32.190Z INFO controller.endpoints retrieved {"name": "static-client", "ns": "default"}
2023-02-25T00:18:32.190Z INFO controller.endpoints registering service with Consul {"name": "static-client", "id": ""}
2023-02-25T00:18:32.194Z INFO controller.endpoints registering proxy service with Consul {"name": "static-client-sidecar-proxy"}
2023-02-25T00:18:32.203Z INFO controller.endpoints retrieved {"name": "kubernetes", "ns": "default"}
2023-02-25T02:01:43.688Z INFO controller-runtime.certwatcher Updated current TLS certificate
2023-02-25T02:01:43.688Z INFO controller-runtime.certwatcher Updated current TLS certificate
2023-02-25T10:40:52.342Z INFO controller.endpoints retrieved {"name": "consul-consul-connect-injector", "ns": "consul"}
2023-02-25T10:40:52.347Z INFO controller.endpoints retrieved {"name": "consul-consul-dns", "ns": "consul"}
2023-02-25T10:40:52.353Z INFO controller.endpoints retrieved {"name": "nginx-client-cluster", "ns": "nginx"}
2023-02-25T10:40:52.358Z INFO controller.endpoints retrieved {"name": "static-client", "ns": "default"}
2023-02-25T10:40:52.358Z INFO controller.endpoints registering service with Consul {"name": "static-client", "id": ""}
2023-02-25T10:40:52.391Z INFO controller.endpoints registering proxy service with Consul {"name": "static-client-sidecar-proxy"}
2023-02-25T10:40:52.400Z INFO controller.endpoints retrieved {"name": "kubernetes", "ns": "default"}
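A log line such as "connected to Consul server: address=172.26.1.58:30014" presupposes that the gRPC NodePort is reachable from the second cluster's network. The sketch below is one way to check that independently of Consul, using only the Python standard library. It would have to run on a machine or pod inside the second cluster's network; the helper name port_open is hypothetical, and the host and port are the example values from this thread.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError on refusal, timeout, or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Node IP of cluster1 and the gRPC NodePort from cluster1-values.yaml.
    print(port_open("172.26.1.58", 30014))
```

This only confirms that the TCP port accepts connections. It does not validate TLS or ACL settings, so a True result is necessary but not sufficient for the injector to connect.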
Submitted PR for changes in the documentation
https://github.com/hashicorp/consul/pull/16430
https://github.com/hashicorp/consul/pull/16430/commits/9dc24ffd1bf307c40cea44cd3bc294939a1532a7
Closing this issue as the PR is accepted now.
Overview of the Issue
I have two production K8s clusters created using Kops. The two clusters run in two separate VPCs on AWS, with VPC peering between them. I wanted to install Consul on both clusters but with only a single Consul datacenter. I followed the documentation to deploy this. Everything is fine on the first K8s cluster, which acts as the server. But on the second K8s cluster (the client), the deployment fails. The consul-server-acl-init and consul-connect-injector pods always end up in a crash loop.
Reproduction Steps
Firstly, you must have two working k8 clusters.
Prepared the Helm release names as environment variables for both the server and client install:
On server cluster:
Helm chart and its custom values used on the Server K8 cluster:
To deploy, first generate the Gossip encryption key and save it as a Kubernetes secret.
Installed on the first cluster:
Extracted CA certificate and ACL bootstrap token generated during installation on the server k8 cluster:
On client k8 cluster:
Applied the credentials extracted from the first cluster to the second cluster:
Where 172.26.1.58 is one of the node IPs and 31608 is the nodePort of the server k8 cluster.
Then, proceeded with the installation of the second cluster.
At this point:
Status of server cluster:
Status of client cluster:
You can see that the consul-connect-injector and consul-server-acl pods are in a crash loop.
Logs
Logs from consul-connect-injector and consul-server-acl pods are:
Expected behavior
The pods should not crash loop, and the client should be able to join the server cluster over gRPC port 8502.
Environment details
If not already included, please provide the following:
consul-k8s version: 1.14.4, using Helm chart 1.0.4
values.yaml used to deploy the Helm chart: already provided above
Additionally, please provide details regarding the Kubernetes Infrastructure, as shown below:
Kubernetes version: server cluster v1.23.9, client cluster v1.23.16
Cloud Provider: K8s created using Kops on AWS
Networking CNI plugin in use: Calico
Additional Context
The two K8s clusters exist in two separate VPCs, but the VPCs are peered and can communicate with each other, and the whole CIDR ranges are whitelisted for all ports.
The doc I followed is https://developer.hashicorp.com/consul/docs/k8s/deployment-configurations/single-dc-multi-k8s