hashicorp / consul-helm

Helm chart to install Consul and other associated components.
Mozilla Public License 2.0

[question] Multi/External Datacenter concept #85

Closed yacut closed 5 years ago

yacut commented 5 years ago

I would like to describe my understanding of the current state of the project on this topic. It would be nice to discuss how best to use HashiCorp Consul with Kubernetes across multiple datacenters. Unfortunately, there are currently very few guides on multi-datacenter Consul setups on Kubernetes, so I will describe the problems I discovered; I would like to discuss them and find the right solutions.

Federation Join

The first problem I see is "Federation with the WAN Gossip Pool". Imagine we have two independent networks, each with a k8s cluster using the same pod CIDR (10.0.0.0/14). The k8s nodes of the first network are created in the 172.100.0.0/24 range and the nodes of the second network in 174.200.0.0/24. In both clusters we use the current helm chart to bootstrap a Consul cluster.

Joining these clusters would be impossible, because the Consul servers are not reachable from outside their own cluster and each Consul server has only an internal pod IP address:

[Screenshot: 2018-12-10 at 10:48:05]

A possible solution to this problem might be #27 (feat: enable consul-servers to be accessed externally), but what if we don't want the Consul servers to be reachable at an external IP address? Then we have to open Consul server ports on the host machine (k8s node) that do not overlap with the Consul client ports already open on the host. I created a PR for this: https://github.com/hashicorp/consul-helm/pull/84. In addition, we also need to allow connections between the networks (i.e. firewall rules) and to configure custom iptables rules for the target network on each cluster: https://github.com/bowei/k8s-custom-iptables

I'm not sure that k8s-custom-iptables should be added to the current chart.
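For illustration, the kind of values override I have in mind would look roughly like the sketch below. The key names here are illustrative (they match a later chart option, not anything in the chart today); the idea is simply to bind the server's gossip/RPC ports to hostPorts so peers in the other network can reach them at a routable node IP:

```yaml
# Sketch only: expose the server gossip/RPC ports as hostPorts so that
# Consul servers are reachable at the k8s node IP (e.g. 172.100.0.x)
# rather than only at a non-routable pod IP.
# Key names are illustrative, not current chart options.
server:
  exposeGossipAndRPCPorts: true   # server RPC (8300) + serf LAN/WAN (8301/8302)
client:
  exposeGossipPorts: true         # client serf LAN (8301) on the host
```

With the ports on the host and firewall rules between the networks in place, a WAN join can target the node IPs of the remote servers.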

If we apply these changes, we can join the two clusters:

[Screenshot: 2018-12-10 at 10:50:55]

consul-k8s tool

Catalog Sync

The consul-k8s tool synchronises k8s services with Consul. The problem here is that it is intended only for a single cluster and writes only internal pod addresses into the Consul service catalog. Also, if we create a NodePort service, only the entrypoints of this service are written to the catalog, so we cannot reach the service from another datacenter or from any non-k8s VM on the same network.
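For context, this is with catalog sync enabled in the chart along these lines (a minimal sketch; `toConsul`/`toK8S` control the sync direction):

```yaml
# Minimal sketch: enabling the consul-k8s catalog sync via the helm chart.
syncCatalog:
  enabled: true
  toConsul: true   # register k8s services in the Consul catalog
  toK8S: true      # mirror Consul services into k8s as ExternalName services
```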

Consul Connect - Enterprise

The consul-k8s tool also injects a Consul Connect sidecar proxy into a pod and provides a secure connection to the service. The problem here is the same: it is intended only for a single cluster, since the Connect proxy service is registered with a private k8s address (in the pod CIDR range) in the current datacenter.

[Screenshot: 2018-12-10 at 09:10:39]
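For reference, the injector is enabled via `connectInject.enabled: true` in the chart values, and individual workloads opt in with an annotation; a minimal sketch of the pod side (placeholder names and image):

```yaml
# Minimal sketch: a pod opting into Connect sidecar injection.
# The injected proxy is what ends up registered with a pod-network address.
apiVersion: v1
kind: Pod
metadata:
  name: web
  annotations:
    consul.hashicorp.com/connect-inject: "true"
spec:
  containers:
    - name: web
      image: nginxdemos/hello   # placeholder image
      ports:
        - containerPort: 80
```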

A possible solution for this problem would be to register the proxy on a random host port, or to create a k8s NodePort service for the proxy and sync it to the catalog with an external IP address:

[Screenshot: 2018-12-10 at 14:50:36]

P.S. We would really like to combine several datacenters into one service mesh and use all Consul features across them, but at this stage of the project it is simply impossible.

yacut commented 5 years ago

@mitchellh @adilyse Could you please tell me which direction this project is going in? Is it a solution only for a standalone k8s cluster, or something more than that?

yacut commented 5 years ago

I understood from this video that virtual machines can connect to the k8s pods, but how is this possible with the current setup if the pods are registered with an internal IP?

yacut commented 5 years ago

I'm closing this issue after two months without an answer. Very sad that hashicorp doesn't want to push the project forward.

lkysow commented 5 years ago

Hi Roman, sorry we never got back to you on this, and thanks for the very detailed question. I'm going to re-open this because we should answer it.

lkysow commented 5 years ago

You are correct that it's not possible to do multi-DC right now unless you are able to route to Pod IPs (for example using Alias IPs in GKE). We do intend to make this possible, either by Consul itself implementing https://github.com/hashicorp/consul/issues/6356 (multi-DC through mesh gateways) or by mapping ports and advertising the node IPs on the WAN.

> A possible solution to this problem might be #27 (feat: enable consul-servers to be accessed externally), but what if we don't want the Consul servers to be reachable at an external IP address? Then we have to open Consul server ports on the host machine (k8s node) that do not overlap with the Consul client ports already open on the host. I created a PR for this: #84. In addition, we also need to allow connections between the networks (i.e. firewall rules) and to configure custom iptables rules for the target network on each cluster: https://github.com/bowei/k8s-custom-iptables

Do you need custom iptables rules because you're advertising the node's internal IP, which isn't routable from the other DC? I think we'd prefer to let users advertise the external IP rather than requiring custom iptables rules.
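For what it's worth, Consul's agent config already has a separate WAN advertise address (`advertise_addr_wan`), and the chart's `server.extraConfig` passthrough can set it; the catch is that a static values file gives every server pod in the StatefulSet the same address. A hypothetical sketch (the IP is a documentation-range placeholder):

```yaml
# Hypothetical sketch: pass extra agent config through to the servers.
# advertise_addr_wan is a real Consul agent option, but this static value
# is shared by every server pod, so each server would advertise the same
# (wrong) address. That's why per-server WAN addresses need first-class
# chart support rather than a values hack.
server:
  extraConfig: |
    {
      "advertise_addr_wan": "203.0.113.10"
    }
```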

> The consul-k8s tool synchronises k8s services with Consul. The problem here is that it is intended only for a single cluster and writes only internal pod addresses into the Consul service catalog.

I think this is incorrect; we actually never register the pod IPs. What we register depends on the service type. It is intended for external access, not access internal to the cluster (for which you'd probably be using Kube DNS).

> Also, if we create a NodePort service, only the entrypoints of this service are written to the catalog, so we cannot reach the service from another datacenter or from any non-k8s VM on the same network.

We register the node IP and port right now, so this should work if those IPs are routable.
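For example, a NodePort service like the hypothetical one below would be synced into the catalog as `<node IP>:30080`, which works across DCs exactly when the node IPs themselves are routable:

```yaml
# Hypothetical NodePort service: catalog sync registers it using
# the node IP and the nodePort (here <node IP>:30080).
apiVersion: v1
kind: Service
metadata:
  name: web
  annotations:
    consul.hashicorp.com/service-name: web   # optional explicit Consul name
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080
```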

> The consul-k8s tool also injects a Consul Connect sidecar proxy into a pod and provides a secure connection to the service. The problem here is the same: it is intended only for a single cluster, since the Connect proxy service is registered with a private k8s address (in the pod CIDR range) in the current datacenter.

You are correct. This is why we added mesh gateways in Consul 1.6 (which I realize happened quite a while after you opened this issue!).

So to sum up: we're getting closer to making multi-DC possible via the helm chart as-is, and it's possible right now if you make some manual edits to the helm chart to expose the servers and use mesh gateways.

yacut commented 5 years ago

Thank you for your answer. Unfortunately, you answered too late and I decided to use Istio for my purposes. Perhaps in the future I will try the solution from Consul.

lkysow commented 4 years ago

For others reading this, multi-cluster federation is supported in Consul 1.8.0. See https://www.consul.io/docs/k8s/installation/multi-cluster/overview for more details.
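Abbreviated from those docs, the primary datacenter's values look roughly like this (see the linked page for the full setup, including the secondary datacenter's side and how the federation secret is consumed there; key names may shift between chart versions):

```yaml
# Abbreviated sketch of the primary datacenter's values for WAN
# federation over mesh gateways (Consul 1.8+).
global:
  name: consul
  datacenter: dc1
  tls:
    enabled: true                  # federation via gateways requires TLS
  federation:
    enabled: true
    createFederationSecret: true   # exported to bootstrap secondary DCs
connectInject:
  enabled: true
meshGateway:
  enabled: true
```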

ndhanushkodi commented 3 years ago

An update to @lkysow's comment above: the new docs page for multi-cluster federation is https://www.consul.io/docs/k8s/installation/multi-cluster