Metaswitch / clearwater-docker

Docker integration for Project Clearwater

User configurable ports for X service - (mainly ETCD) #63

Open dmoneil2 opened 7 years ago

dmoneil2 commented 7 years ago

Description

When deploying Clearwater Docker on Kubernetes, we are using Kubernetes services to decouple the containers. The Kubernetes services provide enhanced decoupling of IPs as well as load balancing.

We have decoupled all of the services except for etcd.

Symptoms

We cannot create a service with ports 4000, 4001, 2379 and 2380 (the default etcd ports), as this conflicts with the Kubernetes infrastructure (the ports are already open).

We need to be able to configure these ports.

Impact

Problems with scaling on K8s using the built-in service mechanism. https://kubernetes.io/docs/user-guide/services/

Release and environment

CentOS 7, Kubernetes 1.5, clearwater-docker stable

Steps to reproduce

mirw commented 7 years ago

Thanks for raising this!

Am I right in understanding that the issue is with the etcd service (rather than the other services' use of etcd)?

If so, the etcd service isn't actually built by Clearwater; we use an off-the-shelf container provided by CoreOS. Are we saying that that container is incompatible with Kubernetes? Assuming so, do you know of a Kubernetes-compatible etcd service that we could replace it with?

dmoneil2 commented 7 years ago

The problem, as far as I can see, is with the Clearwater Python scripts' use of python-etcd. They do not expose a user-configurable port, so we are stuck using etcd on the default port.
This is obviously a problem if you are running more than one etcd instance on different ports; Clearwater does not account for this.
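
For reference, python-etcd's client does take a host and port when it is constructed, so the gap seems to be in how the Clearwater scripts build the client rather than in the library itself. A minimal sketch of what a configurable client could look like (not Clearwater code; the ETCD_HOST/ETCD_CLIENT_PORT variables are hypothetical):

import os
import etcd  # python-etcd

# Hypothetical settings; Clearwater would plumb these through its own config.
ETCD_HOST = os.environ.get("ETCD_HOST", "127.0.0.1")
ETCD_PORT = int(os.environ.get("ETCD_CLIENT_PORT", "4001"))  # default client port

# python-etcd lets the caller choose the port instead of hard-coding it.
client = etcd.Client(host=ETCD_HOST, port=ETCD_PORT)
print(client.read("/", recursive=False))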

mirw commented 7 years ago

Thanks! So just to be clear, the issue isn't that we need to be able to create an etcd service with configurable ports, but that we need to be able to configure the ports that the users of the etcd service point to?

So if we add a configuration option to change the hard-coded port 2380 in clearwater-etcd, that would solve the problem?

mirw commented 7 years ago

Assigning just to be clear on the next action - confirm the question above.

dmoneil2 commented 7 years ago

Premise: all client and server ports should be configurable. You can see all of these problems by deploying two copies of IMS on the same system.

Example: I'm giving etcd as the first major problem (more may come after fixing this).
You currently cannot deploy multiple etcd instances. etcd uses ports 2379, 2380, 4000 and 4001 for internal/external connectivity and HTTP/HTTPS connections.

In our use case on Kubernetes, we cannot use a Kubernetes service (an iptables port forward) to abstract the etcd instance (decoupling the IPs), as this conflicts with the Kubernetes infrastructure's etcd ports.

Defining a Kubernetes service that exposes the etcd ports fails due to the port conflict with the etcd cluster already running as part of the default Kubernetes infrastructure.

The immediate solution might seem to be changing the Kubernetes ports, but that does not fix the issue of running multiple instances of IMS; it just moves the port problem around.

plwhite commented 7 years ago

When running Clearwater in Kubernetes, we've just left the ports as the default, and then the various clearwater components can access the service on those ports on the Kubernetes network using the hostname of the service (i.e. http://etcd:4001). It sounds like you might have a requirement to use the host network rather than a service? Or is the issue that you need the etcd cluster to be an external service?

Here's an example service file that shows you what we've been doing.

apiVersion: v1
kind: Service
metadata:
  name: etcd
  labels:
    instance-type: etcd-pod
spec:
  ports:
  - name: "etcd-client"
    port: 2379
  - name: "etcd-server"
    port: 2380
  - name: "4001"
    port: 4001
  selector:
    instance-type: etcd-pod
  clusterIP: None
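
For illustration, assuming the service above and etcd's v2 HTTP API on port 4001, a pod in the same namespace can then reach etcd by the bare service name, with the default ports unchanged (Python 3 sketch, not Clearwater code):

import urllib.request

# "etcd" resolves via Kubernetes DNS to the pods selected by the service above.
with urllib.request.urlopen("http://etcd:4001/version") as resp:
    print(resp.read().decode())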

I'll let @mirw comment on the more general point of allowing etcd port to be configurable.

dmoneil2 commented 7 years ago

Hey Peter,

Thanks for your feedback, I really appreciate it. I am going to challenge you on a few points, and hopefully you can make me aware of your deployment decisions and where I'm misunderstanding or wrong.

Premise: it is conceivable that a customer may want to run multiple copies of IMS, such as testing, pre-production and production, or even multiple deployments for individual domains.

In your 'kubernetes_new' branch you have a hard dependency on Weave (its built-in DNS server). What happens in the case of multiple copies of IMS on the same DNS domain?
Furthermore, what happens if the customer is not using Weave but, say, flannel instead?
As far as I can see, your solution does not work in that case; flannel has no such support.

How do you envision multiple deployments using the same ports? Leaving them at the defaults will not allow multiple copies to be exposed via Kubernetes services, specifically bono.

Furthermore, by using DNS instead of K8s services, you are losing out on many of the advantages of K8s services. While DNS can conceivably provide almost the same functionality, K8s services allow for greater decoupling and load balancing. It is also easier to update K8s services than DNS.

plwhite commented 7 years ago

I'm using the DNS provided by Kubernetes, with a flannel network. I'm actually testing a DR scenario with two Clearwater deployments, which works by each deployment being in its own namespace (and optionally being in its own K8s cluster). That means that each can resolve "etcd" to be the etcd cluster in the same namespace, i.e. the local etcd cluster, but management tools can resolve instances in both by putting in the full path (so for example, sprout instances in site1 could be found by looking up sprout.ns1.svc.domain from any namespace, or by just looking up sprout from site1).
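
As a rough illustration of that DNS behaviour (the namespace name ns1 and the default cluster domain cluster.local are assumptions, not taken from any real deployment):

import socket

# A bare service name resolves within the local namespace...
local_etcd = socket.gethostbyname("etcd")
# ...while a fully-qualified name reaches a specific namespace/site.
site1_sprout = socket.gethostbyname("sprout.ns1.svc.cluster.local")
print(local_etcd, site1_sprout)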

For now, I am exposing bono in a very crude way, just manually configuring DNS to the IP/port that the single instance is listening on; in a production deployment you'd have a load balancer IP in front of each bono instance, and your DNS would be configured to point to that IP.

mirw commented 7 years ago

@plwhite, thanks for your comments!

@dmoneil2, does that give you what you need? I can add configurable etcd ports if needed, but if the standard Kubernetes way doesn't require it, it keeps things simpler not to. Please let me know!

mirw commented 7 years ago

@dmoneil2, please can you let me know whether you still need this fixed, or whether the approach @plwhite outlines above solves your problem? (I think we're both keen to get this closed down before tomorrow.)

dmoneil2 commented 7 years ago

Sorry guys, I don't think you understand our deployment model correctly.
Currently we cannot decouple etcd instances via Kubernetes services, as the ports described above conflict with the etcd instance running on the infrastructure.
This is a hard stop if we continue in the direction of using Kubernetes services to decouple the containers (the vision of the Kubernetes deployment approach).

https://kubernetes.io/docs/user-guide/services/

Let's pick it up tomorrow.

plwhite commented 7 years ago

I'm a bit lost, so let me dig a bit deeper into your setup.

Let's say that your K8s node has IP 192.168.0.123. Then the etcd instance for the infrastructure is listening on port 2379 (etc.) on IP 192.168.0.123.

Now you create an etcd pod with etcd running in a container, and that pod gets IP 10.1.2.3 (say). That etcd container is listening on port 2379 on IP 10.1.2.3. The IP 10.1.2.3 might or might not be accessible from the K8s node, depending on how your K8s networking is configured.
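
As a rough illustration of why those two listeners don't clash, the same port can be bound on two different addresses at once; this sketch uses Linux loopback addresses in place of the node and pod IPs and assumes nothing else is already bound to them:

import socket

node = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # stands in for 192.168.0.123
pod = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # stands in for 10.1.2.3
node.bind(("127.0.0.1", 2379))
pod.bind(("127.0.0.2", 2379))  # succeeds: same port, different address
node.listen(1)
pod.listen(1)
print("both etcd-style listeners are up on port 2379")
node.close()
pod.close()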

What's the issue that causes?

dmzoneill commented 7 years ago

The issue has resolved itself; please close the bug.

mirw commented 7 years ago

Thanks! Is there anything we should be adding to our docs in case others hit this in future?

eleanor-merry commented 7 years ago

Ported to https://github.com/Metaswitch/project-clearwater-issues/issues/16