Open plombardi89 opened 6 years ago
I started examining this problem shortly after releasing Kubernaut. Unfortunately, I have discovered there is no simple ideal solution to this problem.
Kubernaut does not support type: LoadBalancer services because it presents a serious Quality-of-Service issue for multiple users. Here are the facts:
- A LoadBalancer service on AWS creates an Amazon ELB.
- [...] LoadBalancer services that can be created and therefore we cannot limit the number of Amazon ELB instances that are created.
- [...] kubectl they see their LoadBalancer service stuck in a pending state without explanation. To these users the system appears broken.
- This is a small problem and is internal to our operation of the service.
Amazon ELB instances are NOT free. They are charged at $0.025/hr (~$18.30/mo) just to run. Data is charged at $0.008/GB, in or out. At a minimum, assuming 1 user is consistently using the service for a whole month, we spend about $18/user just for the ELB. Data is cheap, and a user would need to pump a lot of data through the system to cost us a lot of money, but let's once again assume humans are dumb or malicious and someone decides to upload a 100MB archive in a loop through their service. We need to have monitoring in place that allows us to cut that user off. That is engineering effort for us just to prevent someone from causing financial harm.
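To make the numbers above concrete, here is a rough back-of-the-envelope sketch. The two prices are the ones quoted above; the 30.5-day month is an assumption picked because it reproduces the ~$18.30/mo figure, and the function names are made up for illustration.

```python
import math

# Prices quoted in this issue. HOURS_PER_MONTH is an assumption (30.5 days).
ELB_HOURLY_USD = 0.025
DATA_USD_PER_GB = 0.008
HOURS_PER_MONTH = 24 * 30.5


def monthly_elb_cost(elb_count: int = 1, data_gb: float = 0.0) -> float:
    """Estimated monthly spend: always-on ELBs plus data pushed through them."""
    return elb_count * ELB_HOURLY_USD * HOURS_PER_MONTH + data_gb * DATA_USD_PER_GB


def uploads_to_match_baseline(upload_mb: float = 100.0) -> int:
    """Number of uploads of this size whose data charge equals one idle ELB-month."""
    baseline_usd = ELB_HOURLY_USD * HOURS_PER_MONTH
    gb_needed = baseline_usd / DATA_USD_PER_GB
    return math.ceil(gb_needed / (upload_mb / 1000.0))  # treats 1 GB as 1000 MB


if __name__ == "__main__":
    print(f"idle ELB, one month: ${monthly_elb_cost():.2f}")
    print(f"with 1 TB of traffic: ${monthly_elb_cost(data_gb=1000):.2f}")
    print(f"100MB uploads to double the bill: {uploads_to_match_baseline()}")
```

Under these assumptions it takes on the order of 23,000 uploads of a 100MB archive before the data charge matches the idle cost of a single ELB, which is well within reach of a script left running in a loop.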
I examined whether it was possible to disable the ELB provisioning functionality in the AWS cloud provider integration for Kubernetes, and the answer is that it is not possible. My idea was to disable it and replace it with a different service controller that would talk to, say, an HAProxy or Nginx cluster we ran that could act as a multi-tenant TCP load balancer.
A brief technical explanation:
When kube-controller-manager and kubelet come up you specify a cloud provider via a CLI switch (--cloud-provider). In our case we specify aws, which enables the AWS integration for Kubernetes and makes it possible to run on that cloud and use that cloud's API to bootstrap the cluster machinery (e.g. it inspects the EC2 metadata service). It also enables things like using EBS volumes for storage.
There is no exposed way to override the services-controller that I could find. The services-controller is responsible for actually orchestrating the creation of the LoadBalancer (create load balancer, create firewall rules, add/remove nodes to the backend pool).
If we want this kind of customizability it seems we need to get involved in the Kubernetes development process:
- We need to consult the Kubernetes team on the best path forward.
- We need to modify the Kubernetes code and get it shipped in a release OR alternatively find someone to do it (e.g. ask for an enhancement in an Issue).
- Deployment tools, specifically kubeadm in our case, need to be updated to expose the new configuration mechanism for specifying the new services-controller.
Doing additional implementation research. To avoid hard AWS limits we will need to configure the following:
- DisableSecurityGroupIngress: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/aws/aws.go#L441
- ElbSecurityGroup: https://github.com/kubernetes/kubernetes/blob/master/pkg/cloudprovider/providers/aws/aws.go#L446
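As a sketch of what setting these two options might look like: my understanding is that the AWS provider reads them from an INI-style cloud config file under a [Global] section, handed to the components with --cloud-config, but treat that file layout as an assumption to verify against the linked source. The helper name and the security group ID below are placeholders, not anything we run today.

```python
def render_cloud_config(elb_security_group_id: str) -> str:
    """Render the body of a cloud config file for the AWS provider.

    Assumption: options live in a [Global] section of an INI-style file.
    The two option names are taken from the aws.go lines linked above.
    """
    lines = [
        "[Global]",
        # Stop the provider adding per-ELB ingress rules to the node security group.
        "DisableSecurityGroupIngress = true",
        # Attach every ELB to one pre-created security group instead.
        f"ElbSecurityGroup = {elb_security_group_id}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "sg-0123456789abcdef0" is a placeholder ID for illustration only.
    print(render_cloud_config("sg-0123456789abcdef0"), end="")
```

The rendered file would then need to be placed on the masters and nodes before the control plane starts, which is where the kubeadm changes come in.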
We will also need to adjust our usage of kubeadm during bootstrap: https://kubernetes.io/docs/admin/kubeadm/
Specifically see "Cloudprovider integrations (experimental)".
Suggested approach I learned of at Velocity NYC from Kelsey Hightower:
- Use RBAC to limit to 1 if that's desired. This would make users not quite admins in the Kubernetes cluster, but that's probably OK as it is a very small corner.
- Write a control loop that monitors the Kubernetes API and does the endpoint -> service mapping magic. Watch for "type: LoadBalancer" services.
- Use the PATCH mechanism to change the status of the LoadBalancer service with the public IP that was created via an external process (e.g. a multi-tenant nginx TCP load balancer).
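A minimal sketch of the decision logic such a control loop might use, written against plain dicts shaped like Service objects from the API. It only works out which services still need an address and what PATCH request to send to the Service status subresource; the actual watch and HTTP calls are left out, and allocate_ip is a hypothetical stand-in for whatever external process hands out addresses on the shared load balancer.

```python
import json
from typing import Callable, Iterable, List, Tuple


def needs_external_lb(service: dict) -> bool:
    """True for type: LoadBalancer services that have no ingress address yet."""
    spec = service.get("spec") or {}
    status = service.get("status") or {}
    ingress = (status.get("loadBalancer") or {}).get("ingress")
    return spec.get("type") == "LoadBalancer" and not ingress


def status_patch_request(service: dict, ip: str) -> Tuple[str, str]:
    """Build (path, JSON merge-patch body) for the Service status subresource."""
    meta = service["metadata"]
    path = f"/api/v1/namespaces/{meta['namespace']}/services/{meta['name']}/status"
    body = json.dumps({"status": {"loadBalancer": {"ingress": [{"ip": ip}]}}})
    return path, body


def reconcile(
    services: Iterable[dict], allocate_ip: Callable[[dict], str]
) -> List[Tuple[str, str]]:
    """One pass of the loop: return the PATCH requests that should be sent.

    allocate_ip is hypothetical; it represents the external process that
    configures the multi-tenant load balancer and returns its public IP.
    """
    return [
        status_patch_request(svc, allocate_ip(svc))
        for svc in services
        if needs_external_lb(svc)
    ]
```

Each returned body would be sent as a PATCH with Content-Type application/merge-patch+json. Keeping the decision logic separate from the API calls makes it easy to test without a cluster.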
Discovered this while poking around for AWS and Kubernetes stability issues... this would be an additional implementation problem: https://github.com/kubernetes/kubernetes/issues/29298
Kubernaut currently does not support LoadBalancer services, but many people use type: LoadBalancer in their service manifests. The current workaround is to use type: NodePort, but this means changing manifests specifically for Kubernaut, which is undesirable.