k3s-io / docs

k3s Documentation
http://docs.k3s.io

Traefik is not segregating Ingresses by lbpool #292

Closed simonfoilen closed 2 months ago

simonfoilen commented 3 months ago

Hi all,

When I first read the doc https://github.com/k3s-io/docs/blob/main/docs/networking/networking-services.md , it seemed to indicate that Traefik runs on all the nodes by default, but that enabling lbpools would make it possible to segregate Ingresses per lbpool.

So I spent a couple of hours trying to make it work that way, and I finally opened an issue on the code project https://github.com/k3s-io/k3s/issues/10434 . The maintainer told me it was supposed to work that way per that doc. So I tried again to make it work, downloaded the k3s code to see if I could find anything, and found that it is actually not possible to make it segregate by lbpool.

It is totally fine that it is not supported, but since newcomers like me and experts like the k3s team are confused on the expected behavior, it would be great to add that precision:

Thanks a lot. If you want me to provide a PR, let me know; I can try.

brandond commented 3 months ago

I don't know what you mean by "segregate by lbpool". I think you're getting confused by some of the layers here.

The Traefik helm chart creates a single ingress controller Deployment, which is exposed via a single LoadBalancer Service. Traefik uses the IPs of this Service to set the addresses that it advertises in the Ingress resource's status field. When using ServiceLB as the LoadBalancer controller, the default behavior is to advertise every node's IP in the LoadBalancer Service status, so this is what Traefik propagates through into the Ingress status.
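To illustrate the propagation described above, here is a minimal Python sketch. It is only a toy model, not Traefik's actual code: the `Service` and `Ingress` classes, the `publish_status` function, and the IP addresses are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Service:
    """Toy stand-in for a LoadBalancer Service and its status addresses."""
    name: str
    lb_addresses: List[str] = field(default_factory=list)


@dataclass
class Ingress:
    """Toy stand-in for an Ingress and the addresses in its status."""
    name: str
    status_addresses: List[str] = field(default_factory=list)


def publish_status(service: Service, ingresses: List[Ingress]) -> None:
    # Assumption for this sketch: the controller copies the addresses of
    # ONE Service into the status of every Ingress it manages.
    for ingress in ingresses:
        ingress.status_addresses = list(service.lb_addresses)


# With ServiceLB defaults, the Service status lists every node's IP
# (the addresses below are made up).
traefik_service = Service("traefik", ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
ingresses = [Ingress("site-a"), Ingress("site-b")]
publish_status(traefik_service, ingresses)
```

In this model every Ingress ends up with the same address list, because there is only one Service to copy from.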

It sounds like you want to:

  1. Set up multiple LoadBalancer Services for Traefik, instead of just one.
  2. Restrict each LoadBalancer to running on a subset of nodes.
  3. Bind each Service to a different IngressClass, so that you are able to set the IngressClassName on your Ingress resources in order to select which LoadBalancer service's IPs are advertised in the Ingress status.

Is that correct?

If I am understanding your ask properly, I don't think this is possible to accomplish in a single installation of Traefik. You would need to install multiple copies of Traefik, each with its own LoadBalancer Service and IngressClass. This is a limitation of Traefik, not of K3s or the Traefik Helm Chart.

Ref:

simonfoilen commented 3 months ago

It sounds like you want to

No, I don't want to, but yes, that is what I was expecting based on the documentation. I was just testing k3s by following the documentation, and when reading about lbpool, it looked as if we would be able to put an "lbpool" selector on an Ingress for Traefik to pick it up. I thought it would make sense for clusters with 1000 nodes and 10000 Ingresses to not have every single node with those same 10000 ingresses configured. I was just testing that out; I don't need it.

When I saw it wasn't working, I created that issue and you said it should work while sending me the same documentation link, but after reading the k3s code, I saw it wasn't possible and that is not what we should expect from the documentation. You then confirmed I was right.

So, we just need to update the doc so that no one else spends hours like me testing for something that k3s simply does not do out of the box (and it is perfectly fine that it does not, since that is complex and switching from one mode to the other would break existing deployments).

Thanks,

brandond commented 3 months ago

I thought it would make sense for clusters with 1000 nodes and 10000 Ingresses to not have every single node with those same 10000 ingresses configured.

ServiceLB would never be used on a cluster of that scale. ServiceLB is a very rudimentary LoadBalancer controller, intended to be just enough to make LoadBalancer services work out-of-the-box on K3s without any additional work on the part of the user.

This is touched on in the docs: https://docs.k3s.io/networking/networking-services#service-load-balancer

Upstream Kubernetes allows Services of type LoadBalancer to be created, but doesn't include a default load balancer implementation, so these services will remain pending until one is installed. Many hosted services require a cloud provider such as Amazon EC2 or Microsoft Azure to offer an external load balancer implementation. By contrast, the K3s ServiceLB makes it possible to use LoadBalancer Services without a cloud provider or any additional configuration.

At any significant scale a real load-balancer controller would be deployed instead. Most real load-balancers (MetalLB, kube-vip, F5, Brocade, AWS ELB/ALB, and so on) will use Virtual IPs or some other externally managed address for the load-balancer, and that is what you would see show up in the Service (and Ingress) status.

it looked as if we would be able to put an "lbpool" selector on an Ingress for Traefik to pick it up

You seem to be continuing to confuse Ingress resource with Service resources. I don't think the docs suggest that you can use lbpool annotations on Ingresses - just on Services.

brandond commented 3 months ago

I will also note that the Ingress model by design funnels all traffic through a single entry point (Service). Reusing a single entry point for multiple websites by examining host headers or TLS SNI handshakes is the entire point of using an Ingress controller. If you want to instead expose different websites via different entry points into your cluster, the Service abstraction may be what you want.

simonfoilen commented 3 months ago

You seem to be continuing to confuse Ingress resource with Service resources. I don't think the docs suggest that you can use lbpool annotations on Ingresses - just on Services

Yes the documentation is confusing. Let me show you the thought process for a newcomer reading: https://github.com/k3s-io/docs/blob/main/docs/networking/networking-services.md

My suggestion is just to make it very explicit that this "Service Load Balancer" section about lbpool is not for Traefik. (see the description of this issue)

If you want to instead expose

Again, I don't want anything (besides testing what is coming out of the box of k3s by following the documentation as a tutorial). We don't need to discuss all the possible use cases and the best practices. I am just talking about someone testing k3s by following its documentation to see what it is capable of and that person spending too much time on that section because it is unclear.

Personally, I ended up creating a 3-node cluster. I don't mind having all the Ingresses on all the nodes via the default Traefik, and I am happy with k3s. I just don't want any other newcomer to spend hours like me following the documentation and getting stuck on that lbpool feature, which by design does not work that way with Traefik, even though the current documentation can be read as suggesting that it should.

That is a very small documentation change. Max 10 minutes. I don't know why this discussion keeps getting longer with all the possible use cases. All I am talking about is the actually supported use case.

brandond commented 3 months ago

My suggestion is just to make it very explicit that this "Service Load Balancer" section about lbpool is not for Traefik.

It can be for Traefik, or at least for the Traefik LoadBalancer Service. I showed an example of how to do that at https://github.com/k3s-io/k3s/issues/10434#issuecomment-2198443482. It is NOT for Ingress resources, and it is not implied or stated anywhere that it is.

Are you suggesting that we need to explain to users that Services and Ingresses are not the same thing?

simonfoilen commented 3 months ago

yes it can be for Traefik. That was the second bullet point in the current issue's description:

The problem is that I spent hours trying the 3rd point, which is not supported, but it looks as if it would be supported per the documentation I read and the documentation you pointed me to when I opened the initial issue. So, yeah, everyone is confused by that part of the documentation, and that is why I suggest putting these 3 bullet points in (of course, not as-is, but worded correctly for the documentation).

Are you suggesting that we need to explain to users that Services and Ingresses are not the same thing?

No. The Kubernetes documentation explains it already. I am suggesting to explain what K3S Traefik out of the box supports (1 and 2) and more importantly, what it doesn't (3)

brandond commented 3 months ago

cannot limit Ingresses (by limiting Traefik) on different lbpools at the same time

Correct, you cannot split a single service across multiple pools. There is a single Service for Traefik, and it follows that the Ingress Controller is limited to advertising the addresses of the cluster members used by that single Service. That means all the nodes by default, or the lbpool members if you chose to add those labels to your nodes and/or service.

The Traefik ingress controller deploys a LoadBalancer Service that uses ports 80 and 443. By default, ServiceLB will expose these ports on all cluster members. To select a particular subset of nodes to host pods for a LoadBalancer, add the enablelb label to the desired nodes, and set matching lbpool label values on the Nodes and Services.
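The node selection described above can be sketched as a small Python model. It is a simplification, not the ServiceLB implementation: the full label keys are written as I recall them from the K3s docs and should be checked there, and the `eligible_nodes` function and node names are hypothetical.

```python
from typing import Dict, List

# Assumed full label keys; the thread only calls them "enablelb" and "lbpool".
ENABLE_LABEL = "svccontroller.k3s.cattle.io/enablelb"
POOL_LABEL = "svccontroller.k3s.cattle.io/lbpool"


def eligible_nodes(nodes: Dict[str, Dict[str, str]],
                   service_labels: Dict[str, str]) -> List[str]:
    """Return the nodes that would host LoadBalancer pods for one Service."""
    # If any node carries enablelb=true, only labeled nodes are eligible.
    allow_list_mode = any(
        labels.get(ENABLE_LABEL) == "true" for labels in nodes.values()
    )
    pool = service_labels.get(POOL_LABEL)
    selected = []
    for name, labels in nodes.items():
        if allow_list_mode and labels.get(ENABLE_LABEL) != "true":
            continue
        # A Service with an lbpool label only uses nodes in the same pool.
        if pool is not None and labels.get(POOL_LABEL) != pool:
            continue
        selected.append(name)
    return sorted(selected)


nodes = {
    "node-a": {ENABLE_LABEL: "true", POOL_LABEL: "pool1"},
    "node-b": {ENABLE_LABEL: "true", POOL_LABEL: "pool1"},
    "node-c": {ENABLE_LABEL: "true", POOL_LABEL: "pool2"},
    "node-d": {},
}

# The single Traefik Service can be pinned to one pool, not split across two.
traefik_pool1 = eligible_nodes(nodes, {POOL_LABEL: "pool1"})
traefik_no_pool = eligible_nodes(nodes, {})
```

The point of the model is that the pool is chosen per Service. Since Traefik has one Service, all of its Ingresses share whichever set of nodes that Service selects.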

I am suggesting to explain what K3S Traefik out of the box supports

This is not a "K3s Traefik" thing. Traefik will only set Ingress status based on a single Service, period, regardless of whether you use our packaged chart, or install it directly from upstream. You can work around this limitation by installing multiple copies of Traefik configured to use unique IngressClassNames, and configure your Ingress resources appropriately... but that is a fairly advanced use case, and by the time someone wants to do that I would generally expect them to understand how ingresses and services are tied together, as well as how to use multiple IngressClass definitions to allow different Ingress controllers to manage different Ingress resources.
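As a rough illustration of the workaround mentioned above, the Python sketch below builds Ingress manifests as plain dictionaries that differ only in `spec.ingressClassName`. The class names `traefik-pool1` and `traefik-pool2`, the hostnames, and the `make_ingress` helper are hypothetical. The sketch assumes that two separate Traefik installations, each with its own LoadBalancer Service and IngressClass, already exist, and it does not show how to install them.

```python
from collections import defaultdict
from typing import Dict, List


def make_ingress(name: str, host: str, backend: str, ingress_class: str) -> dict:
    """Build a networking.k8s.io/v1 Ingress manifest as a dictionary."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            # Selects which ingress controller installation handles this Ingress.
            "ingressClassName": ingress_class,
            "rules": [{
                "host": host,
                "http": {"paths": [{
                    "path": "/",
                    "pathType": "Prefix",
                    "backend": {
                        "service": {"name": backend, "port": {"number": 80}}
                    },
                }]},
            }],
        },
    }


manifests = [
    make_ingress("site-a", "a.example.com", "site-a", "traefik-pool1"),
    make_ingress("site-b", "b.example.com", "site-b", "traefik-pool2"),
    make_ingress("site-c", "c.example.com", "site-c", "traefik-pool1"),
]

# Group Ingress names by class to see which installation would manage each.
by_class: Dict[str, List[str]] = defaultdict(list)
for manifest in manifests:
    by_class[manifest["spec"]["ingressClassName"]].append(
        manifest["metadata"]["name"]
    )
```

Each installation would then publish the addresses of its own Service only into the Ingresses of its class.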

brandond commented 3 months ago

Does the PR I linked explain the link between Ingress status, Traefik's LoadBalancer Service, and ServiceLB lbpools a little better?

simonfoilen commented 3 months ago

This is not a "K3s Traefik" thing. Traefik will only set Ingress status based on a single Service, period, regardless of whether you use our packaged chart, or install it directly from upstream

It is K3S choice to install a single global service instead of installing 1 service per lbpool value (that could have been a possibility). And that is a correct choice.

Does the PR I linked explain the link between Ingress status, Traefik's LoadBalancer Service, and ServiceLB lbpools a little better?

Yes, that is great. It links to the lbpool section by saying what it can be used for (case 2), so there is no more guessing whether the lbpool section covers case 2 or case 3: it really is only case 2.

thanks a lot for your time :)

brandond commented 3 months ago

It is K3S choice to install a single global service instead of installing 1 service per lbpool value

No, it's not. This is a Traefik design limitation. I even linked you to the Traefik project's docs and chart values where Traefik makes it clear that it only supports copying status from a single Service. We could add as many services as you could imagine pointing at the Traefik deployment, and Traefik will still only support copying the status from a single one. And again, the LBPool creation is on the end-user... labeling more nodes for different pools does not magically trigger the Traefik helm chart to create more Services, even if that was something that would be useful to do.

simonfoilen commented 3 months ago

But you said "You would need to install multiple copies of Traefik, each with its own LoadBalancer Service and IngressClass"

So that is doable, no?

brandond commented 3 months ago

That is definitely something you could choose to do for yourself. There is no circumstance under which we would proactively install multiple parallel releases of the Traefik helm chart and try to manage that on behalf of the user. While it can theoretically be done, I don't think Traefik's docs offer any specific steps for how to do this, and it would be the opposite of the "Lightweight Kubernetes" philosophy that we advertise.

K3s makes what we refer to in our readme as "opinionated choices" for our default components, such that k3s works out of the box for most simple use cases, and allows simple configuration changes for slightly more advanced installation. If you need something drastically different, you are welcome to disable specific packaged components and deploy something else that better suits your unique needs - or just use a different Kubernetes distro entirely if you don't like any of our defaults.

simonfoilen commented 3 months ago

I am not sure why you are telling me that. I just said that it is something it could do. So now that the doc is clearer, we will all know that k3s does not do that.

brandond commented 3 months ago

sure, it's just code. it could also stand up and dance on its head.

simonfoilen commented 3 months ago

That is why I like code, we can do all kinds of stuff with it. At least now the doc doesn't suggest it does more than it can.

thanks again. Cheers :)