kubernetes-sigs / aws-load-balancer-controller

A Kubernetes controller for Elastic Load Balancers
https://kubernetes-sigs.github.io/aws-load-balancer-controller/
Apache License 2.0

Reuse target groups across multiple ingress objects for the same service #3680

Closed: zoltanpeto closed this issue 1 month ago

zoltanpeto commented 1 month ago

Is your feature request related to a problem?

We are building a multi-tenant system, where each tenant has its own ingress object and hostname, and all tenants are served by the same backend service.

Currently, LBC creates 1 rule and 1 target group for each tenant within a single ALB (they all use the same group.name), with the targets shared across the target groups.
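For context, a minimal sketch of this setup (names and hosts are illustrative): each tenant gets an ingress like the one below, differing only in its name and host, all joined into one ALB via group.name and all backing the same Service.

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: tenant-a
      annotations:
        alb.ingress.kubernetes.io/group.name: shared-alb
    spec:
      ingressClassName: alb
      rules:
        - host: tenant-a.example.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: web-server # the same Service for every tenant
                    port:
                      number: 80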

Our problem is that while the limit of 100 rules per ALB is a soft limit, the limit of 100 target groups per ALB is a hard one, so we cannot possibly support more than 100 tenants with a single ALB instance.


Describe the solution you'd like

We were wondering if it's possible for LBC to create one target group per service referenced across all ingress objects, rather than one target group per ingress object. This way, we could request increases to the 100 rules/ALB soft limit as we grow, without being restricted by the number of target groups.

Describe alternatives you've considered

Currently we have a couple of alternatives, such as sharding tenants across multiple ALBs by using a different group.name annotation per shard.

andreybutenko commented 1 month ago

Hi @zoltanpeto, thanks for the question and for sharing your use case!


It sounds like all your ingresses have the same target and you don't need a separate target group for each ingress. Another alternative to consider is the alb.ingress.kubernetes.io/actions.forward* annotations. You can configure this annotation on all your ingresses such that all rules will forward to a single target group :) The downside of this option is that you will need to manage the target group outside of the controller.

For example:

    alb.ingress.kubernetes.io/actions.forward-single-tg: >
      {"type":"forward","targetGroupARN": "arn-of-your-target-group"}

Otherwise, the alternative you described with different group.name annotations will work with LBC. From an operational and architectural perspective, this sounds like a good direction for your use case. Using a single ALB for all your tenants involves availability trade-offs:

  1. If the ALB has an issue, it is a single point of failure for all your tenants
  2. Your tenants may encounter "noisy neighbor" issues - for example, if one tenant experiences a large burst in traffic, the shared ALB can take time to scale up

Of course, the architecture depends on your needs :) If you'd like to learn more, AWS has a whitepaper on multi-tenancy for SaaS applications, and a general Well-Architected framework.


After consulting with a teammate, I've confirmed that it is not feasible for the LBC to implement the solution you described. Today, each ingress object is treated as an independent stack with its own configuration, lifecycle, and resources, so that changes to one do not affect the others.


Please post if you have any other questions or concerns, thank you :)

/kind feature

zoltanpeto commented 1 month ago

Hi @andreybutenko ,

Thanks for coming back on this, your reply helped a lot! On Friday I tested your suggestion of using the alb.ingress.kubernetes.io/actions.forward-single-tg annotation. As we're using Terraform to deploy both shared and tenant-specific resources in two different configs, this worked out really well:

"alb.ingress.kubernetes.io/actions.forward-to-tg" = jsonencode({
  type           = "forward"
  targetGroupArn = data.aws_lb_target_group.web_server.arn
})
rule {
      host = local.tenant_hostname
      http {
        path {
          backend {
            service {
              name ="forward-to-tg"
              port {
                name = "use-annotation"
              }
            }
          }
          path = "/*"
        }
      }
    }

This worked as expected: LBC manages the targets within the target group and automatically attaches it to the ALB it manages. This allows us to use a single ALB for hundreds of tenants. ALB scalability and the single point of failure are not really concerns for us, as we've already been running a single ALB with 200+ tenants in EC2 land and never encountered an issue (at least not at the ALB level :) ).

Thank you again for your support, hope this issue will help others too in the future. All the best!

andreybutenko commented 1 month ago

Awesome! I'm so glad the solution worked for you :) We'll close this issue; feel free to open another if you have other questions.