Open · dilipkraghupatruni opened this issue 8 months ago
Is your feature request related to a problem? We are using Argo Rollouts for application deployments to EKS. During the blue-green switch, we are noticing issues related to the AWS Load Balancer Controller when used with an NLB. The NLB takes a lot more time to register new targets than an ALB. Because of the current logic in the aws-load-balancer-controller, the old pods are deregistered first and then the controller tries to register the new pods, and the lag between these two steps is significant enough with an NLB to cause application impact. The order of operations is the same with an ALB, but the ALB performs them fast enough that we don't notice any impact.
Describe the solution you'd like We would like a feature toggle/config in the load balancer controller to swap the operations, meaning we want to register the new pods first and then deregister the old pods (https://github.com/kubernetes-sigs/aws-load-balancer-controller/blob/main/pkg/targetgroupbinding/resource_manager.go#L143-L152), and also to configure the amount of time to wait between the operations as well as the maximum amount of time to wait on deregistration.
Describe alternatives you've considered We have implemented a canary deployment strategy instead of blue-green, and that helped us reduce the impact. But as part of the debugging we figured out the problem is not specific to NLB. We deal with thousands of applications, the majority of which use a blue-green deployment strategy, so we cannot implement canary for everyone. We would like to fix the problem at its core, which is inside the aws-load-balancer-controller.
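To make the requested behavior concrete, here is a purely hypothetical sketch of what such a toggle could look like if it were exposed as annotations on a TargetGroupBinding. None of the example.* annotation names below exist in aws-load-balancer-controller today; they are only meant to illustrate the register-first / wait / deregister-later sequence being asked for.

```yaml
# Hypothetical illustration only - these annotations are NOT implemented in
# aws-load-balancer-controller. The TargetGroupBinding kind and its spec
# fields are real; the annotation names and all values are made up.
apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: my-app-tgb                                               # placeholder name
  annotations:
    example.elbv2.k8s.aws/register-before-deregister: "true"     # hypothetical: add new targets first
    example.elbv2.k8s.aws/registration-settle-seconds: "180"     # hypothetical: wait before deregistering old targets
    example.elbv2.k8s.aws/deregistration-timeout-seconds: "300"  # hypothetical: cap time spent waiting on deregistration
spec:
  serviceRef:
    name: my-app                                                 # placeholder service
    port: 443
  targetGroupARN: arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/my-app/0123456789abcdef  # placeholder ARN
```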
Just wanted to clarify that the amount of time between when deregistration executes and when registration executes is negligible. The deregistration and registration events are essentially occurring at the same time (probably just a few milliseconds between these events).
The problem is that the old targets/pods enter a deregistering state in the NLB Target Group at the same time as the new targets/pods enter an initial state in the NLB Target Group, and NLB Target Group registration takes between 3 to 5 minutes (see this comment). This means that for 3 to 5 minutes there are 0 healthy targets in the NLB Target Group.
Regarding ALBs, I've observed during my testing that ALB Target Group registration typically takes 10-15 seconds. So during a blue-green deploy with an ALB there is a 10-15 second period of time when there are 0 healthy targets in the ALB Target Group.
Having a configurable amount of time to wait after target registration and before target deregistration, as mentioned by @dilipkraghupatruni, should help to maintain healthy targets in the NLB/ALB Target Group during a blue-green flip.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After a period of inactivity, lifecycle/stale is applied
- After further inactivity once lifecycle/stale was applied, lifecycle/rotten is applied
- After further inactivity once lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
The problem here is that it's hard for the controller to decide whether an old target (one that is no longer a valid target) should be removed. In your case, the target should be kept for 2~4 minutes (for NLB), while in most other cases such targets should be deregistered as soon as possible (they may belong to other applications or have already died).
@M00nF1sh: part of @dilipkraghupatruni's suggestion was to add config to control this behavior. That would take the decision-making out of the controller's hands; it would instead perform the operations in the order and at the intervals specified in the config.
Any update on this? Targets in the initial or deregistering phase won't receive new connections.
This is an interesting problem I haven't thought about before. Usually users use Deployments, which handle rollouts of new pods according to the Deployment's spec.strategy, for example using maxSurge and maxUnavailable, so there is always a registered and healthy pod to serve traffic. Is there no way for Argo to use this approach?
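For illustration, here is a minimal sketch of what that looks like in a Deployment manifest; the names, image, and replica count are placeholders, not anything from this issue.

```yaml
# Minimal example of a rolling update that keeps registered, healthy pods
# in place throughout a rollout (placeholder names and image).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 4
  selector:
    matchLabels:
      app: web
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # create at most one extra new pod at a time
      maxUnavailable: 0  # never remove an old pod before its replacement is Ready
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: registry.example.com/web:v2   # placeholder image
          ports:
            - containerPort: 8080
```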
Regarding the time it takes for an NLB to register a target, this has improved a lot and we now see NLB targets going healthy in about 90s (including the initial health check with 3 × 10s health checks), see here.
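For reference, the health-check cadence mentioned above is configurable through the controller's Service annotations; the sketch below only illustrates those knobs (the service name, port, and values are assumptions, not recommendations).

```yaml
# Example NLB Service with aws-load-balancer-controller health-check annotations
# set to a 10s interval and a healthy threshold of 3 checks (placeholder names).
apiVersion: v1
kind: Service
metadata:
  name: web
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "external"
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "10"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: "3"
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
    - port: 443
      targetPort: 8080
```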
When using aws-load-balancer-controller in conjunction with a standard K8s Deployment with a rolling upgrade strategy, the pods are typically gradually replaced from old to new, thus causing some number of pods to always be registered in the AWS LB target group.
The problem in this issue is really specifically related to the way Argo Rollouts works with the Blue Green deployment strategy. When using the Blue Green deployment strategy the Argo Rollouts controller will update the active K8s service selector at the point in time when a new application version is promoted to active. At this point in time all old pods are deregistered and all new pods are registered in the AWS LB target group. Since the deregistration and registration occur at the same time, you end up with a period of time where there are 0 healthy targets registered in the target group, which results in 503 errors occurring for incoming requests.
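For context, here is a minimal sketch of the blue-green Rollout configuration being described (service names are placeholders); the activeService selector swap at promotion time is what triggers the simultaneous deregistration and registration.

```yaml
# Illustrative Argo Rollouts blue-green strategy (placeholder service names).
# On promotion, Argo Rollouts repoints active-svc's selector at the new
# ReplicaSet, so all old targets deregister and all new targets register
# at effectively the same moment.
strategy:
  blueGreen:
    activeService: active-svc
    previewService: preview-svc
    autoPromotionEnabled: false
```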
To solve this problem of Argo Rollouts compatibility with a load balancer managed by aws-load-balancer-controller, I suggest taking a look at the Ping Pong deployment strategy in Argo Rollouts. This strategy is technically included in the canary spec, but you can configure the canary steps in a manner that works as either a traditional Blue Green deployment (i.e. move traffic from old pods to new pods in a single instant) or as a traditional Canary deployment (i.e. gradually increase the number of new pods handling incoming traffic).
The Ping Pong deployment strategy in Argo Rollouts will make use of ALB traffic weights to cause traffic to shift from one application version (replicaset) to another. You can achieve a blue/green deployment using the Ping Pong deployment strategy by using a set of canary steps which promote the canary to active in a single step like this:
strategy:
  canary:
    pingPong:
      pingService: ping-svc
      pongService: pong-svc
    steps:
      - setWeight: 0
      - setCanaryScale:
          weight: 100
      - pause: {}
    trafficRouting:
      alb:
        ingress: active-ingress
        rootService: root-svc
        servicePort: 443
I believe using the Ping Pong deployment strategy resolves the original reason why @dilipkraghupatruni opened this issue.
Thank you for sharing this @jwenz723. So we at least have a solution for ALB-fronted services, but not yet for NLB-fronted services, because NLB does not support weighted target groups, which was the initial concern from @dilipkraghupatruni.