batCoder95 opened this issue 5 years ago
This can be solved by adding configuration to your Deployment so that pods shut down gracefully. The script only seems to update the deployment configuration to decrease the number of replicas; the rest is taken care of by the Kubernetes API itself. Here is some documentation about it: https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods
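For reference, the relevant knob lives on the pod spec itself. A minimal sketch of a Deployment that grants in-flight work time to finish after SIGTERM (the names, image, and grace period below are illustrative, not taken from this issue):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sqs-worker                     # illustrative name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: sqs-worker
  template:
    metadata:
      labels:
        app: sqs-worker
    spec:
      # On scale-down, Kubernetes sends SIGTERM and waits this long before
      # sending SIGKILL, giving the worker time to finish its message.
      terminationGracePeriodSeconds: 120
      containers:
        - name: worker
          image: example/sqs-worker:latest   # illustrative image
```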
What we do in this situation is make sure our pods that scale up and down can handle signals. In some instances we simply exit and assume SQS will redeliver the message after the visibility timeout elapses. In other cases (e.g. long-running tasks) we push the message to another queue so it can be picked up in near real time by a twin process during the graceful shutdown.
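As an illustration of the signal-handling approach, here is a minimal Python sketch of an SQS consumer that finishes its in-flight message before exiting on SIGTERM. It assumes boto3 as the SQS client; the queue URL and the `process()` function are hypothetical placeholders:

```python
import signal
import sys

import boto3  # assumed SQS client; any SDK with receive/delete works the same way

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/work-queue"  # hypothetical

shutting_down = False

def handle_sigterm(signum, frame):
    # Kubernetes sends SIGTERM when it scales the Deployment down; set a
    # flag so the loop stops after the in-flight message, not mid-work.
    global shutting_down
    shutting_down = True

signal.signal(signal.SIGTERM, handle_sigterm)

sqs = boto3.client("sqs")

while not shutting_down:
    resp = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=10,  # bounded long poll so the shutdown flag is re-checked
    )
    for msg in resp.get("Messages", []):
        process(msg["Body"])  # hypothetical work function
        # Delete only after successful processing; if the pod dies before this,
        # SQS redelivers the message once its visibility timeout elapses.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])

sys.exit(0)
```

For this to work, the grace period in the Deployment sketch above needs to be at least as long as the worst-case processing time of one message, so the loop can drain before SIGKILL arrives.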
I have a minimum of 2 replicas of my pod running at any time. If the SQS message count exceeds 2, say it becomes 5, the autoscaler should spin up 3 more replicas to serve the 3 excess messages. This part (scale-up) is working fine.
Now, once there are 5 replicas running, the autoscaler sometimes kills a pod that is actively processing an SQS message.
Please help. Thanks in advance.