Closed: andrii-korotkov-verkada closed this issue 5 months ago
I ended up using 2 workers per pod and tuning the CPU requests as appropriate.
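For reference, a setup like this could be written in a gunicorn config file roughly as follows. This is only a minimal sketch: `workers = 2` is the choice described above, while the `max_requests` and `max_requests_jitter` values are illustrative assumptions, not numbers from this thread. The CPU request itself is set in the Kubernetes deployment spec, not here.

```python
# gunicorn.conf.py (illustrative sketch; only `workers = 2` comes from the thread)

# Two workers per pod, so one worker can keep accepting requests
# while the other is being recycled.
workers = 2

# Recycle a worker after it has handled this many requests (assumed value).
max_requests = 1000

# Random per-worker offset added to max_requests, so both workers are
# unlikely to restart at the same moment (assumed value).
max_requests_jitter = 100
```

The jitter setting is what makes two workers useful here: without it, both workers could hit the request limit at about the same time.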
I don't really see the point there. Think of your container or pod as a single web server instance. What matters then is where that container is placed, so that you stay resilient across your system. One instance per web app: that's the simplest scheme.
Hey. I hope your day is going well. I've seen a recommendation to set the number of workers to 2 * cores + 1. But in a distributed system on Kubernetes, where each deployment can request a customizable number of cores, the choice becomes trickier. The trade-off is between fewer, larger pods with more workers each and more, smaller pods with fewer workers each. Some examples of configurations include:
A single worker per pod offers the most flexibility in rightsizing the number of pods, but may add a bit more overhead because every pod also runs a master process. Also, since a worker is re-created after reaching max requests, there can be some downtime. Many workers per pod avoids some of these problems, but only allows scaling in bigger units and can lead to overprovisioning in regions with little traffic (for example, CPU utilization can be low even with min replicas set to 3 for availability reasons).
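To make the trade-off concrete, here is a rough sketch of the arithmetic. The pod sizes in the loop are hypothetical and not taken from any real deployment; the 2 * cores + 1 rule and the min replicas value of 3 are the ones mentioned above. The helper names are made up for illustration.

```python
def workers_for(cores: int) -> int:
    """Rule of thumb from the question: 2 * cores + 1 workers."""
    return 2 * cores + 1


def reserved_cores_at_min(cores_per_pod: int, min_replicas: int) -> int:
    """Cores reserved even when idle: every replica keeps its CPU request."""
    return cores_per_pod * min_replicas


# Hypothetical pod sizes (CPU request per pod, in whole cores).
MIN_REPLICAS = 3  # availability floor mentioned in the question
for cores in (1, 2, 4):
    print(
        f"{cores} core(s) per pod: "
        f"{workers_for(cores)} workers per pod, "
        f"{reserved_cores_at_min(cores, MIN_REPLICAS)} cores reserved at min replicas, "
        f"scales in steps of {cores} core(s)"
    )
```

Under these assumptions, larger pods raise both the idle floor in low-traffic regions and the size of each scaling step, which is the overprovisioning concern described above.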
What's the best choice here? Thank you.