Choosing a number of workers in a distributed system scenario

andrii-korotkov-verkada commented 8 months ago

Hey. I hope your day is going well. I've seen the recommendation to set the number of workers to 2 * cores + 1. But in a distributed system running on Kubernetes, where each deployment can request a customizable number of cores, the choice becomes trickier. The trade-off is between fewer, larger pods with more workers each and more, smaller pods with fewer workers each. Some example configurations:

1 requested cpu core per container and 2 or 3 workers.
1 requested cpu core per container and 1 worker.
0.5 requested cpu cores per container with 1 worker.
X requested cpu cores per container with 2 * X + 1 workers.
X workers and tuned requested cpu based on the actual load.
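
For reference, here is a minimal sketch of how the 2 * cores + 1 rule of thumb maps onto the CPU requests listed above. The function name `workers_for_cpu_request` is hypothetical, and rounding a fractional request down before adding 1 is an assumption, since the rule is normally stated for whole cores.

```python
import math


def workers_for_cpu_request(cpu_request: float, min_workers: int = 1) -> int:
    """Apply the 2 * cores + 1 rule of thumb to a (possibly fractional) CPU request.

    Assumption: 2 * cpu_request is rounded down before adding 1, so requests
    below half a core fall back to a single worker.
    """
    if cpu_request <= 0:
        raise ValueError("cpu_request must be positive")
    return max(min_workers, math.floor(2 * cpu_request) + 1)


if __name__ == "__main__":
    # Print the suggested worker count for a few container sizes.
    for cpu in (0.25, 0.5, 1, 2):
        print(f"{cpu} requested cores -> {workers_for_cpu_request(cpu)} workers")
```

Under these assumptions, a 1-core container gets 3 workers, which matches the first configuration in the list, while the 1-worker options sit below what the rule would suggest.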

The 1-worker choice offers the most flexibility in right-sizing the number of pods, but may carry a bit more overhead, since every pod also runs a master process. Also, because a worker is re-created after max requests, there can be some downtime while the only worker restarts. The many-worker choice avoids some of these problems, but only allows scaling in bigger units and can lead to overprovisioning in regions with little traffic (for example, CPU utilization can be low even with min replicas set to 3 for availability reasons).
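
The overprovisioning point can be illustrated with some rough arithmetic. This is only a sketch: the function `provisioned_cores`, the 70% target utilization, and the example load of 0.3 cores are all assumptions made for illustration, not measurements.

```python
import math


def provisioned_cores(
    load_cores: float,
    cores_per_pod: float,
    min_replicas: int = 3,
    target_utilization: float = 0.7,
) -> tuple[int, float]:
    """Return (replicas, total requested cores) for a given steady load.

    Assumptions: the autoscaler keeps average CPU utilization at or below
    target_utilization and never goes below min_replicas.
    """
    # Replicas needed so that each pod stays under the utilization target.
    needed = math.ceil(load_cores / (cores_per_pod * target_utilization))
    replicas = max(min_replicas, needed)
    return replicas, replicas * cores_per_pod


if __name__ == "__main__":
    # A low-traffic region that needs about 0.3 cores of actual work.
    for pod_size in (2.0, 0.5):
        replicas, total = provisioned_cores(0.3, pod_size)
        print(f"{pod_size} cores/pod: {replicas} replicas, {total} cores requested")
```

With min replicas fixed at 3, the larger pods reserve several times more CPU for the same small load, which is the overprovisioning effect described above.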

What's the best choice here? Thank you.

andrii-korotkov-verkada commented 8 months ago

I've ended up using 2 workers and tuning the CPU requests as appropriate.
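
A minimal sketch of what such a setup could look like in a `gunicorn.conf.py` file. Only the worker count comes from the comment above; the `max_requests` and `max_requests_jitter` values are hypothetical and would need tuning for a real service.

```python
# gunicorn.conf.py (illustrative values, not a recommendation)

# Two workers per pod; the CPU request is tuned separately in the
# Kubernetes deployment.
workers = 2

# Recycle each worker after this many requests (hypothetical value).
max_requests = 1000

# Add a random offset to each worker's limit so that both workers are
# unlikely to restart at the same moment (hypothetical value).
max_requests_jitter = 100
```

With two workers and some jitter, one worker can usually keep serving while the other is being re-created, which is the downtime concern raised for the 1-worker option.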

benoitc commented 5 months ago

I don't really see the point there. Consider your container or pod as a single web server instance. What matters then is rather the placement of this container, to ensure you are resilient across your system. One instance per web app. That's the easiest scheme.

benoitc / gunicorn

Choosing a number of workers in a distributed system scenario #3159