ansible-community / ara

ARA Records Ansible and makes it easier to understand and troubleshoot.
https://ara.recordsansible.org
GNU General Public License v3.0

Improvement in gunicorn container settings #322

Open hille721 opened 3 years ago

hille721 commented 3 years ago

What is the idea ?

I'm not sure the current gunicorn settings in the official ara images are really optimized for container usage:

--workers=4

Starting 4 workers means 4 processes inside the container, which is vertical scaling within the container. But aren't containers about horizontal scaling? Instead of spawning more processes in one container, we would simply run more containers.

I found this nice guide: https://pythonspeed.com/articles/gunicorn-in-docker/ and tried the recommended settings. With them I am able to spawn more containers, each with fewer resources, which works much better on my container platform (OpenShift).

The guide is from 2019 and I'm not an expert on the topic, but maybe some people here can jump into the discussion :)

dmsimard commented 3 years ago

Hi and thanks for the issue !

To be fair, I must say that there are no claims that the container images published by the project are intended or optimized for production use at a large scale in the docs:

The scripts are designed to yield images that are opinionated and “batteries-included” for the sake of simplicity. They install the necessary packages for connecting to MySQL and PostgreSQL databases and set up gunicorn as the application server.

You are encouraged to use these scripts as a base example that you can build, tweak and improve the container image according to your specific needs and preferences.

For example, precious megabytes can be saved by installing only the things you need and you can change the application server as well as its configuration.

That is not to say that we cannot improve the base image we publish but the objective is more about getting people started quickly and then allowing users to tweak on their own by showing them how the sausage is made.

That said, it wouldn't be a bad idea to benchmark different approaches and settings to find out what works best and what doesn't so we can make an informed decision. I personally like gunicorn but there's also uwsgi and other ways to run the application if people really want to.

Edit: links to existing benchmarks:

VannTen commented 3 hours ago

It looks like gunicorn and containers don't go very well together

We're currently doing a proof of concept with ara on Kubernetes to record our playbook runs, using the provided images, and we consistently get WORKER TIMEOUT errors. We are only making simple curl calls with not much data, and using sqlite for now since we're just trying ara out.


127.0.0.1 - - [10/Oct/2024:13:14:00 +0000] "GET / HTTP/1.1" 200 231491 "-" "curl/8.10.1"
127.0.0.1 - - [10/Oct/2024:13:14:03 +0000] "GET / HTTP/1.1" 200 231491 "-" "curl/8.10.1"
127.0.0.1 - - [10/Oct/2024:13:14:10 +0000] "GET / HTTP/1.1" 200 231491 "-" "curl/8.10.1"
127.0.0.1 - - [10/Oct/2024:13:14:11 +0000] "GET / HTTP/1.1" 200 231491 "-" "curl/8.10.1"
127.0.0.1 - - [10/Oct/2024:13:14:13 +0000] "GET / HTTP/1.1" 200 231491 "-" "curl/8.10.1"
127.0.0.1 - - [10/Oct/2024:13:14:14 +0000] "GET / HTTP/1.1" 200 231491 "-" "curl/8.10.1"
127.0.0.1 - - [10/Oct/2024:13:14:23 +0000] "GET / HTTP/1.1" 200 231491 "-" "curl/8.10.1"
[2024-10-10 13:24:16 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:65)
[2024-10-10 13:24:17 +0000] [1] [ERROR] Worker (pid:65) was sent SIGKILL! Perhaps out of memory?
[2024-10-10 13:24:17 +0000] [103] [INFO] Booting worker with pid: 103
[2024-10-10 13:24:52 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:67)
[2024-10-10 13:24:53 +0000] [1] [ERROR] Worker (pid:67) was sent SIGKILL! Perhaps out of memory?
[2024-10-10 13:24:53 +0000] [104] [INFO] Booting worker with pid: 104
[2024-10-10 13:25:27 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:101)
[2024-10-10 13:25:28 +0000] [1] [ERROR] Worker (pid:101) was sent SIGKILL! Perhaps out of memory?
[2024-10-10 13:25:28 +0000] [105] [INFO] Booting worker with pid: 105
127.0.0.1 - - [10/Oct/2024:13:25:57 +0000] "GET / HTTP/1.1" 200 231491 "-" "curl/8.10.1"
127.0.0.1 - - [10/Oct/2024:13:29:19 +0000] "GET / HTTP/1.1" 200 231491 "-" "curl/8.10.1"
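
When comparing settings, it can help to pull the timeout events out of a log like the one above. This is a minimal sketch that assumes the log format shown in the excerpt; the function name `find_worker_timeouts` and the `LOG` sample are illustrative only and not part of ara or gunicorn.

```python
import re

# A few lines copied from the log excerpt above, used as sample input.
LOG = """\
[2024-10-10 13:24:16 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:65)
[2024-10-10 13:24:17 +0000] [1] [ERROR] Worker (pid:65) was sent SIGKILL! Perhaps out of memory?
[2024-10-10 13:24:17 +0000] [103] [INFO] Booting worker with pid: 103
[2024-10-10 13:24:52 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:67)
[2024-10-10 13:25:27 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:101)
"""

# Matches only the CRITICAL "WORKER TIMEOUT" lines and captures timestamp and pid.
TIMEOUT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[\d+\] \[CRITICAL\] WORKER TIMEOUT \(pid:(?P<pid>\d+)\)"
)


def find_worker_timeouts(text):
    """Return a list of (timestamp, pid) tuples for each WORKER TIMEOUT line."""
    events = []
    for line in text.splitlines():
        match = TIMEOUT_RE.match(line)
        if match:
            events.append((match.group("ts"), int(match.group("pid"))))
    return events


if __name__ == "__main__":
    for timestamp, pid in find_worker_timeouts(LOG):
        print(timestamp, pid)
```

Counting these events before and after a settings change gives a rough way to tell whether the change helped.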

hille721 commented 2 hours ago

What are your gunicorn settings? I have the following:

gunicorn --workers=2 --threads=4 --worker-class=gthread --worker-tmp-dir /dev/shm --log-file=- --access-logfile=- --bind 0.0.0.0:8000 ara.server.wsgi

With that, ara has been running on Kubernetes for years.
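
For anyone who prefers a config file over command-line flags, the same settings can probably be expressed as a gunicorn configuration file. This is a sketch, not something taken from the ara images: the file name `gunicorn.conf.py` and its location are assumptions, and the values simply mirror the command above.

```python
# gunicorn.conf.py (hypothetical file; values mirror the command line above)

# Listen on all interfaces inside the container.
bind = "0.0.0.0:8000"

# Fewer processes per container, with threads handling concurrency.
workers = 2
threads = 4
worker_class = "gthread"

# Keep the worker heartbeat file on an in-memory filesystem.
worker_tmp_dir = "/dev/shm"

# "-" sends the access log to stdout and the error log to stderr,
# so the container runtime can collect both.
accesslog = "-"
errorlog = "-"
```

It would then be started with something like `gunicorn -c gunicorn.conf.py ara.server.wsgi`.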