heroku / heroku-buildpack-python

Heroku's buildpack for Python applications.
https://www.heroku.com/python
MIT License

Improve `WEB_CONCURRENCY` support #1547

Closed edmorley closed 5 months ago

edmorley commented 6 months ago

The Python buildpack, like several of the other language buildpacks, has for some time automatically set the WEB_CONCURRENCY environment variable at dyno boot (if it's not already set), based on the size of the Heroku dyno. Some Python web servers (such as Gunicorn and Uvicorn) then use this env var to control the default number of server processes that they launch.

When the original Python buildpack implementation for this was written many years ago, there was no way to distinguish the memory available to the dyno from the host's memory (since /sys/fs/cgroup/memory/memory.limit_in_bytes did not exist). As such, the existing implementation relied upon a hardcoded mapping of known process limit values to dyno sizes: https://devcenter.heroku.com/articles/limits#processes-threads

This mapping was fragile: if the process limits ever changed in the future, or if new dyno types were added, then WEB_CONCURRENCY would either not be set at all or be set to an incorrect value.

In addition, the existing choice of concurrency values for Performance dynos (and their Private Space equivalents) was suboptimal, since on Performance-L dynos concurrency defaulted to 11, which is only 1.4 times the Performance-M's default concurrency of 8, even though the former has 4 times the number of CPU cores and 5.6 times the RAM.

As such, the buildpack now dynamically calculates the value for WEB_CONCURRENCY based on the dyno's actual specifications, by setting it to the lower of <dyno available RAM in MB> / 256 and <number of dyno CPU cores> * 2 + 1. The former ensures each web server process has at least 256 MB RAM available to reduce the chance of OOM, and the latter is based upon benchmarking and the Gunicorn worker guidance here: https://docs.gunicorn.org/en/latest/design.html#how-many-workers
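
The calculation described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the buildpack's actual .profile.d/ script: the function name is hypothetical, and rounding the memory-based value down (floor division) is an assumption.

```python
def default_web_concurrency(available_memory_mb: int, cpu_cores: int) -> int:
    """Return the lower of the memory-based and CPU-based process counts."""
    # Leave at least 256 MB of RAM per web server process.
    # Rounding down is an assumption made for this sketch.
    memory_based = available_memory_mb // 256
    # Gunicorn's "(2 x number of cores) + 1" worker guidance.
    cpu_based = cpu_cores * 2 + 1
    return min(memory_based, cpu_based)
```

For a dyno with 14336 MB of available memory and 8 CPU cores, the memory-based value is 56 and the CPU-based value is 17, so the sketch returns 17.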

This new implementation results in the following default concurrency values for each Heroku dyno size:

To increase awareness of the change in defaults, and to make the buildpack's existing automatic configuration of WEB_CONCURRENCY less of a black box, the .profile.d/ script now also prints memory/CPU/concurrency information to the app's logs (for web dynos only, to avoid breaking scripting use-cases).

For example:

app[web.1]: Python buildpack: Detected 14336 MB available memory and 8 CPU cores.
app[web.1]: Python buildpack: Defaulting WEB_CONCURRENCY to 17 based on the number of CPU cores.
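
As a rough illustration of the "web dynos only" behaviour, the sketch below builds the two messages shown above. It assumes the dyno type is detected from Heroku's DYNO environment variable (for example "web.1"); the function name and its parameters are hypothetical, and the real script may decide differently. The "app[web.1]:" prefix is added by the logging system, so it is not part of the generated text.

```python
from typing import List, Mapping


def boot_log_lines(
    environ: Mapping[str, str],
    memory_mb: int,
    cpu_cores: int,
    concurrency: int,
    reason: str,
) -> List[str]:
    """Return the log lines to print at boot, or an empty list for non-web dynos."""
    # Assumption: web dynos are identified by a DYNO value starting with "web.".
    if not environ.get("DYNO", "").startswith("web."):
        # Stay silent elsewhere so that scripted one-off commands get clean output.
        return []
    prefix = "Python buildpack:"
    return [
        f"{prefix} Detected {memory_mb} MB available memory and {cpu_cores} CPU cores.",
        f"{prefix} Defaulting WEB_CONCURRENCY to {concurrency} based on {reason}.",
    ]
```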

If your app is relying on the buildpack-set WEB_CONCURRENCY value, and you do not wish to use the new default concurrency values, then you can switch back to the previous values (or whatever value performs best in benchmarks of your app), by either:

Lastly, integration tests have been added for the buildpack's .profile.d/ scripts, since there were none before.

See:
https://devcenter.heroku.com/articles/config-vars
https://devcenter.heroku.com/articles/python-gunicorn
https://docs.gunicorn.org/en/latest/settings.html#workers
https://www.uvicorn.org/#command-line-options
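
One possible way to pin the worker count for a Gunicorn app is a gunicorn.conf.py along the lines of the sketch below. Gunicorn uses WEB_CONCURRENCY only as the default for its workers setting, so a value assigned in the config file takes precedence. The APP_WORKERS variable and the resolve_workers helper are hypothetical names invented for this example; they are not part of the buildpack or of Gunicorn.

```python
# gunicorn.conf.py (illustrative sketch)
import os
from typing import Mapping


def resolve_workers(environ: Mapping[str, str]) -> int:
    """Pick the worker count: app-specific override first, then WEB_CONCURRENCY."""
    # APP_WORKERS is a hypothetical, app-defined variable used to pin a value
    # regardless of what the buildpack chose for WEB_CONCURRENCY.
    override = environ.get("APP_WORKERS")
    if override:
        return int(override)
    # Fall back to WEB_CONCURRENCY, or to 1 (Gunicorn's own default) if unset.
    return int(environ.get("WEB_CONCURRENCY", "1"))


# Gunicorn reads this module-level name as its "workers" setting.
workers = resolve_workers(os.environ)
```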

GUS-W-14623334. GUS-W-15109094. GUS-W-15109115. GUS-W-15131932.

edmorley commented 6 months ago

Note: This change will be a no-op for most non-Heroku usages of the buildpack (such as Dokku). In those environments WEB_CONCURRENCY was not being set before (since they lack the custom process limits used by the hardcoded mapping), and it still won't be set now: the buildpack skips configuring concurrency unless the /sys/fs/cgroup/memory/memory.limit_in_bytes file exists and contains a non-bogus value, which is only the case on Heroku, or when using cgroups v1 via older Docker with an explicit --memory limit set.
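
A minimal Python sketch of that kind of check is shown below; it is not the buildpack's actual code. The function name is hypothetical, and the 1,000,000 MB cut-off for a "bogus" value is an assumption chosen for illustration. Under cgroups v1, a container with no memory limit reports a very large number (close to 2^63 bytes), which any such cut-off would reject.

```python
from pathlib import Path
from typing import Optional

CGROUP_V1_MEMORY_LIMIT = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")
# Hypothetical sanity threshold: anything above this is treated as "no real limit".
MAX_PLAUSIBLE_MB = 1_000_000


def read_memory_limit_mb(path: Path = CGROUP_V1_MEMORY_LIMIT) -> Optional[int]:
    """Return the cgroups v1 memory limit in MB, or None if absent or implausible."""
    try:
        raw = path.read_text().strip()
    except OSError:
        # File missing or unreadable (for example on cgroups v2 hosts).
        return None
    if not raw.isdigit():
        return None
    limit_mb = int(raw) // (1024 * 1024)
    if limit_mb <= 0 or limit_mb > MAX_PLAUSIBLE_MB:
        return None
    return limit_mb
```

When the function returns None, a caller would leave WEB_CONCURRENCY untouched, matching the no-op behaviour described above.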

In the future we may choose to add support for cgroups v2's memory file too. However, enabling automatic WEB_CONCURRENCY in those non-Heroku environments would be a bigger breaking change, so it has been deferred for now.

edmorley commented 5 months ago

Updated the defaults listed at: https://devcenter.heroku.com/articles/python-concurrency#default-settings-and-behavior

Published changelog entry: https://devcenter.heroku.com/changelog-items/2846