bakins / php-fpm-exporter

Prometheus exporter for php-fpm status.
MIT License

Cannot get metrics from fully saturated php-fpm server #28

Open PaulFarver opened 5 years ago

PaulFarver commented 5 years ago

The history

Ideally, we would scale our php-fpm layer based on the ratio of active to idle fpm child processes. We run relatively small php containers to make better use of our server resources: each container has 2Gi of memory allocated, and the process manager is set to static with 20 child processes for better performance and more predictable memory usage. This problem may be more prevalent for us because our pods are so small, but in theory it affects deployments of any size.
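To make the scaling signal concrete, here is a minimal Go sketch of how the active-to-total ratio could be derived from the JSON form of the fpm status page. The field names follow the keys php-fpm emits in its JSON status output; the `poolStatus` type, the `parseStatus` and `utilization` helpers, and the sample payload are made up for illustration and are not taken from the exporter.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// poolStatus holds the subset of the php-fpm JSON status fields needed
// for a scaling decision. The type itself is a hypothetical helper.
type poolStatus struct {
	Pool            string `json:"pool"`
	ProcessManager  string `json:"process manager"`
	ListenQueue     int    `json:"listen queue"`
	IdleProcesses   int    `json:"idle processes"`
	ActiveProcesses int    `json:"active processes"`
	TotalProcesses  int    `json:"total processes"`
}

// parseStatus decodes a JSON status document into a poolStatus.
func parseStatus(data []byte) (poolStatus, error) {
	var s poolStatus
	err := json.Unmarshal(data, &s)
	return s, err
}

// utilization returns the fraction of child processes that are active,
// or 0 when the pool reports no processes at all.
func (s poolStatus) utilization() float64 {
	if s.TotalProcesses == 0 {
		return 0
	}
	return float64(s.ActiveProcesses) / float64(s.TotalProcesses)
}

func main() {
	// Made-up sample resembling a static pool with 20 children.
	sample := []byte(`{"pool":"www","process manager":"static",` +
		`"listen queue":0,"idle processes":5,` +
		`"active processes":15,"total processes":20}`)

	s, err := parseStatus(sample)
	if err != nil {
		fmt.Println("could not parse status:", err)
		return
	}
	fmt.Printf("pool %s: %.0f%% of workers busy\n", s.Pool, s.utilization()*100)
}
```

A ratio like this is only available while the status page can still be served, so it says nothing about the case where every child is busy.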

The problem

Unfortunately, rendering the php-fpm status page requires a child process, which means that a php container whose children are all active cannot be scraped. Since the exporter relies on the fpm status page for its metrics, it returns no metrics when it gets no response from php-fpm.

It is possible to run with a dynamic or ondemand process manager, but specifying pm.max_children would result in the same problem once that limit is reached, and not specifying pm.max_children would run the risk of getting your php container OOM-killed.

This effectively renders the exporter unusable for us, since we can't rely on the metrics, especially during high load.

The solution

We accept that we may not be aware of every way to configure the php-fpm process manager, and we welcome any ideas for getting around this issue. That said, we think the only way to eliminate the problem is to gather metrics more reliably, either by waiting for a child process to become available or by somehow gathering the metrics without having an fpm child render them.

bakins commented 5 years ago

I'm unsure how to get metrics from php-fpm except by using the status page. If someone does know a way, it could be added to this project. In the meantime, we could retry the status page a few times with a timeout.