renatomefi / php-fpm-healthcheck

A POSIX-compliant sh script to health-check PHP-FPM status; it can be used just for pinging or to check specific metrics
MIT License

Metric check for listen queue len seems wrong #23

Closed (snc closed this issue 5 years ago)

snc commented 5 years ago

Based on https://serverfault.com/a/355546, the listen queue len value decreases as connections come in, so initially the value might be, for example, 128. When I run the check via php-fpm-healthcheck --listen-queue-len=10, it fails with: 'listen queue len' value '128' is greater than expected '10'. In this case the check should fail only if the value of listen queue len is lower than 10.

renatomefi commented 5 years ago

Hey @snc, thanks for reporting. After reading the answer you pointed to and trying to remember how I coded it, what I understand of the metric is:

I might have interpreted it wrongly, but would that make sense? Otherwise, would you have a suggestion on how we can tweak this metric?

For now you may want to avoid using this option, since its meaning is not clear. If we reach a conclusion, we can document it better!

Thanks

achton commented 5 years ago

It seems to me that listen queue len is a static number? Like an upper limit for the listen queue value (which is the real-time status of the connection queue). In which case checking anything against that value doesn't make sense.

Which means that the example in the README ("check if you have more than 10 processes in the queue") would make more sense if it were based on the listen queue or max listen queue fields. E.g.:

And you can also check if you have more than 10 processes in the queue:

$ php-fpm-healthcheck --accepted-conn=3000 --max-listen-queue=10
$ echo $?
0

Admittedly, I haven't dug into the php-fpm docs for the meanings of these values, but that is my reading of it.
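To make that reading concrete, here is a minimal sketch in POSIX sh, assuming listen queue len really is the fixed size of the socket backlog while listen queue and max listen queue are the live and peak queue depths. The status text is a made-up sample (the values are illustrative, not taken from a real pool), and get_metric and check_max are hypothetical helpers for this example, not functions from php-fpm-healthcheck itself.

```shell
#!/bin/sh
# Illustrative status page output; all values are invented for this sketch.
STATUS_SAMPLE='pool:                 www
process manager:      dynamic
accepted conn:        3000
listen queue:         0
max listen queue:     2
listen queue len:     128
idle processes:       4
active processes:     1'

# get_metric NAME: print the value of the status field NAME (hypothetical helper).
# The pattern is anchored at line start and includes the colon, so
# "listen queue" does not match "max listen queue" or "listen queue len".
get_metric() {
    printf '%s\n' "$STATUS_SAMPLE" | sed -n "s/^$1:[[:space:]]*//p"
}

# check_max NAME LIMIT: succeed unless the field value is greater than LIMIT.
check_max() {
    value=$(get_metric "$1")
    if [ "$value" -gt "$2" ]; then
        printf "'%s' value '%s' is greater than expected '%s'\n" "$1" "$value" "$2"
        return 1
    fi
    return 0
}

# Fails on an idle pool: 128 is the configured backlog size, not a live count.
check_max "listen queue len" 10 || true

# These compare actual queue depth against the limit, which is what the
# README example seems to intend.
check_max "listen queue" 10 && echo "listen queue OK"
check_max "max listen queue" 10 && echo "max listen queue OK"
```

Under that assumption, a "greater than" check against listen queue len fails even when nothing is waiting, while the same check against listen queue or max listen queue only fails when requests are actually queuing up.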

renatomefi commented 5 years ago

It's very difficult to find information about it, and the code doesn't have much documentation either. One thing I found says it's the number of pending connections; see this article: https://easyengine.io/tutorials/php/fpm-status-page/

So, as @achton said, we could use the listen-queue metric in the documentation instead, since listen-queue-len is probably useful only in very specific use cases!

If you find more material about it please let me know!

Also, I found this very interesting blog post on how to monitor PHP-FPM: https://www.malasuk.com/linux/php-fpm-status-page/

renatomefi commented 5 years ago

Hello everyone, have you seen the changes from #25?

It addresses this issue, and includes a nice update to the docs and an explanation in the PR!

Thanks a lot @tjespers!