php / php-src

The PHP Interpreter
https://www.php.net

php-fpm ping or status uses too much CPU #12600

Open wirwolf opened 10 months ago

wirwolf commented 10 months ago

Description

Hello, community. I am running php-fpm 8.2 in a Docker container on a Kubernetes cluster. After observing it for some time, I see very strange behavior from my horizontal pod autoscaler: the application is always scaled to the maximum pod count, although there is no load or traffic at all. After investigating the FPM logs, I see that the status endpoint sometimes uses too much CPU.

- - 03/Nov/2023:06:52:50 +0000 "GET /status" 200 - 0.148 2048 6756.76%
- - 03/Nov/2023:06:52:49 +0000 "GET /status" 200 - 0.323 2048 6191.95%
- - 03/Nov/2023:06:52:45 +0000 "GET /status" 200 - 0.161 2048 6211.18%
- - 03/Nov/2023:06:52:45 +0000 "GET /status" 200 - 0.061 2048 16393.44%
- - 03/Nov/2023:06:52:34 +0000 "GET /status" 200 - 0.325 2048 3076.92%
- - 03/Nov/2023:06:52:18 +0000 "GET /status" 200 - 0.201 2048 4975.12%
- - 03/Nov/2023:06:52:16 +0000 "GET /status" 200 - 4.299 2048 232.61%
- - 03/Nov/2023:06:52:07 +0000 "GET /status" 200 - 0.116 2048 8620.69%
- - 03/Nov/2023:06:51:57 +0000 "GET /status" 200 - 0.178 2048 5617.98%
- - 03/Nov/2023:06:51:50 +0000 "GET /status" 200 - 0.242 2048 4132.23%
- - 03/Nov/2023:06:51:37 +0000 "GET /status" 200 - 0.170 2048 5882.35%
- - 03/Nov/2023:06:51:30 +0000 "GET /status" 200 - 0.277 2048 3610.11%
- - 03/Nov/2023:06:51:29 +0000 "GET /status" 200 - 0.156 2048 6410.26%
- - 03/Nov/2023:06:51:25 +0000 "GET /status" 200 - 0.131 2048 7633.59%
- - 03/Nov/2023:06:51:01 +0000 "GET /status" 200 - 0.154 2048 6493.51%
- - 03/Nov/2023:06:51:00 +0000 "GET /status" 200 - 0.223 2048 4484.30%
- - 03/Nov/2023:06:50:55 +0000 "GET /status" 200 - 0.216 2048 4629.63%
- - 03/Nov/2023:06:50:53 +0000 "GET /status" 200 - 0.192 2048 5208.33%
- - 03/Nov/2023:06:50:39 +0000 "GET /status" 200 - 0.258 2048 3875.97%
- - 03/Nov/2023:06:50:33 +0000 "GET /status" 200 - 0.138 2048 7246.38%
- - 03/Nov/2023:06:50:27 +0000 "GET /status" 200 - 0.161 2048 6211.18%
- - 03/Nov/2023:06:50:18 +0000 "GET /status" 200 - 0.179 2048 11173.18%

This is why the autoscaler detects an unusual load all the time. I also saw this problem mentioned in the comments on this bug: https://bugs.php.net/bug.php?id=80428 Please fix this strange phantom error.

Best regards

PHP Version

8.2

Operating System

alpine 3.12

bukka commented 10 months ago

IIRC the CPU usage number in the access log is relative to the overall usage, so if you have almost no traffic it might well be high, but I can't really say without more info.

To investigate further, I will need a few more details about your setup, mainly the following:

Ideally, you could provide a Helm chart or similar with everything in it so that I can easily reproduce it; that would be perfect.
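One way to sanity-check the percentages in the access log from the report is to multiply each request's duration by its CPU percentage, which gives the CPU time implied for that request. The sketch below uses a few (duration, CPU%) pairs copied from those log lines. The column meanings are an assumption, since the `access.format` string is not shown in the thread, and `implied_cpu_time` is a hypothetical helper name.

```python
# (duration, cpu_percent) pairs copied from the access log lines in the report.
# Assumption: the column before "2048" is the request duration and the last
# column is the CPU percentage; the access.format string is not shown.
samples = [
    (0.148, 6756.76),
    (0.323, 6191.95),
    (0.061, 16393.44),
    (4.299, 232.61),
    (0.179, 11173.18),
]


def implied_cpu_time(duration, cpu_percent):
    """Return the CPU time implied by a percentage, in the unit of `duration`."""
    return duration * cpu_percent / 100.0


# Round to two decimals to absorb the rounding already present in the log.
implied = [round(implied_cpu_time(d, p), 2) for d, p in samples]

for (d, p), t in zip(samples, implied):
    print(f"duration={d:<6} cpu={p:>9}% implied_cpu_time={t}")
```

Every sample works out to roughly 10 or 20 units of CPU time. That would be consistent with CPU time being accounted in coarse fixed steps and then divided by a very short request duration, which inflates the percentage. This is one possible reading of the numbers, not something confirmed in the thread.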

wirwolf commented 10 months ago

Hello @bukka. I created a test project with an example Docker image and k8s resources: https://github.com/wirwolf/php-bug-12600 How to use:

bukka commented 10 months ago

From a quick look, I think the 60m limit on CPU might just be too low for FPM.

Considering that it has always been like this, it's not exactly a bug. We should look into optimizing it, so I will add it to my TODO list to look at later, but I will change this to a feature request.
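For a rough sense of how small a 60m limit is: Kubernetes CPU limits are commonly enforced through the Linux CFS quota, which grants a container a fixed slice of CPU time per scheduling period. The sketch below assumes the usual default period of 100 ms; the cluster in this issue may be configured differently, and `cfs_quota_us` is a hypothetical helper name used only for illustration.

```python
# Assumed default CFS scheduling period (100 ms), expressed in microseconds.
CFS_PERIOD_US = 100_000


def cfs_quota_us(limit_millicores, period_us=CFS_PERIOD_US):
    """CPU time (in microseconds) a container may use per period for a limit."""
    return limit_millicores * period_us // 1000


quota = cfs_quota_us(60)  # the 60m limit mentioned above
print(f"{quota} us of CPU time per {CFS_PERIOD_US} us period")
```

Under that assumption, all processes in the container share about 6 ms of CPU time per 100 ms window and are throttled once it is used up.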

bukka commented 10 months ago

I actually already have a note about integrating a proper health check similar to php-fpm-healthcheck (which I see you also use in your setup). I think that would be the optimal solution, because using status is not really ideal.

wirwolf commented 10 months ago

Also, I use the status endpoint to grab metrics, which are exported to the cluster for the HPA (for example, to scale when FPM reaches max children). What solution do you suggest in my case?
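To make the metrics use case concrete, here is a minimal sketch of turning a JSON status response into values an autoscaler could consume. The payload is illustrative and its numbers are made up, not taken from this issue. The field names follow the FPM status page JSON output as I understand it and should be checked against your version. `pool_metrics` and `max_children` (the pool's `pm.max_children` value) are hypothetical names.

```python
import json

# Illustrative payload only; the numbers are made up, not taken from the issue.
SAMPLE_STATUS = """
{
  "pool": "www",
  "process manager": "dynamic",
  "listen queue": 0,
  "idle processes": 3,
  "active processes": 1,
  "total processes": 4,
  "max children reached": 0
}
"""


def pool_metrics(status_json, max_children):
    """Derive a few scaling signals from an FPM status JSON document."""
    status = json.loads(status_json)
    return {
        # Share of the configured worker capacity that is currently busy.
        "active_ratio": status["active processes"] / max_children,
        # Requests waiting for a free worker.
        "listen_queue": status["listen queue"],
        # How many times the pool has hit its process limit.
        "max_children_reached": status["max children reached"],
    }


metrics = pool_metrics(SAMPLE_STATUS, max_children=10)
print(metrics)
```

Signals like these describe worker saturation directly, so they do not depend on the CPU figure in the access log.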