Closed petertgiles closed 3 months ago
I think my recommendation would be to increase pm.max_children to 20 and not change request_terminate_timeout. If we want we can probably test these on the fly by SSH'ing into the app service, changing them, then restarting FPM.
Generate a lot of data in Dev with Faker and try it out with the above suggestions. The Assessment Step tracker and the firing of notifications can be tested in Dev for this.
In Docker you can restart FPM with this command: `pkill -o -USR2 php-fpm`
Essentially, this sends the `USR2` signal to the oldest process called php-fpm, which is the master process. If you run `ps ax` before and after, you'll see that the master process PID stays the same while the four workers are new.
I've never tried this in Azure, though.
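For anyone who wants to script this, here is a rough Python sketch of what `pkill -o -USR2 php-fpm` does: find the longest-running process whose name contains `php-fpm` and send it `SIGUSR2`. The helper names `pick_oldest` and `reload_fpm` are made up for illustration, the sample `ps` output is invented, and it assumes a `ps` that supports the `etimes` column (procps does, some minimal images may not).

```python
import os
import signal
import subprocess


def pick_oldest(ps_output: str, name: str):
    """Return the PID of the longest-running process whose command contains `name`.

    Expects the output of `ps -eo pid,etimes,comm` (header line first).
    Returns None if nothing matches.
    """
    oldest_pid, oldest_age = None, -1
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, elapsed, command = parts
        if name in command and int(elapsed) > oldest_age:
            oldest_pid, oldest_age = int(pid), int(elapsed)
    return oldest_pid


def reload_fpm(name: str = "php-fpm") -> int:
    """Send SIGUSR2 to the oldest matching process (assumed to be the FPM master)."""
    ps_output = subprocess.run(
        ["ps", "-eo", "pid,etimes,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    pid = pick_oldest(ps_output, name)
    if pid is None:
        raise RuntimeError(f"no process matching {name!r} found")
    os.kill(pid, signal.SIGUSR2)
    return pid


# Invented `ps` output, for illustration only.
SAMPLE_PS = """\
  PID ELAPSED COMMAND
    7     600 php-fpm
    9     600 nginx
   41      12 php-fpm
   42      12 php-fpm
"""
print(pick_oldest(SAMPLE_PS, "php-fpm"))  # 7
```

`reload_fpm()` is deliberately not called here; it would only make sense inside the container, and like the `pkill` command it is untested on Azure.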
Hmm, someone should probably document this stuff? 😬
To check the FPM status while logged into the server: `curl 127.0.0.1:8080/fpm-status`
The status endpoint is not available from outside the server.
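If we end up checking this often, a small script could read the status page for us. The sketch below assumes the endpoint returns FPM's default plain-text format (one `field: value` pair per line); the function names and the sample values are made up for illustration, only the URL comes from the command above.

```python
from urllib.request import urlopen

# URL taken from the curl command above; only reachable from the server itself.
STATUS_URL = "http://127.0.0.1:8080/fpm-status"


def fetch_status(url: str = STATUS_URL, timeout: float = 5.0) -> str:
    """Download the plain-text status page (not called in this sketch)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_fpm_status(text: str) -> dict:
    """Turn 'field: value' lines into a dict of strings."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            status[key.strip()] = value.strip()
    return status


def workers_exhausted(status: dict) -> bool:
    """True if FPM has hit pm.max_children at least once since it started."""
    return int(status.get("max children reached", "0")) > 0


# Invented sample values, for illustration only.
SAMPLE = """\
pool:                 www
process manager:      dynamic
listen queue:         0
idle processes:       2
active processes:     8
total processes:      10
max children reached: 3
"""

status = parse_fpm_status(SAMPLE)
print(status["active processes"], workers_exhausted(status))
```

A non-zero `max children reached` counter is the thing to watch here, since it means requests had to wait for a free worker.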
So I calculated the average memory per process and got 45 MB.
https://serverfault.com/questions/863238/check-an-average-memory-usage-by-single-php-fpm-process
Total available memory on Dev: 1215 MB
Required FPM workers = 1215 / 45 = 27
Recommended by Peter: 20
I don't mind going up to 20 and increasing it if we need more.
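The arithmetic above can be sketched as a small Python helper so it is easy to redo on another environment. The function name `max_children` and the `reserve_mb` parameter are my own additions, and the 315 MB reserve in the second call is just an illustrative number, not something measured.

```python
def max_children(available_mb: float, avg_worker_mb: float, reserve_mb: float = 0.0) -> int:
    """Estimate how many FPM workers fit into the available memory.

    reserve_mb is a hypothetical amount of memory kept back for everything
    else running on the box (nginx, the OS, and so on).
    """
    if avg_worker_mb <= 0:
        raise ValueError("avg_worker_mb must be positive")
    usable_mb = max(0.0, available_mb - reserve_mb)
    return int(usable_mb // avg_worker_mb)


# Numbers from the comment above: 1215 MB available, 45 MB per worker.
print(max_children(1215, 45))                  # 27
# Illustrative only: holding back 315 MB leaves room for 20 workers.
print(max_children(1215, 45, reserve_mb=315))  # 20
```

This treats 27 as an upper bound that uses all the memory, so anything below it leaves some headroom.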
### pm.max_children = 10
This is ApacheBench, Version 2.3 <$Revision: 1903618 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests

Server Software:        nginx/1.24.0
Server Hostname:        localhost
Server Port:            8000

Document Path:          /
Document Length:        4989 bytes

Concurrency Level:      20
Time taken for tests:   6.503 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      52490000 bytes
HTML transferred:       49890000 bytes
Requests per second:    1537.83 [#/sec] (mean)
Time per request:       13.005 [ms] (mean)
Time per request:       0.650 [ms] (mean, across all concurrent requests)
Transfer rate:          7882.87 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.2      0      10
Processing:     3   13   7.3     11     106
Waiting:        3   13   7.3     11     106
Total:          3   13   7.4     11     107

Percentage of the requests served within a certain time (ms)
  50%     11
  66%     14
  75%     16
  80%     17
  90%     21
  95%     26
  98%     32
  99%     37
 100%    107 (longest request)
### pm.max_children = 20
Server Software:        nginx/1.24.0
Server Hostname:        localhost
Server Port:            8000

Document Path:          /
Document Length:        4989 bytes

Concurrency Level:      20
Time taken for tests:   7.635 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      52490000 bytes
HTML transferred:       49890000 bytes
Requests per second:    1309.70 [#/sec] (mean)
Time per request:       15.271 [ms] (mean)
Time per request:       0.764 [ms] (mean, across all concurrent requests)
Transfer rate:          6713.48 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.4      0      25
Processing:     3   15  45.5     11    1031
Waiting:        3   15  45.5     11    1031
Total:          3   15  45.5     12    1031

Percentage of the requests served within a certain time (ms)
  50%     12
  66%     14
  75%     16
  80%     18
  90%     22
  95%     26
  98%     31
  99%     35
 100%   1031 (longest request)
### Results Comparison

With pm.max_children = 10
Time taken for tests: 6.503 seconds
Requests per second: 1537.83 [#/sec] (mean)
Time per request: 13.005 [ms] (mean)
Transfer rate: 7882.87 [Kbytes/sec]
Percentage of the requests served within a certain time (ms)
50%: 11 ms
90%: 21 ms
99%: 37 ms

With pm.max_children = 20
Time taken for tests: 7.635 seconds
Requests per second: 1309.70 [#/sec] (mean)
Time per request: 15.271 [ms] (mean)
Transfer rate: 6713.48 [Kbytes/sec]
The result isn't very magical when just increasing the FPM workers from 10 to 20: throughput actually dropped from 1537.83 to 1309.70 requests per second. Let me play around with the other settings as well.
pm.max_children = 17
pm.start_servers = 3
pm.min_spare_servers = 2
pm.max_spare_servers = 4
Server Software:        nginx/1.24.0
Server Hostname:        localhost
Server Port:            8000

Document Path:          /
Document Length:        4989 bytes

Concurrency Level:      20
Time taken for tests:   7.600 seconds
Complete requests:      10000
Failed requests:        0
Total transferred:      52490000 bytes
HTML transferred:       49890000 bytes
Requests per second:    1315.74 [#/sec] (mean)
Time per request:       15.201 [ms] (mean)
Time per request:       0.760 [ms] (mean, across all concurrent requests)
Transfer rate:          6744.44 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.4      0      22
Processing:     3   15   8.5     13      92
Waiting:        3   15   8.5     13      91
Total:          3   15   8.5     13      92

Percentage of the requests served within a certain time (ms)
  50%     13
  66%     16
  75%     19
  80%     21
  90%     26
  95%     31
  98%     39
  99%     45
 100%     92 (longest request)
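To put the three runs side by side, here is a small Python sketch that computes the throughput change of each run relative to the `pm.max_children = 10` baseline. The numbers are copied from the ApacheBench output above; the dictionary layout and the `pct_change` helper are just one way to organise them.

```python
# Figures copied from the three ApacheBench runs above.
RUNS = {
    "max_children=10": {"rps": 1537.83, "p50_ms": 11, "p99_ms": 37, "max_ms": 107},
    "max_children=20": {"rps": 1309.70, "p50_ms": 12, "p99_ms": 35, "max_ms": 1031},
    "max_children=17, tuned spares": {"rps": 1315.74, "p50_ms": 13, "p99_ms": 45, "max_ms": 92},
}


def pct_change(baseline: float, value: float) -> float:
    """Relative change from baseline to value, in percent."""
    return (value - baseline) / baseline * 100.0


baseline_rps = RUNS["max_children=10"]["rps"]
for name, run in RUNS.items():
    change = pct_change(baseline_rps, run["rps"])
    print(
        f"{name:32s} {run['rps']:8.2f} req/s ({change:+.1f}%)  "
        f"p50={run['p50_ms']} ms  p99={run['p99_ms']} ms  max={run['max_ms']} ms"
    )
```

Both of the larger configurations come out roughly 14 to 15 percent below the baseline in requests per second on this localhost test, which only hits `/` at a concurrency of 20.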
♻️ Debt/Refactor
We have an FPM configuration in our deployment that specifies the number of FPM workers, the worker lifetime, and the watchdog timer cutoffs. Lately we've been running out of FPM workers pretty regularly, causing the site to become unresponsive. We're working to improve the efficiency of our site, but let's boost the work capacity in the meantime.
🙋♀️ Proposed Solution
Increase pm.max_children from 10 to X
Increase request_terminate_timeout from 60s to Y
✅ Acceptance Criteria