Open Evgeniy-Bondarenko opened 4 years ago
Can you reproduce the issue on Unit 1.20.0? 1.15 and 1.17 are obsolete versions.
On version 1.20 (same config) I get a stable 503 (as in the screenshot).
I changed one line in the config
from
"spare": 20,
to
"spare": 1,
and it worked!
Now I get a stable HTTP 200 response.
The only change was spare,
from 20 -> 1.
Tech:
Server: 24 cores, 258 GB RAM, and an SSD disk (currently idle)
Running in Docker 18.06 and OpenShift 3 with HAProxy 1.6
Built from source (GitHub tags 1.15\1.17\1.20) on a Debian base image (FROM debian)
images
Could you try with debug log enabled? If you have installed Unit from our repositories, then you already have a debug build in the system and you just need to run it instead of a regular one: https://unit.nginx.org/troubleshooting/#installing-from-our-repos
I guess the problem is caused by some limits in your system (probably the number of descriptors, or similar) that prevent running multiple processes. Of course, there must be relevant errors in the log, but maybe we are missing something.
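One quick way to rule out the limits guessed at above is to print them from inside the container. This is only a minimal sketch, assuming a Linux/Unix system with Python available; the helper name `show_limits` is made up for illustration, and which limit (if any) matters here is an assumption, not something the log has confirmed:

```python
import resource


def show_limits():
    """Return (soft, hard) limits for open files and processes."""
    limits = {}
    for name in ("RLIMIT_NOFILE", "RLIMIT_NPROC"):
        # RLIMIT_NPROC is not defined on every platform, so look it up defensively.
        res = getattr(resource, name, None)
        if res is None:
            continue
        limits[name] = resource.getrlimit(res)
    return limits


if __name__ == "__main__":
    for name, (soft, hard) in show_limits().items():
        # resource.RLIM_INFINITY means the limit is unlimited.
        print(f"{name}: soft={soft} hard={hard}")
```

Run it as the same user and in the same container as the Unit processes; limits seen from the host may differ.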
I've tried on my system with your processes settings and everything works fine, no 503 errors. But I use this simple script:
% cat phpinfo.php
<?php phpinfo(); ?>
Could you also try with such a simple script? If your app connects to a database, then the problem may also be related to the database connection limits.
phpinfo output: https://gist.github.com/Evgeniy-Bondarenko/d79ea8af86b3687586bc3947b8850580
Successful connections take ~10 sec and return HTTP status 200
- spare 1 (max 100)
Failed connections take ~2-4 sec and return HTTP status 503
- spare 20 (max 100)
MySQL has a limit of 3000 max connections, but the connections are not being used.
About the Unit debug log: I have no way to enable debug right now (but I'll add a Unit debug log later).
Tech: same PHP code, same PHP version, same Unit version, same Unit config (only the spare line differs); it reproduces on different Unit versions (1.15\1.17\1.20).
Unfortunately, this issue seems specific to the application. I can't reproduce it with any example php scripts. It just works fine with spare 20 and max 100 under testing load producing ~134000 rps without errors.
Only debug log in this case will help to identify the root cause of 503 errors.
Debug log link
The log by the link doesn't contain debug lines from the application process (pid 118 in this case). This is either because it's incomplete, or because the module was built without debug enabled.
Added the full log (5k lines)
So, the root cause of 503 is that the application process was killed:
2020/11/03 18:08:41.453 [alert] 1#1 process 128 exited on signal 9
The processing of this request started even before the full log begins; as a result, there are no other lines relevant to this process or this request.
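For a long log, lines like the one above can be picked out mechanically. A rough sketch, assuming the log format matches the single `[alert]` line quoted here; the regular expression and the function name `find_killed_processes` are illustrative, not part of Unit:

```python
import re

# Matches lines such as:
# 2020/11/03 18:08:41.453 [alert] 1#1 process 128 exited on signal 9
EXIT_RE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"\[(?P<level>\w+)\] \S+ process (?P<pid>\d+) exited on signal (?P<sig>\d+)"
)


def find_killed_processes(log_text):
    """Return (timestamp, pid, signal) for every 'exited on signal' line."""
    hits = []
    for line in log_text.splitlines():
        m = EXIT_RE.match(line.strip())
        if m:
            hits.append((m.group("ts"), int(m.group("pid")), int(m.group("sig"))))
    return hits


sample = "2020/11/03 18:08:41.453 [alert] 1#1 process 128 exited on signal 9"
print(find_killed_processes(sample))
```

Signal 9 (SIGKILL) cannot be caught by the process itself, which is why the application leaves no log lines of its own when this happens.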
It seems that the PHP interpreter was executed, worked for quite a long time, and then was killed. Could you check the dmesg output as well? This can be caused by the OOM killer due to an out-of-memory condition.
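If the dmesg output is large, the OOM killer's victims can be filtered out of it. A minimal sketch, assuming the kernel reports them with a "Killed process PID (name)" message; the exact wording varies between kernel versions, and the sample line below is made up for illustration, not taken from the reporter's system:

```python
import re

# The kernel's OOM report usually contains "Killed process <pid> (<name>)".
KILLED_RE = re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)")


def oom_victims(dmesg_text):
    """Return (pid, process name) for each OOM-kill line found."""
    victims = []
    for line in dmesg_text.splitlines():
        m = KILLED_RE.search(line)
        if m:
            victims.append((int(m.group("pid")), m.group("name")))
    return victims


# Illustrative input only.
sample = (
    "[12345.678901] Memory cgroup out of memory: Killed process 128 (unitd) "
    "total-vm:1024000kB, anon-rss:512000kB, file-rss:0kB"
)
print(oom_victims(sample))
```

A pid here that matches the pid in Unit's "exited on signal 9" line would support the OOM explanation.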
@VBart we have the same issue. You're right. This is actually caused by an OOM kill. The load on Nginx unit processes exceeds the container limit and the request fails with error 503.
PHP 7.4.13 (cli) (built: Dec 21 2020 12:51:42) ( NTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
with Zend OPcache v7.4.13, Copyright (c), by Zend Technologies
unit version: 1.21.0
configured as ./configure --prefix=/usr/local/nginx-unit --openssl
{
    "settings": {
        "http": {
            "max_body_size": 20971520
        }
    },
    "listeners": {
        "*:80": {
            "pass": "routes/symfony"
        }
    },
    "routes": {
        "symfony": [
            {
                "match": {
                    "uri": [
                        "/media/cache/resolve/*"
                    ]
                },
                "action": {
                    "pass": "applications/symfony/index"
                }
            },
            {
                "match": {
                    "uri": [
                        "*.php",
                        "*.php/*"
                    ]
                },
                "action": {
                    "pass": "applications/symfony/direct"
                }
            },
            {
                "action": {
                    "share": "/app/public/",
                    "fallback": {
                        "pass": "applications/symfony/index"
                    }
                }
            }
        ]
    },
    "applications": {
        "symfony": {
            "type": "php",
            "user": "www-data",
            "group": "www-data",
            "processes": {
                "max": 15,
                "spare": 5,
                "idle_timeout": 180
            },
            "targets": {
                "direct": {
                    "root": "/app/public/"
                },
                "index": {
                    "root": "/app/public/",
                    "script": "index.php"
                }
            }
        }
    }
}
This is actually caused by an OOM kill. The load on Nginx unit processes exceeds the container limit and the request fails with error 503.
What do you expect from the Unit side? In this case you should either increase the memory limit, reduce the memory consumption of the PHP interpreter, or lower the number of PHP processes. Note this setting, which may also be relevant: https://www.php.net/manual/en/ini.core.php#ini.memory-limit
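To see which memory limit the container actually runs under, the cgroup files can be read from inside it. A sketch only, assuming a Linux container with the cgroup filesystem mounted at the usual location; the two paths are the common cgroup v2 and v1 locations and may differ on a given setup, and the function names are illustrative:

```python
from pathlib import Path

# Common locations; the mount point may differ in a given container.
CGROUP_FILES = (
    Path("/sys/fs/cgroup/memory.max"),                    # cgroup v2
    Path("/sys/fs/cgroup/memory/memory.limit_in_bytes"),  # cgroup v1
)


def parse_limit(raw):
    """Convert a cgroup limit string to bytes; 'max' means no limit."""
    raw = raw.strip()
    if raw == "max":
        return None
    return int(raw)


def container_memory_limit():
    """Return the memory limit in bytes, or None if unlimited or unknown."""
    for path in CGROUP_FILES:
        try:
            return parse_limit(path.read_text())
        except (OSError, ValueError):
            continue
    return None


if __name__ == "__main__":
    print(container_memory_limit())
```

On cgroup v1 an unlimited container reports a very large number instead of `max`, so a huge value should be read as "no limit".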
@VBart We set the processes to a "very large" value that exceeded the container limits. Therefore signal 9 in unitd log and error 503 in browser.
I have a memory limit set on the pod/container: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory
If the PHP memory limit in php.ini * processes.max in the Unit config exceeds the Kubernetes container limit -> error 503
Need:
"applications": {
"processes": {
"max": 15 <--
container_limits >= unit_process * php_memory_limits
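The sizing rule above can be turned into a small calculation. This is a rough sketch under stated assumptions: the function name `max_processes` and the numbers are hypothetical, and PHP's memory_limit only caps memory allocated through PHP per request, so real per-process usage (extensions, OPcache, the Unit processes themselves) can be higher and is only approximated by `overhead_bytes`:

```python
MiB = 1024 * 1024


def max_processes(container_limit_bytes, php_memory_limit_bytes, overhead_bytes=0):
    """Largest processes.max that keeps worst-case PHP memory under the limit."""
    if php_memory_limit_bytes <= 0:
        # memory_limit = -1 means unlimited in PHP; no safe bound exists then.
        raise ValueError("php memory_limit must be a positive number of bytes")
    usable = container_limit_bytes - overhead_bytes
    return max(usable // php_memory_limit_bytes, 0)


# Hypothetical values: 2 GiB container limit, 128 MiB PHP memory_limit.
print(max_processes(2048 * MiB, 128 * MiB))                            # 16
print(max_processes(2048 * MiB, 128 * MiB, overhead_bytes=256 * MiB))  # 14
```

The result is an upper bound for "max" in the "processes" object; leaving some headroom below it is the safer choice.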
@VBart We set the processes to a "very large" value that exceeded the container limits. Therefore signal 9 in unitd log and error 503 in browser.
So, if I'm reading the above correctly, this issue was caused by container resource limits being hit?
Yes, it looks like it. The question, @ac000, is whether there is something we would like to fix or can fix. The limits can be configured, but if configured incorrectly they would cause issues. So I would like to keep the issue open and have a chat about it next week 👍🏼
Hope I'm not hijacking the thread or anything. I have something similar in a Python / Pyramid application (currently on 1.24-python3.9). It has a long request that generates a data file by extracting data from the database. Initially I had everything going on in memory, so I suspected that and replaced it with writing to a tempfile, to no avail. Grafana says I'm OK memory- and disk-wise. I get the crash (i.e. ending with signal 9) only when running a large data dump. The queries for which there is not much data appear to be OK.
Happens randomly.
Sometimes it is enough to refresh the page. It occurs both at busy times and at idle times. The Unit log has no details or errors (I have not tried debug mode).