nginx / unit

NGINX Unit - universal web app server - a lightweight and versatile open source server that simplifies the application stack by natively executing application code across eight different programming language runtimes.
https://unit.nginx.org
Apache License 2.0

"Error 503." PHP 7.0 | Unit 1.15 and Unit 1.17 #488

Open Evgeniy-Bondarenko opened 4 years ago

Evgeniy-Bondarenko commented 4 years ago


Happens randomly.

Sometimes it is enough to refresh the page. It occurs both under load and when idle. The Unit log has no details or errors (I have not tried debug mode yet).


        "api": {
            "type": "php",
            "root": "/www/",
            "script": "index.php",
            "processes": {
                "max": 100,
                "spare": 20,
                "idle_timeout": 20
            }
        }
VBart commented 4 years ago

Can you reproduce the issue on Unit 1.20.0? 1.15 and 1.17 are obsolete versions.

Evgeniy-Bondarenko commented 4 years ago

On version 1.20 (same config) I still get a stable 503 (as in the screenshot).

Evgeniy-Bondarenko commented 4 years ago

Changing one line in the config,

from "spare": 20,

to

"spare": 1,

and it works! Now I get a stable HTTP 200 response.

The only change is spare from 20 -> 1.
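For reference, a minimal sketch of applying that change at runtime through Unit's control API; the control socket path is an assumption that depends on how Unit was built or packaged:

# hedged example: update spare for the "api" application without a restart
curl -X PUT -d '1' --unix-socket /var/run/control.unit.sock \
    http://localhost/config/applications/api/processes/spare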


Tech: server with 24 cores, 258 GB RAM, and an SSD disk (currently idle). Runs in Docker 18.06 and OpenShift 3 / HAProxy 1.6. Built from source (GitHub tags 1.15/1.17/1.20) on Debian base images.

VBart commented 4 years ago

Could you try with the debug log enabled? If you have installed Unit from our repositories, then you already have a debug build on the system and just need to run it instead of the regular one: https://unit.nginx.org/troubleshooting/#installing-from-our-repos
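A minimal sketch of what that looks like, assuming the packaged debug binary and log path described in the linked guide (both are assumptions; adjust to your installation):

# stop the regular daemon first, then run the debug build in the foreground
unitd-debug --no-daemon --log /var/log/unit/debug.log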

I guess the problem is caused by some limits on your system (probably the number of descriptors, or similar) that prevent running multiple processes. Of course, there should be relevant errors in the log, but maybe we are missing something.
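A quick sketch of checking those limits on a standard Linux host (the unitd process name is an assumption if the binary was renamed):

ulimit -n                            # per-process open-file limit of this shell
cat /proc/sys/fs/file-max            # system-wide file descriptor ceiling
cat /proc/$(pidof -s unitd)/limits   # effective limits of the running daemon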

I've tried on my system with your processes settings and everything works fine, no 503 errors. But I use this simple script:

% cat phpinfo.php 
<?php phpinfo(); ?>

Could you also try with such a simple script? If your app connects to a database, the problem may also be related to database connection limits.

Evgeniy-Bondarenko commented 4 years ago

phpinfo output: https://gist.github.com/Evgeniy-Bondarenko/d79ea8af86b3687586bc3947b8850580

Successful requests run for ~10 s and return HTTP status 200 (spare 1, max 100); failing requests run for ~2-4 s and return HTTP status 503 (spare 20, max 100).

MySQL has a limit of 3000 max connections, but the connections are not exhausted.


About the Unit debug log: I have no way to enable debugging right now, but I will add a debug log later.

Tech: same PHP code, same PHP version, same Unit version, same Unit config (only the spare line differs); it reproduces on different Unit versions (1.15/1.17/1.20).

VBart commented 4 years ago

Unfortunately, this issue seems specific to the application. I can't reproduce it with any example PHP scripts; they work fine with spare 20 and max 100 under a test load producing ~134,000 rps without errors.

Only the debug log will help identify the root cause of the 503 errors in this case.
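For comparison, a sketch of a similar synthetic load test; the listener address, port, and wrk parameters are illustrative assumptions, not the exact setup used above:

# drive concurrent load at the phpinfo.php target from the earlier comment
wrk -t8 -c100 -d30s http://127.0.0.1:8080/phpinfo.php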

Evgeniy-Bondarenko commented 4 years ago

Debug log link

VBart commented 4 years ago

The log at the link doesn't contain debug lines from the application process (pid 118 in this case). Either it's incomplete, or the module was built without debug enabled.

Evgeniy-Bondarenko commented 4 years ago

Added the full log (~5k lines).

VBart commented 4 years ago

So, the root cause of the 503 is that the application process was killed:

2020/11/03 18:08:41.453 [alert] 1#1 process 128 exited on signal 9

The processing of this request started before the captured log begins; as a result, there are no other lines relevant to this process or this request.

It seems the PHP interpreter was started, ran for quite a long time, and was then killed. Could you check the dmesg output as well? This can be caused by the OOM killer due to an out-of-memory condition.
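A one-liner sketch of that check on a typical Linux host (the grep pattern is illustrative):

# look for OOM-killer activity around the time of the "signal 9" log entry
dmesg -T | grep -i -E 'out of memory|oom|killed process'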

misaon commented 3 years ago

@VBart we have the same issue. You're right: it is actually caused by an OOM kill. The memory load of the NGINX Unit processes exceeds the container limit, and the request fails with error 503.

PHP 7.4.13 (cli) (built: Dec 21 2020 12:51:42) ( NTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
    with Zend OPcache v7.4.13, Copyright (c), by Zend Technologies
unit version: 1.21.0
configured as ./configure --prefix=/usr/local/nginx-unit --openssl
{
  "settings": {
    "http": {
      "max_body_size": 20971520
    }
  },

  "listeners": {
    "*:80": {
      "pass": "routes/symfony"
    }
  },

  "routes": {
    "symfony": [
      {
        "match": {
          "uri": [
            "/media/cache/resolve/*"
          ]
        },

        "action": {
          "pass": "applications/symfony/index"
        }
      },

      {
        "match": {
          "uri": [
            "*.php",
            "*.php/*"
          ]
        },

        "action": {
          "pass": "applications/symfony/direct"
        }
      },

      {
        "action": {
          "share": "/app/public/",
          "fallback": {
            "pass": "applications/symfony/index"
          }
        }
      }
    ]
  },

  "applications": {
    "symfony": {
      "type": "php",
      "user": "www-data",
      "group": "www-data",
      "processes": {
        "max": 15,
        "spare": 5,
        "idle_timeout": 180
      },
      "targets": {
        "direct": {
          "root": "/app/public/"
        },

        "index": {
          "root": "/app/public/",
          "script": "index.php"
        }
      }
    }
  }
}
VBart commented 3 years ago

> It is actually caused by an OOM kill. The memory load of the NGINX Unit processes exceeds the container limit, and the request fails with error 503.

What do you expect from the Unit side? In this case you should either increase the memory limit, reduce the memory consumption of the PHP interpreter, or lower the number of PHP processes. Note this setting, which may also be relevant: https://www.php.net/manual/en/ini.core.php#ini.memory-limit
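A hedged sketch of the second option, capping each PHP worker through Unit's per-application php.ini overrides; the socket path and the 128M value are illustrative assumptions:

# lower memory_limit for every worker of the "symfony" application
curl -X PUT -d '{"admin": {"memory_limit": "128M"}}' \
    --unix-socket /var/run/control.unit.sock \
    http://localhost/config/applications/symfony/options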

misaon commented 3 years ago

@VBart We set the processes to a "very large" value that exceeded the container limits, hence signal 9 in the unitd log and error 503 in the browser.

Evgeniy-Bondarenko commented 3 years ago

I have a Kubernetes memory limit on the pod/container: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory

When php.ini memory_limit multiplied by Unit's processes.max exceeds the Kubernetes container limit, you get error 503.

What you need to do:

  1. Reduce the number of processes in the Unit config so you don't hit host/container RAM limits:
     "applications": {
        "processes": {
           "max": 15 <--
  2. Budget all the resources, e.g. container_limit >= unit_processes_max * php_memory_limit (see the sketch after this list).
  3. Check for errors on the load-balancing service (nginx and HAProxy produce their own, very similar-looking errors): body size settings, timeouts, etc.
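A back-of-the-envelope sketch of the budget from point 2, with assumed example numbers (15 workers, a 128 MB memory_limit, ~10% interpreter overhead):

# minimum container memory limit, in MB, before OOM kills become likely
echo $(( 15 * 128 * 110 / 100 ))   # prints 2112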

> @VBart We set the processes to a "very large" value that exceeded the container limits, hence signal 9 in the unitd log and error 503 in the browser.

ac000 commented 2 years ago

So, if I'm reading the above correctly, this issue was caused by container resource limits being hit?

tippexs commented 1 year ago

Yes, it looks like it. The question, @ac000, is whether there is something we would like to fix or can fix. The limits can be configured, but if they are configured incorrectly, that causes issues. So I'd like to keep the issue open and have a chat about it next week 👍🏼

sxpert commented 1 year ago

I hope I'm not hijacking the thread or anything. I have something similar in a Python/Pyramid application (currently on 1.24-python3.9). It has a long request that generates a data file by extracting data from the database. Initially everything happened in memory, so I suspected that and switched to writing to a tempfile, to no avail. Grafana says I'm fine memory- and disk-wise. I get the crash (ending with signal 9) only when running a large data dump; queries that return little data appear to be fine.
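If it helps, a sketch of checking whether the container's own memory limit did the killing; the cgroup v2 path is an assumption (cgroup v1 exposes memory.oom_control instead):

# a non-zero oom_kill counter means the cgroup limit, not the host, killed it
grep oom_kill /sys/fs/cgroup/memory.events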