nginx / unit

NGINX Unit - universal web app server - a lightweight and versatile open source server that simplifies the application stack by natively executing application code across eight different programming language runtimes.
https://unit.nginx.org
Apache License 2.0
5.37k stars 323 forks source link

Unit crashed & Nginx error #633

Closed lacosteque closed 2 years ago

lacosteque commented 2 years ago

Versions:

The test environment is running wordpress with nginx & nginx unit. The load on the host, create only search bots. There is no other traffic.

But I noticed that Unit crashes after some time and Nginx gives 504 Gateway Time-out.

Here is an example of morning log at the moment when Unit crashed.

2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 83 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 89 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 105 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 41 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 100 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 84 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 85 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 99 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 71 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 101 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 80 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 95 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 102 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 82 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 96 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 86 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 72 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 87 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 98 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 78 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 79 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 88 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 91 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 93 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 97 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 90 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 81 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 94 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 103 remove failed
2022/01/20 09:18:23 [warn] 41#41 port #0 for pid 92 remove failed
2022/01/20 09:18:23 [info] 109#41 "unit" application started
2022/01/20 09:18:23 [alert] 40#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 09:18:23 [alert] 40#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 09:18:23 [alert] 40#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 09:18:23 [alert] 40#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 09:18:23 [alert] 40#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 09:18:23 [alert] 40#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 09:18:23 [alert] 40#1 pthread_mutex_lock() failed (22: Invalid argument)
2022/01/20 09:18:23 [alert] 104#36 [unit] too many port socket messages
2022/01/20 09:18:23 [alert] 1#1 process 40 exited on signal 11
2022/01/20 09:18:23 [alert] 1#1 sendmsg(14, -1, -1, 1) failed (32: Broken pipe)
2022/01/20 09:18:23 [alert] 1#1 sendmsg(10, -1, -1, 1) failed (32: Broken pipe)

The only thing that helps is to completely restart the containers via podman-compose up

unit.conf

{
"settings": {
"http": {
"header_read_timeout": 30,
"body_read_timeout": 30,
"send_timeout": 30,
"idle_timeout": 180,
"max_body_size": 16777216
}
},

"listeners": {
"10.88.2.2:8081": {
"pass": "routes/unit",
"client_ip": {
"header": "X-Forwarded-For",
"recursive": true,
"source": [
"10.88.2.2"
]
}
}
},

"routes": {
"unit": [
{
"match": {
"uri": [
"*.php",
"*.php/*",
"/wp-admin/"
]
},

"action": {
"pass": "applications/unit/direct"
}
},
{
"action": {
"share": "/var/www/unit$uri",
"fallback": {
"pass": "applications/unit/index"
}
}
}
]
},

"applications": {
"unit": {
"type": "php",
"user": "unit",
"group": "unit",
"processes": {
"max": 30,
"spare": 1
},

"isolation": {
"namespaces": {
"cgroup": true,
"credential": true,
"mount": true,
"network": false,
"pid": true,
"uname": true
}
},

"options": {
"file": "/usr/local/etc/php/php.ini-production",
"admin": {
"memory_limit": "256M",
"variables_order": "EGPCS",
"expose_php": "0",
"log_errors": "On",
},

"user": {
"display_errors": "1"
}
},

"environment": {
"DB_NAME": "",
"DB_USER": "",
"DB_PASSWORD": "",
"DB_HOST": "",
"URL": ""
},

"targets": {
"direct": {
"root": "/var/www/unit/"
},

"index": {
"root": "/var/www/unit/",
"script": "index.php"
}
}
}
}
}

New unit.log

2022/01/20 19:48:30 [info] 134#62 "domen.com" application started
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 134#62 [unit] too many port socket messages
2022/01/20 19:48:30 [alert] 62#1 sendmsg(12, 0, 8, 2) failed (111: Connection refused)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(12, 0, 8, 2) failed (104: Connection reset by peer)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(12, 0, 8, 2) failed (107: Transport endpoint is not connected)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(12, 0, 8, 2) failed (107: Transport endpoint is not connected)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(17, 0, 8, 2) failed (9: Bad file descriptor)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(12, 0, 8, 2) failed (107: Transport endpoint is not connected)
2022/01/20 19:48:30 [alert] 62#1 sendmsg(12, 0, 8, 2) failed (107: Transport endpoint is not connected)
unit: "domen.com" prototype: ../nptl/pthread_mutex_lock.c:428: __pthread_mutex_lock_full: Assertion `e != ESRCH || !robust' failed.
2022/01/20 19:48:30 [alert] 1#1 process 62 exited on signal 11
2022/01/20 19:48:30 [alert] 1#1 sendmsg(10, -1, -1, 1) failed (107: Transport endpoint is not connected)
2022/01/20 19:48:30 [alert] 1#1 sendmsg(15, -1, -1, 2) failed (88: Socket operation on non-socket)
2022/01/20 19:48:30 [alert] 1#1 sendmsg(15, -1, -1, 2) failed (88: Socket operation on non-socket)
2022/01/20 19:48:30 [alert] 13#13 sendmsg(57, -1, -1, 2) failed (88: Socket operation on non-socket)

nginx.log

10.88.2.3 - - [20/Jan/2022:20:19:18 +0300]  499 "POST /wp-admin/admin-ajax.php HTTP/2.0" 0 "https://domen.com/wp-admin/admin.php?page=file_manager_advanced_ui" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36" "-"
10.88.2.3 - - [20/Jan/2022:20:21:18 +0300]  499 "POST /wp-admin/admin-ajax.php HTTP/2.0" 0 "https://domen.com/wp-admin/admin.php?page=file_manager_advanced_ui" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36" "-"
10.88.2.3 - - [20/Jan/2022:20:25:23 +0300]  499 "POST /wp-admin/admin-ajax.php HTTP/2.0" 0 "https://domen.com/wp-admin/admin.php?page=file_manager_advanced_ui" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36" "-"
10.88.2.3 - - [20/Jan/2022:20:26:24 +0300]  504 "GET /wp-admin/admin.php?page=file_manager_advanced_ui HTTP/2.0" 562 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36" "-"
2022/01/20 20:26:24 [error] 34#34: *240 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 10.88.2.3, server: localhost, request: "GET /wp-admin/admin.php?page=file_manager_advanced_ui HTTP/2.0", upstream: "http://10.88.2.2:8081/wp-admin/admin.php?page=file_manager_advanced_ui", host: "*.*.*.*"
10.88.2.3 - - [20/Jan/2022:20:26:24 +0300]  499 "POST /wp-admin/admin-ajax.php HTTP/2.0" 0 "https://domen.com/wp-admin/admin.php?page=file_manager_advanced_ui" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36" "-"
10.88.2.3 - - [20/Jan/2022:20:26:33 +0300]  504 "GET /ip.php HTTP/2.0" 562 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36" "-"
2022/01/20 20:26:33 [error] 34#34: *240 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 10.88.2.3, server: localhost, request: "GET /ip.php HTTP/2.0", upstream: "http://10.88.2.2:8081/ip.php", host: "*.*.*.*"

Perhaps you need to configure the config more correctly? And of course I do not particularly like such a number of warnings and alert, I would like to see, only, info messages in logs :)

Thank you!

mar0x commented 2 years ago

Hello, Please share complete unit.log.

lacosteque commented 2 years ago

Hi The error did not happen again, if it does I will give you the full log. But at the moment I left docker and just deployed it on the server, maybe the errors were because of docker. At the moment Unit works without errors and does not crash.

lacosteque commented 2 years ago

Hi

My log is flooded with the following messages

2022/02/16 22:53:45 [warn] 350#350 port #0 for pid 68427 remove failed
2022/02/16 22:53:45 [warn] 350#350 port #0 for pid 91709 remove failed
2022/02/16 22:53:45 [warn] 350#350 port #0 for pid 166042 remove failed
2022/02/16 22:53:45 [warn] 350#350 port #0 for pid 82073 remove failed
2022/02/16 22:53:45 [warn] 350#350 port #0 for pid 187702 remove failed
2022/02/16 22:53:45 [warn] 350#350 port #0 for pid 493825 remove failed
2022/02/16 22:53:45 [warn] 350#350 port #0 for pid 84002 remove failed
2022/02/16 22:53:45 [warn] 350#350 port #0 for pid 393899 remove failed

What is the problem? How do I fix it?

The unit itself works without errors.

Installed on a server paired with nginx without containers.

mar0x commented 2 years ago

Warnings are one of the symptoms. When isolated PID number reach prototype process host PID, the prototype crashed.

The workaround - disable PID isolation. The patch - attached gh633_proto_isolation.patch.gz.

lacosteque commented 2 years ago

Removed from the configuration.

"Isolation." {
            { "namespaces": {
                  { "cgroup": true,
                  { "credential": true,
                  { "mount": true,
                  { "network": false }
                  { "uname": true
    }
        },

now the log is full

2022/02/17 20:17:35 [info] 158027#158027 "newbon.ru" application started 294194 2022/02/17 20:17:51 [notice] 156992#156992 app process (isolated 158027) exited with code 0 2022/02/17 20:18:05 [info] 158063#158063 "newbon.ru" application started 2022/02/17 20:18:20 [notice] 156992#156992 app process (isolated 157957) exited with code 0 2022/02/17 20:18:43 [info] 158158#158158 "newbon.ru" application started 2022/02/17 20:18:54 [notice] 156990#156990 app process (isolated 158137) exited with code 0 2022/02/17 20:18:59 [notice] 156992#156992 app process (isolated 158063) exited with code 0 2022/02/17 20:19:11 [info] 158206#158206 "newbon.ru" application started 2022/02/17 20:19:26 [notice] 156992#156992 app process (isolated 158158) exited with code 0

Is it possible to install the patch, only when building the package or can it be installed on the installed application? Thank you!

mar0x commented 2 years ago

It is required to rebuild main Unit binary (unitd) to apply the patch.

lacosteque commented 2 years ago

It is required to rebuild main Unit binary (unitd) to apply the patch.

it's sad )) Can a unit be used with this error? The fact that the log grows, I will survive.