unbit / uwsgi

uWSGI application server container
http://projects.unbit.it/uwsgi

Graceful shutdown closes sockets too early #1795

Open Flid opened 6 years ago

Flid commented 6 years ago

I have an application with a heavy endpoint that takes 10 seconds to respond. If I shut down the service (SIGHUP, or SIGTERM with --hook-master-start "unix_signal:15 gracefully_kill_them_all", or sending 'q' to the master pipe), the graceful shutdown sort of works: it really does wait for the request to finish. The only problem is that it closes the client connection right after receiving the command, so I see curl: (52) Empty reply from server on the client side. After that, the worker finishes processing the request, writes a success message to the logs, and exits. The log looks like this (with some comments):

[uWSGI] getting INI configuration from uwsgi.ini
*** Starting uWSGI 2.0.17 (64bit) on [Tue May 22 15:09:05 2018] ***
compiled with version: 5.4.0 20160609 on 22 May 2018 11:40:04
os: Linux-4.8.0-53-generic #56~16.04.1-Ubuntu SMP Tue May 16 01:18:56 UTC 2017
nodename: anton-pc
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 8
current working directory: /home/user/directory
detected binary path: /home/anton/.virtualenvs/appname/bin/uwsgi
your processes number limit is 126990
your memory page size is 4096 bytes
 *** WARNING: you have enabled harakiri without post buffering. Slow upload could be rejected on post-unbuffered webservers *** 
detected max file descriptor number: 200000
lock engine: pthread robust mutexes
thunder lock: enabled
uWSGI http bound on 0.0.0.0:7799 fd 4
uwsgi socket 0 bound to TCP address 127.0.0.1:42215 (port auto-assigned) fd 3
Python version: 3.6.3 (default, Oct  6 2017, 08:44:35)  [GCC 5.4.0 20160609]
Python main interpreter initialized at 0x9c7d10
python threads support enabled
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 218760 bytes (213 KB) for 2 cores
*** Operational MODE: preforking ***
spawned uWSGI master process (pid: 13591)
spawned uWSGI worker 1 (pid: 13603, cores: 1)
spawned uWSGI worker 2 (pid: 13604, cores: 1)
*** Stats server enabled on /tmp/stats.socket fd: 15 ***
spawned uWSGI http 1 (pid: 13605)
running "unix_signal:15 gracefully_kill_them_all" (master-start)...

<app logs:>
2018-05-22 15:09:08,438 Processing request GET /my-heavy-request. View function: heavy_stuff. 

Tue May 22 15:09:09 2018 - graceful shutdown triggered...
Gracefully killing worker 1 (pid: 13603)...
Gracefully killing worker 2 (pid: 13604)...
gateway "uWSGI http 1" has been buried (pid: 13605)
worker 2 buried after 1 seconds

<The client got error at this point already>

<After several seconds the successful response is logged:>
[pid: 24167|app: 0|req: 1/1] 127.0.0.1 () {28 vars in 324 bytes} [Tue May 22 15:22:39 2018] GET /my-heavy-request => generated 275 bytes in 10278 msecs (HTTP/1.1 200) 2 headers in 72 bytes (1 switches on core 0)
worker 1 buried after 10 seconds
goodbye to uWSGI.

I really can't explain this behaviour. Any ideas?

Config file:

master-fifo = /tmp/uwsgi.fifo
master=true
http = 0.0.0.0:7799
module = app:application
master = true
processes = 2
#gevent = 128
harakiri = 30
single-interpreter = true
enable-threads = true
reaper = true
thunder-lock = true
close-on-exec = true
close-on-exec2 = true
stats = /tmp/stats.socket

And I start it like this: uwsgi --ini uwsgi.ini --hook-master-start "unix_signal:15 gracefully_kill_them_all"

The application inside is Python Flask with gevent. I tried with and without gevent; nothing changes.
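For anyone who wants to reproduce this without the original application, a minimal sketch of a slow WSGI endpoint follows. It assumes the file is saved as app.py so that it matches module = app:application from the config above. The SLOW_SECONDS name and the response body are illustrative and are not taken from the real Flask app.

```python
# app.py: hypothetical stand-in for the real Flask app (not the original code).
import time

SLOW_SECONDS = 10  # assumption: mirrors the ~10 second endpoint described above


def application(environ, start_response):
    """Plain WSGI callable that sleeps before answering, like a heavy endpoint."""
    time.sleep(SLOW_SECONDS)
    body = b"done\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

Any path, including /my-heavy-request, returns the same delayed response, which should be enough to keep a request in flight while the shutdown is triggered.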

vershininm commented 6 years ago

Faced the same issue. Python 2.7.12 + uwsgi 2.0.17 + Flask 0.10.1

vershininm commented 6 years ago

It actually works fine with --socket= (not --http=), which is OK for us.

TBoshoven commented 4 years ago

This is definitely still an issue. It seems the http router gets buried before the workers finish shutting down. When I remove the shutdown of the gateways from gracefully_kill_them_all, everything behaves as expected, so that looks like the cause.

I think a sensible solution would be to delay the gateway shutdown until all workers are shut down during a graceful shutdown / restart.

I reproduced it by making an application (serving requests using --http) that sleeps for a few seconds (shorter than harakiri) and using gracefully_kill_them_all to shut it down while it is handling a request. With the gateway shutdown in place, I get an empty reply. With the gateway shutdown removed, I get my response body.

I would love a way to fully drain the listen queue before shutting down, but it would be cool if we could finish in-flight requests at least.
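The reproduction described above can be scripted roughly as follows. This is only a sketch: the helper names send_fifo_command and request_during_shutdown are hypothetical, and it triggers the shutdown by writing 'q' to the master FIFO (as mentioned in the original report) rather than by sending a signal. It assumes uwsgi is already running with a master-fifo configured.

```python
import threading
import time
import urllib.request


def send_fifo_command(fifo_path, command="q"):
    """Write a one-character command to the uWSGI master FIFO.

    'q' asks the master for a graceful shutdown of the instance.
    """
    with open(fifo_path, "w") as fifo:
        fifo.write(command)


def request_during_shutdown(url, trigger, delay=1.0, timeout=60):
    """Start a GET, call trigger() after `delay` seconds, return (body, error)."""
    result = {"body": None, "error": None}

    def fetch():
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                result["body"] = resp.read()
        except Exception as exc:  # a connection closed too early surfaces here
            result["error"] = exc

    worker = threading.Thread(target=fetch)
    worker.start()
    time.sleep(delay)  # give the request time to reach a worker
    trigger()          # e.g. write 'q' to the master FIFO
    worker.join()
    return result["body"], result["error"]
```

With the config from the original report, a call might look like request_during_shutdown("http://127.0.0.1:7799/my-heavy-request", lambda: send_fifo_command("/tmp/uwsgi.fifo")). If the problem is present, error should be set and body should be None; if in-flight requests are served correctly, body should hold the response.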

aldem commented 4 years ago

Faced this as well recently. This behavior makes graceful shutdown kind of useless for http sockets.

james-tisato-kortical commented 4 years ago

We're also suffering from the same issue, using uWSGI 2.0.18. We'd really like to keep using the --http option but it does seem like graceful shutdown simply doesn't work in that case.

alexander-akhmetov commented 4 years ago

Maybe this will be useful to someone. I faced the same issue: with --http, graceful shutdown does not work. Even though uwsgi waits for the workers, it first stops the additional http process along with its connections. However, with http-socket (where uwsgi does not start the additional http process) and --hook-master-start "unix_signal:15 gracefully_kill_them_all", it works. If you need HTTP keepalive, you can use http11-socket.

https://uwsgi-docs.readthedocs.io/en/latest/ThingsToKnow.html

The http and http-socket options are entirely different beasts. The first one spawns an additional process forwarding requests to a series of workers (think about it as a form of shield, at the same level of apache or nginx), while the second one sets workers to natively speak the http protocol. TL/DR: if you plan to expose uWSGI directly to the public, use --http, if you want to proxy it behind a webserver speaking http with backends, use --http-socket.
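As a rough illustration of the http-socket workaround, the sketch below builds the uwsgi command line in Python. The function name build_uwsgi_argv is hypothetical, and the address, module, and process count are simply copied from the config in the original report; adjust them for a real deployment.

```python
import shlex


def build_uwsgi_argv(address="0.0.0.0:7799", keepalive=False):
    """Return a uwsgi command line that avoids the separate http router process."""
    # http11-socket is the keepalive-capable variant mentioned above.
    socket_flag = "--http11-socket" if keepalive else "--http-socket"
    return [
        "uwsgi",
        socket_flag, address,
        "--module", "app:application",
        "--master",
        "--processes", "2",
        # Map SIGTERM (15) to a graceful shutdown, as in the original report.
        "--hook-master-start", "unix_signal:15 gracefully_kill_them_all",
    ]


def format_command(argv):
    """Render the argument list as a shell-safe string for copy and paste."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

The list can be passed to subprocess.run directly, or printed with format_command(build_uwsgi_argv()) and run by hand. Note the trade-off from the quoted documentation: without the http router, uWSGI is meant to sit behind a web server rather than face the public directly.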

diogobaeder commented 4 years ago

I'm having the same issue, but with --attach-daemon2. Does anybody know how to make graceful shutdown work in that scenario? unix_signal:15 gracefully_kill_them_all just doesn't respect my daemon at some point and brutally kills it...