Open hynek opened 7 years ago
Hi, this is the expected behaviour with threads + fork (and it is one of the main reasons why golang does not support it ;). Basically, the thread is spawned in the master, but the wait happens in the worker, for a thread that is not 'runnable' there. --lazy-apps should fix your problem. If you need copy-on-write, spawn the thread in a post_fork hook.
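For readers who want the post-fork variant, here is a minimal sketch, assuming the `postfork` decorator from uWSGI's `uwsgidecorators` module. The names `stop`, `threads` and `start_background_thread` are made up for illustration, and the fallback decorator exists only so the module can be imported outside uWSGI.

```python
import threading

try:
    # Only importable when the code runs inside uWSGI.
    from uwsgidecorators import postfork
except ImportError:
    def postfork(func):
        # Hypothetical no-op fallback for running outside uWSGI.
        return func

stop = threading.Event()   # set this to ask the thread to finish
threads = []               # keeps a handle so a cleanup hook can join it


@postfork
def start_background_thread():
    # Under uWSGI this runs in each worker after fork(), so the thread
    # exists in the same process that will later wait for it.
    thread = threading.Thread(target=stop.wait, daemon=True)
    thread.start()
    threads.append(thread)
```

The point of the sketch is only the placement: the thread is created in the worker, not in the master before the fork.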
It does, thanks for the super quick answer! I wish I hadn’t spent so much time on the SSCE. 🙈
I feel like I need to add for posterity, that for the atexits actually to run on SIGTERM & SIGINT, you need to set --hook-master-start "unix_signal:15 gracefully_kill_them_all" --hook-master-start "unix_signal:2 gracefully_kill_them_all"
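As a plain-Python analogue (not uWSGI itself), the sketch below shows why a signal has to be turned into an orderly shutdown before atexit handlers get a chance to run. The helper name `atexit_runs_on_sigterm` and the child script are assumptions made for this example.

```python
import signal
import subprocess
import sys

# Child script: registers an atexit handler and optionally converts
# SIGTERM into a normal interpreter exit.
CHILD = r"""
import atexit, signal, sys, time
atexit.register(lambda: print("atexit ran", flush=True))
if sys.argv[1] == "handle":
    signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(0))
print("ready", flush=True)
time.sleep(30)
"""


def atexit_runs_on_sigterm(handle):
    mode = "handle" if handle else "default"
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD, mode],
        stdout=subprocess.PIPE,
        text=True,
    )
    proc.stdout.readline()          # wait until the child printed "ready"
    proc.send_signal(signal.SIGTERM)
    rest = proc.stdout.read()       # everything printed after the signal
    proc.wait()
    return "atexit ran" in rest
```

With the default SIGTERM action the process is terminated without interpreter shutdown, so the handler never runs; with the handler installed, `sys.exit(0)` triggers a normal shutdown and the atexit function is called.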
(master FIFO is not a good option in containers)
Um so I have another question:
Why on earth would uWSGI behave differently when I kill it with Ctrl-C and when I kill it with kill -2 <master pid>?
kill -2 <master pid>:
$ uwsgi --master --http-socket=127.0.0.1:8000 --module "mywsgi:make_app()" --lazy-apps --hook-master-start "unix_signal:2 gracefully_kill_them_all" --enable-threads
*** Starting uWSGI 2.0.15 (64bit) on [Mon Aug 14 16:21:00 2017] ***
compiled with version: 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42) on 21 July 2017 12:40:07
os: Darwin-16.7.0 Darwin Kernel Version 16.7.0: Thu Jun 15 17:36:27 PDT 2017; root:xnu-3789.70.16~2/RELEASE_X86_64
nodename: alpha.local
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 8
current working directory: /Users/hynek/Projects/ssce
detected binary path: /Users/hynek/.virtualenvs/ssce/bin/uwsgi
your processes number limit is 709
your memory page size is 4096 bytes
detected max file descriptor number: 7168
lock engine: OSX spinlocks
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address 127.0.0.1:8000 fd 3
Python version: 3.6.1 (default, May 4 2017, 15:25:00) [GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)]
Python main interpreter initialized at 0x7fa223002c00
python threads support enabled
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 145504 bytes (142 KB) for 1 cores
*** Operational MODE: single process ***
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 12800)
spawned uWSGI worker 1 (pid: 12801, cores: 1)
running "unix_signal:2 gracefully_kill_them_all" (master-start)...
thread started
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x7fa223002c00 pid: 12801 (default app)
Mon Aug 14 16:21:12 2017 - graceful shutdown triggered...
Gracefully killing worker 1 (pid: 12801)...
start clean up
msg sent
thread exiting
end clean up
worker 1 buried after 1 seconds
goodbye to uWSGI.
Ctrl-C:
$ uwsgi --master --http-socket=127.0.0.1:8000 --module "mywsgi:make_app()" --lazy-apps --hook-master-start "unix_signal:2 gracefully_kill_them_all" --enable-threads
*** Starting uWSGI 2.0.15 (64bit) on [Mon Aug 14 16:21:33 2017] ***
compiled with version: 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42) on 21 July 2017 12:40:07
os: Darwin-16.7.0 Darwin Kernel Version 16.7.0: Thu Jun 15 17:36:27 PDT 2017; root:xnu-3789.70.16~2/RELEASE_X86_64
nodename: alpha.local
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 8
current working directory: /Users/hynek/Projects/ssce
detected binary path: /Users/hynek/.virtualenvs/ssce/bin/uwsgi
your processes number limit is 709
your memory page size is 4096 bytes
detected max file descriptor number: 7168
lock engine: OSX spinlocks
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address 127.0.0.1:8000 fd 3
Python version: 3.6.1 (default, May 4 2017, 15:25:00) [GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)]
Python main interpreter initialized at 0x7f992e002c00
python threads support enabled
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 145504 bytes (142 KB) for 1 cores
*** Operational MODE: single process ***
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 12855)
spawned uWSGI worker 1 (pid: 12856, cores: 1)
running "unix_signal:2 gracefully_kill_them_all" (master-start)...
thread started
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x7f992e002c00 pid: 12856 (default app)
^CMon Aug 14 16:21:35 2017 - graceful shutdown triggered...
Gracefully killing worker 1 (pid: 12856)...
worker 1 buried after 1 seconds
goodbye to uWSGI.
How does it even know that the signal is coming from my keyboard?
And then if I run it with --threads 2 it gets even more bizarre: kill -2 will trigger the graceful shutdown, however it hangs…until I press Ctrl-C:
$ uwsgi --master --http-socket=127.0.0.1:8000 --module "mywsgi:make_app()" --lazy-apps --hook-master-start "unix_signal:2 gracefully_kill_them_all" --threads 2
*** Starting uWSGI 2.0.15 (64bit) on [Mon Aug 14 16:19:40 2017] ***
compiled with version: 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42) on 21 July 2017 12:40:07
os: Darwin-16.7.0 Darwin Kernel Version 16.7.0: Thu Jun 15 17:36:27 PDT 2017; root:xnu-3789.70.16~2/RELEASE_X86_64
nodename: alpha.local
machine: x86_64
clock source: unix
pcre jit disabled
detected number of CPU cores: 8
current working directory: /Users/hynek/Projects/ssce
detected binary path: /Users/hynek/.virtualenvs/ssce/bin/uwsgi
your processes number limit is 709
your memory page size is 4096 bytes
detected max file descriptor number: 7168
lock engine: OSX spinlocks
thunder lock: disabled (you can enable it with --thunder-lock)
uwsgi socket 0 bound to TCP address 127.0.0.1:8000 fd 3
Python version: 3.6.1 (default, May 4 2017, 15:25:00) [GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)]
Python main interpreter initialized at 0x7f8ce9827400
python threads support enabled
your server socket listen backlog is limited to 100 connections
your mercy for graceful operations on workers is 60 seconds
mapped 166080 bytes (162 KB) for 2 cores
*** Operational MODE: threaded ***
*** uWSGI is running in multiple interpreter mode ***
spawned uWSGI master process (pid: 12739)
spawned uWSGI worker 1 (pid: 12740, cores: 2)
running "unix_signal:2 gracefully_kill_them_all" (master-start)...
thread started
WSGI app 0 (mountpoint='') ready in 0 seconds on interpreter 0x7f8ce9827400 pid: 12740 (default app)
Mon Aug 14 16:19:52 2017 - graceful shutdown triggered...
Gracefully killing worker 1 (pid: 12740)...
^Cstart clean up
msg sent
thread exiting
end clean up
worker 1 buried after 29 seconds
goodbye to uWSGI.
Could you shed some light here what is going on? Is it even possible to run WSGI application with a side-car thread?
Can you try on a Linux system? I know it may look foolish, but it is not the first time signals + pthreads result in weird behaviours on OSX :(
In the meantime I can confirm that pthread cancellation does not work as expected on OSX. So you should add the option --no-threads-wait if you plan to spawn multiple threads on Darwin. Basically, with this option the worker is killed without waiting for the threads' cooperation.
OK Linux time! It’s an Ubuntu Trusty VM running in VMWare.
--enable-threads + kill -2: Works!
--enable-threads + Ctrl-C: Works!
--threads 2 + kill -2: Seems to work but I think I had it fail occasionally. Not 100% sure tho.
--threads 2 + Ctrl-C: So it apparently starts the cleanup but the join fails because it needs a second Ctrl-C.
To me, this looks awfully like some kind of race condition (who’d have thought with threads :D)? Especially because very rarely it works too.
I’ve also tried sprinkling in a bunch of print(threading.enumerate()) calls that indicate that my thread isn't running anymore and instead there's [<_MainThread(b'uWSGIWorker1Core0', stopped 140602576090944)>, <_DummyThread(b'uWSGIWorker1Core1', started daemon 140602515154688)>]. _DummyThread!?
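On the _DummyThread question: Python's threading module hands out a _DummyThread object for any thread that was not started through threading.Thread. A plausible reading (an assumption, not confirmed in this thread) is that uWSGI creates its core threads from C, so they show up that way. The sketch below reproduces the effect with the low-level `_thread` module; the function name is made up.

```python
import _thread
import threading


def thread_class_seen_from_foreign_thread():
    done = threading.Event()
    seen = []

    def body():
        # This thread was not created via threading.Thread, so the
        # threading module has to invent a placeholder object for it.
        seen.append(type(threading.current_thread()).__name__)
        done.set()

    _thread.start_new_thread(body, ())
    done.wait(5)
    return seen[0]
```

Calling the function returns the class name the threading module assigned to the foreign thread.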
I don’t know if that helps you at all…here's the SSCE again if you wanna run it yourself.
Setting daemon to False does also some really funky stuff but let’s focus on one thing. ;)
Does using --no-threads-wait improve the situation? (both in Linux and OSX)
Shutdown works but atexit is not executed.
Shutdown starts (msg sent) but hangs there; the thread doesn't seem to exist in cleanup anymore, given the prints.
Oh, this is interesting, so on the mac ctrl-c has some magic behaviour that needs to be tackled :) but on Linux there is something more strange happening as it looks like the worker is blocked in the t.join() part. Does it happen when hitting ctrl-c, when sending -2 to the master or in both cases ?
So all Linux now:
Ctrl-C → the thread isn’t there and it hangs.
kill -2 → the thread is there and it works.
I gotta run now, so I suggest you play with my SSCE I’ve posted before, it has no dependencies and works on both Python 2 and Python 3. :)
JFTR, I remembered why Ctrl-C and kill -2 work differently: doing a Ctrl-C fires a SIGINT to all foreground processes. That seems to confuse things.
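The difference can be reproduced on a POSIX system with a small sketch. Here `os.killpg` stands in for what the terminal does on Ctrl-C (signal the whole foreground process group), while `os.kill` targets a single pid like kill -2 <master pid>. The function name and the "M"/"W" role labels are made up; SIGINT is kept blocked in the children so it can be inspected with `signal.sigpending()` instead of terminating them.

```python
import os
import signal


def sigint_receivers(group_wide):
    """Return the roles ("M" master, "W" worker) that got SIGINT."""
    result_r, result_w = os.pipe()   # children report their role here
    ready_r, ready_w = os.pipe()     # children announce they are set up
    quit_r, quit_w = os.pipe()       # parent tells children to finish

    master = os.fork()
    if master == 0:
        try:
            os.setpgid(0, 0)         # own process group, like a shell job
            # Keep SIGINT pending instead of letting it kill the process.
            signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})
            role = b"W" if os.fork() == 0 else b"M"
            os.write(ready_w, b"r")
            os.read(quit_r, 1)       # wait until the signal has been sent
            if signal.SIGINT in signal.sigpending():
                os.write(result_w, role)
        finally:
            os._exit(0)

    os.close(result_w)
    for _ in range(2):               # wait for master and worker
        os.read(ready_r, 1)
    if group_wide:
        os.killpg(master, signal.SIGINT)   # like Ctrl-C in a terminal
    else:
        os.kill(master, signal.SIGINT)     # like kill -2 <master pid>
    os.write(quit_w, b"qq")          # one byte for each child
    os.waitpid(master, 0)
    received = b""
    while True:
        chunk = os.read(result_r, 16)
        if not chunk:                # EOF once both children have exited
            break
        received += chunk
    for fd in (result_r, ready_r, ready_w, quit_r, quit_w):
        os.close(fd)
    return sorted(received.decode())
```

In the single-pid case only the master sees the signal and can then shut its worker down gracefully; in the group-wide case the worker is hit by SIGINT directly as well, which matches the "confuses things" observation above.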
I have recently been working on an nginx + uwsgi + flask + golang web service. Before I added a golang shared library for Python, everything worked fine. But when I call the golang shared library from my Python script, it gets stuck without any error; there is simply no response from golang. (It works fine if I call the golang shared library from the plain Python interpreter.) After I added --lazy-apps and --no-threads-wait, the hang was fixed! I think it may be a problem with goroutines, but I am not sure of the root cause. Can anyone explain this, or give me some keywords I can search for?
Something odd happens if you try to join a thread in an atexit handler.
Consider the following code:
All it does is starting a thread in the background that waits for an event to be set.
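The original snippet did not survive in this copy of the thread. What follows is a hypothetical reconstruction, not the author's SSCE: it is pieced together from the description above, the `mywsgi:make_app()` module path, and the messages visible in the logs ("thread started", "start clean up", "msg sent", "thread exiting", "end clean up"). The attributes attached to `app` at the end are an addition for experimenting.

```python
import atexit
import threading


def make_app():
    stop = threading.Event()

    def worker():
        print("thread started")
        stop.wait()                 # block until cleanup asks us to stop
        print("thread exiting")

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()

    def clean_up():
        print("start clean up")
        stop.set()
        print("msg sent")
        thread.join()               # the step that hangs under a master
        print("end clean up")

    atexit.register(clean_up)

    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"hello\n"]

    # Hypothetical handles, not part of any WSGI contract.
    app.background_thread = thread
    app.clean_up = clean_up
    return app
```

Run in a plain Python process, calling `clean_up()` ends the thread cleanly; the behaviour discussed in this thread only appears once uWSGI forks workers from a master.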
If you run it with
uwsgi --enable-threads --http-socket=127.0.0.1:8000 --module "mywsgi:make_app()"
everything works as expected:
However, if you run it with a master process, it hangs while waiting for the thread, which never receives the event:
I guess threading is disabled too soon or something? This is relevant to me because I need to clean up my background thread in prometheus_async.