Open saley89 opened 8 years ago
Hm... the way Python implements signals makes it impossible to use them in this situation (C event-loop based server), because they are put into in an internal signal queue and the handlers are only called the next time the Python interpreter is executed, which is when a new request is being made to the server or an existing request is being processed.
I'm not sure how other event-loop based servers solve this problem, maybe by periodically invoking the Python interpreter.
Maybe you can do aside with signals altogether?
This seems to be possible to work around with periodically calling into https://docs.python.org/2/c-api/exceptions.html#c.PyErr_CheckSignals
Thanks for the quick reply and confirming our suspicions that it may be related to the C side of the library. We will look into an alternative solution.
It would be great if you could build in the CheckSignals code you mentioned above if that would allow us to handle signals correctly. I can leave this open if you would like to look into it, otherwise I am happy to close the issue.
Shouldn't SIGTERM be handled the same way SIGINT is ? For graceful shutdown ?
That's correct, if it doesn't work like that then that's a separate issue.
@jonashaag SIGTERM is not handled like SIGINT as https://github.com/jonashaag/bjoern/blob/master/bjoern/server.c#L76 only handles SIGINT.
A follow-up question:
https://github.com/jonashaag/bjoern/blob/master/bjoern/server.c#L91 places a SIGINT on the signal queue, would
PyErr_SetInterrupt(); PyErr_CheckSignals();
ensure that the signal is propagated as soon as it occurs ?
I don't know, maybe the other way round (first check signals, then set interrupt), but in either case we'd have to make sure we hold the GIL
Are there any work arounds? I have an app that should not have any downtime.
Maybe you can run this as a subprocess and kill it whenever required, I guess
To gracefully shut down services, systemd
sends a SIGTERM
followed 90 secs later by a SIGKILL
if necessary (see here). Due to this issue there is no easy way to capture SIGTERM
when using bjoern
. I find that meinheld
just returns from the main event loop on any interrupting signal so I don't need a signal handler at all. Perhaps that approach could be considered?
So I have will have to move back to meinheld
at least temporarily although my benchmarks show that bjoern
performs slightly faster.
Can you guys try the https://github.com/jonashaag/bjoern/tree/signal-watcher branch
Sorry, but I really don't want to create a signal handler (it is asynchronous and there are plenty of spin-off issues, e.g. when using multiprocess
the children inherit the handler) and was trying to use one simply to catch systemd
terminating my service. I suspect this is probably also the case for the OP @saley89 here.
Now that I have seen that meinheld
simply returns from the run()
event loop for the SIGTERM
case it makes a lot of sense to me so I will stick with that. As I said, perhaps you could consider that approach also for bjoern
for better systemd
compatibility.
@bulletmark The patch in that branch should make this work. It simply checks every 100ms if there's a new signal in the Python signal queue and executes that Python signal handler.
The https://github.com/jonashaag/bjoern/tree/signal-watcher branch works perfectly for me with SIGTERM.
My use case is a on_sigterm
python handler which do some "cleaning things" then send a SIGINT
to the same process to force bjoern
to exit.
I still think that simply returning from the event loop as I describe in my 2 posts above (and how meinheld does it) is a better way to go, and most compatible with systemd termination. @jonashaag, please reconsider that approach.
I just released 3.0.0 with the approach taken in the signal-watcher branch.
@bulletmark could you elaborate on what's better in the meinheld implementation for your use case?
@bulletmark @jonashaag a complete solution could be to add to the bjoern package a kind of bjoern_wrapper
which could be used as a "bjoern cli" (launched by systemd, supervisor, circus or other process managers). Included features would be:
importlib
)I use something like that internally. With a little bit of cleaning, I could propose this as a PR if your are interested.
I use something like that internally. With a little bit of cleaning, I could propose this as a PR if your are interested.
Definitely! Could also paste it here uncleaned so we can discuss without having you spend time on it
internal script hardly cleaned (but not completely), not even tested, just here to discuss
i think we need to add :
importlib
) for some events (on_sigterm, before_start...) like in gunicorn
#!/usr/bin/env python3
import argparse
import sys
import bjoern
import importlib
import signal
import datetime
import time
import os
import threading
from mflog import get_logger
from mfutil import kill_process_and_children
LOGGER = get_logger("bjoern_wrapper")
def send_sigint(pid):
LOGGER.info("sending SIGINT to %i", pid)
os.kill(pid, signal.SIGINT)
def on_signal(signum, frame):
if signum == 15:
LOGGER.info("received SIGTERM => smart closing...")
# [...]
# Real stop
send_sigint(os.getpid())
def get_wsgi_application(path):
if len(path.split(':')) != 2:
LOGGER.warning("main_arg must follow module.submodule:func_name")
sys.exit(1)
module_path, func_name = path.split(':')
mod = importlib.import_module(module_path)
try:
return getattr(mod, func_name)
except Exception:
LOGGER.warning("can't find: %s func_name in module: %s" % (
func_name, mod))
sys.exit(1)
class TimeoutWsgiMiddleware(object):
def __init__(self, app, timeout):
self.app = app
self.timeout = timeout
self.started = None
if self.timeout > 0:
x = threading.Thread(target=self.timeout_handler, daemon=True)
x.start()
def timeout_handler(self):
now = datetime.datetime.now
while True:
if self.started:
if (now() - self.started).total_seconds() > self.timeout:
LOGGER.warning("Request Timeout => exit")
# Self-Kill
kill_process_and_children(os.getpid())
time.sleep(1)
def __call__(self, environ, start_response):
if self.timeout > 0:
self.started = datetime.datetime.now()
try:
# see http://blog.dscpl.com.au/2012/10/
# obligations-for-calling-close-on.html
iterable = self.app(environ, start_response)
for data in iterable:
yield data
finally:
self.started = None
if hasattr(iterable, 'close'):
iterable.close()
def main():
parser = argparse.ArgumentParser(description="bjoern wrapper")
parser.add_argument("main_arg", help="wsgi application path")
parser.add_argument("unix_socket", help="unix socket to listen path")
parser.add_argument("--timeout", default=60, type=int,
help="one request execution timeout (in seconds)")
args = parser.parse_args()
signal.signal(signal.SIGTERM, on_signal)
wsgi_app = get_wsgi_application(args.main_arg)
try:
os.unlink(args.unix_socket)
except Exception:
pass
try:
app = TimeoutWsgiMiddleware(wsgi_app, args.timeout)
bjoern.run(app, 'unix:%s' % args.unix_socket)
except KeyboardInterrupt:
LOGGER.info("process shutdown")
if __name__ == '__main__':
main()
@jonashaag, what I am talking about is so simple it is embarrassing that I apparently have not explained it clearly enough in my first 2 posts here! :)
Normally code blocks essentially forever in run()
and all I am suggesting is that run()
should simply return if a SIGTERM
is received. SIGTERM
is the standard way systemd shuts down a program. No need for the user to write any signal handlers (with the associated complications that often entails as I mention above).
Meinheld does it this way and it works well.
@bulletmark I understand what you are saying ;-) But IMHO, today, bjoern
is a kind of library and not a "full featured daemon". So, you can also want to get "full control" of the stop process (particularly if you provide a python socket(-like) object by yourself).
So we have two different use cases:
bjoern
as a regular daemon with SIGTERM standard features)I propose to add to existing bjoern
a wrapper to have both use cases.
With the wrapper, systemd will launch your app with:
bjoern_cli --port=8080 --timeout=60 yourmod.yoursubmod:app
and you will get a regular SIGTERM
feature (without anything to implement)
What do you think about this compromise ?
I have a python program which just calls bjoern's run()
. Actually it does that via bottles run(..., server='bjoern')
which is even more trivial as I can just change server='meinheld'
etc. After the run()
command I call some cleanup stuff which I need to do before termination. How would I do that with your proposal?
My plan:
bjoern_cli --port=8080 --timeout=60 --hook-after-stop=yourmod.after_stop yourmod:app
with "user provided" hook:
def after_stop(wsgi_app, **kwargs)
# [...] clean what you want
return True
I plan to implement other hooks:
before_start(wsgi_app, **kwargs)
(if return False
the cli exit)on_signal(wsgi_app, signum, frame, **kwargs)
(if return False
is returned, a SIGINT
is sent to bjoern
to exit the main loop), with of course a default/regular SIGTERM handlingThat would work of course but nowhere near as simple or intuitive as just returning from run()
when terminating.
Stupid question, if I send sigterm/sigint to a Python program, it will raise KeyboardInterrupt, right? So shouldn't that be what happens when you try to interrupt bjoern as well?
With this code:
import time
try:
time.sleep(60)
except KeyboardInterrupt:
print("catched KeyboardInterrupt")
except:
print("catched something else")
SIGINT
, you get: catched KeyboardInterrupt
(with return code 0
)SIGTERM
, you get an uncatched Terminated
(with return code 143
)From #145
So, here's how Gunicorn works:
I agree that we should not throw away the backlog when doing a graceful shutdown.
Also I think it's fine to just stop bjoern without any kind of graceful shutdown for SIGINT (but not for SIGTERM).
Bjoern currently works as follows:
So, going forward, my suggestion is we change Bjoern behaviour as follows:
We can make the SIGTERM behaviour configurable, and allow for custom shutdown behaviours, by exporting some shutdown primitives from the C code.
I'm still not sure how to meet both the expectations of people thinking of bjoern as a "library" that doesn't fiddle with signal handling and people thinking of bjoern as a server that takes care of graceful loop shutdown.
Maybe we actually have to implement something along the lines of what @thefab suggested.
I'm not sure it's a typo or not but you wrote:
"SIGTERM waits for active connections to finish, and the backlog to be empty (while still accepting new connections), with a timeout, and exits the program."
IMHO, after a SIGTERM, we don't accept new connections !
For my "library use case", it's perfectly ok to have another signal for custom shutdown. All I need is the python signal handler called from C like in 3.0.
is it a correct way to gracefully exit?
import bjoern
import sys
import signal
import os
def sigterm_handler(_signo, _stack_frame):
sys.stdout.write(' [*] SIGTERM: Stop \n')
os.kill(os.getpid(), signal.SIGINT)
if __name__ == '__main__':
signal.signal(signal.SIGTERM, sigterm_handler)
sys.stdout.write(' [*] Starting bjoern on port 5000...! \n')
try:
bjoern.run(app, '0.0.0.0', 5000)
except KeyboardInterrupt:
sys.stdout.write(' [*] SIGINT: Stop \n')
finally:
sys.stdout.write(' [*] Exiting \n')
I'm not sure it's a typo or not but you wrote:
"SIGTERM waits for active connections to finish, and the backlog to be empty (while still accepting new connections), with a timeout, and exits the program."
IMHO, after a SIGTERM, we don't accept new connections !
Yup. I would agree. Without the use of an external coordinator (to route new connections elsewhere), the server might potentially never have a chance to gracefully shut down, cos it'll keep accepting new connections.
For my "library use case", it's perfectly ok to have another signal for custom shutdown. All I need is the python signal handler called from C like in 3.0.
This "library vs server" thing is interesting. Would the following work for a simple compromise? This supposes that you have the ability to modify (and why shouldn't you?) whatever mechanism you're using to send the SIGTERM to bjoern:
This setup would work for keeping the server up without any interruption in serving connections, and also allow for graceful connection draining. Does this make sense?
Here are my two cents...
What if in most cases we use a systemd
manager for our services it would be graceful to add a systemd-socket into bjoern out-of-the-box. In this case, all socket issues with opening or closing are gone.
Offtopic: I was confused a little bit. If you run bjoern with unix socket it works as wsgi server, otherwise, it works as http. Is it possible to manage via params? I would like to have the ability to run wsgi server which binds to tcp socket.
It always works as a WSGI server (which is a special kind of HTTP server)
It always works as a WSGI server (which is a special kind of HTTP server)
Hmm....
This is my wsgi.py
import os
from app.core import create_app
app = create_app()
if __name__ == '__main__':
#app.run(debug=True)
import bjoern
port = os.getenv('BIND_PORT', '8000')
if not port or not port.isdigit():
raise RuntimeError('Socket for bind not found.')
bjoern.run(app, host='localhost', port=int(port))
this is a part of nginx config
location ~ ^/(?<app>\w+)/(?<app_uri>.*) {
include uwsgi_params;
uwsgi_param SCRIPT_NAME /$app;
uwsgi_param PATH_INFO /$app_uri;
uwsgi_pass localhost:8000;
#access_log /var/log/nginx/app/$app.log;
error_log /var/log/nginx/app/errors.log warn;
}
And I'm getting a 400 error code. Any ideas why?
// It works if use proxy_pass
instead of uwsgi_pass
but SCRIPT_NAME
in this case is missed.
As above, I left bjoern and went to meinheld given this issue but am now back to bjoern for other reasons. My problem here was that bjoern does not return from run()
when systemd sends the process a SIGTERM which is the default manner by which systemd terminates an application. However, I have now found this is a trivial issue because all I had to do was define KillSignal=SIGINT
in the systemd service file and that works as I want because bjoern run()
does return on SIGINT. Just noting this here for the benefit of others using systemd to manage their application, and want a graceful way to catch termination.
Hi folks, inspire by @bulletmark, I do something like this. (GracefulKiller from StackOverflow) A few things need to be handled during the graceful shutdown.
#!/usr/bin/env python
import os
from flask import Flask
import bjoern
import signal
import time
import sys
app = Flask(__name__)
@app.route("/")
def hello_world():
time.sleep(30)
return "<p>Hello, World!</p>"
class GracefulKiller:
kill_now = False
def __init__(self):
# not handle sigint, since the bjoern will raise it.
# signal.signal(signal.SIGINT, self.exit_gracefully)
signal.signal(signal.SIGTERM, self.exit_gracefully)
def exit_gracefully(self, *args):
sock, wsgi_app = bjoern._default_instance
# Once the sock close, the request still in buffer will be close, and get the error 'Connection reset by peer'
# The request already in process will be handle completed.
sock.close()
print('you can close the background jobs here, or the in main KeyboardInterrupt exception')
os.kill(os.getpid(), signal.SIGINT)
if __name__ == '__main__':
GracefulKiller()
try:
bjoern.run(app, '0.0.0.0', 8080)
except KeyboardInterrupt:
print('If you have some background jobs with multithreading, you can handle it here')
print('start process graceful shutdown')
time.sleep(3)
print('Ready to exit')
sys.exit(0)
Greetings. In my env (python 3.8, bjoern 3.1.0) I test script by @ally-ben. Bjoern stops after sending SIGTERM only after starting so far it has not accepted any request. But if I send to app on bjoern one or more any requests, it's not responding to SIGTERM.
Hi,
I have a simple Python script wrapping a Bjoern web server around a WSGI app. In essence, it looks like this:
The server works just fine. The problem is that when sending a SIGTERM signal (e.g. using
kill -15
), the signal handler doesn't get called and the process doesn't die. When sending SIGKILL, the process does die, without calling the signal handler (this is expected).How can I intercept SIGTERM?