bottlepy / bottle

bottle.py is a fast and simple micro-framework for python web-applications.
http://bottlepy.org/
MIT License
8.44k stars 1.46k forks source link

Reloader option. Some ideas #1094

Open python2and3developer opened 6 years ago

python2and3developer commented 6 years ago

I had some issues using reloader option. I am still trying to discover what happens. I have been reading the bottle.py code and other codes for reloading like these ones:

https://github.com/pallets/werkzeug/blob/master/werkzeug/_reloader.py https://github.com/django/django/blob/master/django/utils/autoreload.py https://github.com/lepture/python-livereload/blob/master/livereload/watcher.py https://github.com/gorakhargosh/watchdog

One potential drawback that I see in the current implementation is that is watching all the python modules, including the modules in the standard library.

One possible idea is to filter the modules in the standard library:

for name, module in sys.modules.items():
    if not module or name in sys.builtin_module_names or not hasattr(module, '__file__'):
        # an import sentinel, built-in module or not a real module, really
        continue

    path = module.__file__
    if path[-4:] in ('.pyo', '.pyc'): path = path[:-1]
    if path and exists(path): files[path] = mtime(path)

Source: https://stackoverflow.com/questions/22195382/how-to-check-if-a-module-library-package-is-part-of-the-python-standard-library

Another possible idea is to watch changes in a directory, like here: https://github.com/lepture/python-livereload/blob/master/livereload/watcher.py

One possible improvement that is using django is to create a set of loaded modules, instead of a list, like this:

for module in set(sys.modules.values()):

Bottle is doing this:

for module in list(sys.modules.values()):

Another possible improvement is to use watchdog or pyinotify if they are already installed. Otherwise to use the current implementation as a fallback.

Django is providing a customized solution just in case pyinotify is not installed. Otherwise it uses pyinotify.

Werkzeug is using a similar implementation to the current solution of bottle.py and it provides also a solution using watchdog: https://github.com/pallets/werkzeug/blob/master/werkzeug/_reloader.py

Werkzeug also does this extra check:

    for module in list(sys.modules.values()):
        if module is None:
            continue
        filename = getattr(module, '__file__', None)
        if filename:
            if os.path.isdir(filename) and \
               os.path.exists(os.path.join(filename, "__init__.py")):
                filename = os.path.join(filename, "__init__.py")
defnull commented 6 years ago

Is there a tangible problem that needs to be solved? I do not see any noticeable delay from watching all modules, instead of just the user provided. It's unnecessary work, but as long as it runs fast enough and does not impact the application performance, I'd prefer a simple and short implementation over a more sophisticated one.

python2and3developer commented 6 years ago

I will try to discover what is the problem and I will report with more details. The possibility to use watchdog or pyinotify if they are installed and to use the current implementation as a fallback like django or Werkzeug is doing is also interesting because then the code is relying in third party well tested modules specialized to this purpose.

python2and3developer commented 6 years ago

Hi! Maybe this could be the problem. The method interrupt_main() only raise KeyboardInterrupt in the main thread,as the docs say:

thread.interrupt_main()
Raise a KeyboardInterrupt exception in the main thread. A subthread can use this function to interrupt the main thread.

https://docs.python.org/2/library/thread.html

What would it happen if the code of some server catch the exception unintentionally? The thread checking changes in files raise and exception in the main thread and exits. The main thread catch the exception, maybe logging that some exception happened but it continues its executions (and the process doesn't exit). Now the web server continue serving pages but there is no thread checking changes in files because that thread exited before, and no new process was created.

For some reason, maybe this is happening to me:

Exception happened during processing of request from ('127.0.0.1', 39254)
Traceback (most recent call last):
  File "/usr/lib/python2.7/SocketServer.py", line 290, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib/python2.7/SocketServer.py", line 318, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib/python2.7/SocketServer.py", line 331, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python2.7/SocketServer.py", line 652, in __init__
    self.handle()
  File "/usr/lib/python2.7/wsgiref/simple_server.py", line 116, in handle
    self.raw_requestline = self.rfile.readline(65537)
  File "/usr/lib/python2.7/socket.py", line 480, in readline
    data = self._sock.recv(self._rbufsize)
KeyboardInterrupt
----------------------------------------

I analyzed the reloader code of werzkzeug: https://github.com/pallets/werkzeug/blob/master/werkzeug/_reloader.py

In that code the server is running in a thread and the code checking for changes in python modules and exiting the application for a new reload is in the main thread. It's the opposite of the current implementation in bottle. I made this changes in bottle and it works for me. I have no problem:

def run(app=None,
        server='wsgiref',
        host='127.0.0.1',
        port=8080,
        interval=1,
        reloader=False,
        quiet=False,
        plugins=None,
        debug=None,
        config=None, **kargs):
    """ Start a server instance. This method blocks until the server terminates.

        :param app: WSGI application or target string supported by
               :func:`load_app`. (default: :func:`default_app`)
        :param server: Server adapter to use. See :data:`server_names` keys
               for valid names or pass a :class:`ServerAdapter` subclass.
               (default: `wsgiref`)
        :param host: Server address to bind to. Pass ``0.0.0.0`` to listens on
               all interfaces including the external one. (default: 127.0.0.1)
        :param port: Server port to bind to. Values below 1024 require root
               privileges. (default: 8080)
        :param reloader: Start auto-reloading server? (default: False)
        :param interval: Auto-reloader interval in seconds (default: 1)
        :param quiet: Suppress output to stdout and stderr? (default: False)
        :param options: Options passed to the server adapter.
     """
    if NORUN: return
    if reloader and not os.environ.get('BOTTLE_CHILD'):
        import subprocess
        lockfile = None
        try:
            fd, lockfile = tempfile.mkstemp(prefix='bottle.', suffix='.lock')
            os.close(fd)  # We only need this file to exist. We never write to it
            while os.path.exists(lockfile):
                args = [sys.executable] + sys.argv
                environ = os.environ.copy()
                environ['BOTTLE_CHILD'] = 'true'
                environ['BOTTLE_LOCKFILE'] = lockfile
                p = subprocess.Popen(args, env=environ)
                while p.poll() is None:  # Busy wait...
                    os.utime(lockfile, None)  # I am alive!
                    time.sleep(interval)
                if p.poll() != 3:
                    if os.path.exists(lockfile): os.unlink(lockfile)
                    sys.exit(p.poll())
        except KeyboardInterrupt:
            pass
        finally:
            if os.path.exists(lockfile):
                os.unlink(lockfile)
        return

    try:
        if debug is not None: _debug(debug)
        app = app or default_app()
        if isinstance(app, basestring):
            app = load_app(app)
        if not callable(app):
            raise ValueError("Application is not callable: %r" % app)

        for plugin in plugins or []:
            if isinstance(plugin, basestring):
                plugin = load(plugin)
            app.install(plugin)

        if config:
            app.config.update(config)

        if server in server_names:
            server = server_names.get(server)
        if isinstance(server, basestring):
            server = load(server)
        if isinstance(server, type):
            server = server(host=host, port=port, **kargs)
        if not isinstance(server, ServerAdapter):
            raise ValueError("Unknown or unsupported server: %r" % server)

        server.quiet = server.quiet or quiet
        if not server.quiet:
            _stderr("Bottle v%s server starting up (using %s)...\n" %
                    (__version__, repr(server)))
            _stderr("Listening on http://%s:%d/\n" %
                    (server.host, server.port))
            _stderr("Hit Ctrl-C to quit.\n\n")

        if reloader:
            import signal

            signal.signal(signal.SIGTERM, lambda *args: sys.exit(0))

            t = threading.Thread(target=server.run, args=(app,))
            t.setDaemon(True)
            t.start()

            lockfile = os.environ.get('BOTTLE_LOCKFILE')

            exists = os.path.exists
            mtime = lambda p: os.stat(p).st_mtime
            files = dict()

            for module in list(sys.modules.values()):
                path = getattr(module, '__file__', '')
                if path[-4:] in ('.pyo', '.pyc'): path = path[:-1]
                if path and exists(path): files[path] = mtime(path)

            while True:
                if not exists(lockfile)\
                or mtime(lockfile) < time.time() - interval - 5:
                    sys.exit(3)

                for path, lmtime in list(files.items()):
                    if not exists(path) or mtime(path) > lmtime:
                        sys.exit(3)

                time.sleep(interval)

        else:
            server.run(app)
    except KeyboardInterrupt:
        pass
    except (SystemExit, MemoryError):
        raise
    except:
        if not reloader: raise
        if not getattr(server, 'quiet', quiet):
            print_exc()
        time.sleep(interval)
        sys.exit(3)

I deleted FileCheckerThread because now the code checking changes in python modules is in the main thread and the server is running in a thread:

            t = threading.Thread(target=server.run, args=(app,))
            t.setDaemon(True)
            t.start()
python2and3developer commented 6 years ago

I tried to research the origin of that prints in the terminal.

I did this in /usr/lib/python2.7:

$ grep -rn . -e "Exception happened during processing of request from"

The print of that exception is done here:

./SocketServer.py:348:        print 'Exception happened during processing of request from',

This is the code in SocketServer.py around line 348:

    def handle_error(self, request, client_address):
        """Handle an error gracefully.  May be overridden.

        The default is to print a traceback and continue.

        """
        print '-'*40
        print 'Exception happened during processing of request from',
        print client_address
        import traceback
        traceback.print_exc() # XXX But this goes to stderr!
        print '-'*40

This is all the inheritance chain:

WSGIServer --> HTTPServer --> TCPServer --> BaseServer

WSGIServer inherits from HTTPServer

class WSGIServer(HTTPServer):

as you can see in /usr/lib/python2.7/wsgiref/simple_server.py

HTTPServer inherits from SocketServer.TCPServer:

class HTTPServer(SocketServer.TCPServer):

as you can see here /usr/lib/python2.7/BaseHTTPServer.py

TCPServer inhertis from BaseServer.

And finally I was not wrong with my intuition. BaseServer is catching the exception raised by this call self.process_request(request, client_address):

    def _handle_request_noblock(self):
        """Handle one request, without blocking.

        I assume that select.select has returned that the socket is
        readable before this function was called, so there should be
        no risk of blocking in get_request().
        """
        try:
            request, client_address = self.get_request()
        except socket.error:
            return
        if self.verify_request(request, client_address):
            try:
                self.process_request(request, client_address)
            except:
                self.handle_error(request, client_address)
                self.shutdown_request(request)
        else:
            self.shutdown_request(request)

In that place, the KeyboardInterrupt exception is cached.

defnull commented 6 years ago

No server software should catch and ignore KeyboardInterrupt in the main thread. The exception represents a SIGINT signal (Ctrl+c) and catching it without shutting down the server would prevent a user from closing the server from command-line, also. Are you sure that you really cannot close a wsgiref-server with Ctrl+c? That's really strange.

defnull commented 6 years ago

Hmm, it catches KeyboardInterrupt while handling a request, but between requests it works. This is why this bug is not easily reproduce-able. I'd say it should count as a bug in BaseServer, but getting that fixed in 2.7 is very unlikely.

Starting the server in a thread different than main has other problems, unfortunately. For example, killing it by setting it to deamon and exiting the main-thread would ignore any exception handlers or cleanup code in the application itself :/

python2and3developer commented 6 years ago

KeyboardInterrupt is also captured in WSGIServer of python 3.6 when a requests is processed. This is the code for the same method _handle_request_noblock(self) of the class BaseServer in socketserver.pymodule, which is inherited by WSGIServer :

    def _handle_request_noblock(self):
        """Handle one request, without blocking.

        I assume that selector.select() has returned that the socket is
        readable before this function was called, so there should be no risk of
        blocking in get_request().
        """
        try:
            request, client_address = self.get_request()
        except OSError:
            return
        if self.verify_request(request, client_address):
            try:
                self.process_request(request, client_address)
            except Exception:
                self.handle_error(request, client_address)
                self.shutdown_request(request)
            except:
                self.shutdown_request(request)
                raise
        else:
            self.shutdown_request(request)

What do you think about this solution?

For example, a possible workaround for WSGIRefServer in bottle.py is adding these lines:

        if server_cls is WSGIServer:
            def handle_error(self, *args, **kwargs):
                if sys.exc_info()[0] ==  KeyboardInterrupt:
                    raise
                else:
                    return super(WSGIServer, self).handle_error(*args, **kwargs)

            server_cls.handle_error = handle_error

This is how the class WSGIRefServer looks after the workaround:

class WSGIRefServer(ServerAdapter):
    def run(self, app):  # pragma: no cover
        from wsgiref.simple_server import make_server
        from wsgiref.simple_server import WSGIRequestHandler, WSGIServer
        import socket

        class FixedHandler(WSGIRequestHandler):
            def address_string(self):  # Prevent reverse DNS lookups please.
                return self.client_address[0]

            def log_request(*args, **kw):
                if not self.quiet:
                    return WSGIRequestHandler.log_request(*args, **kw)

        handler_cls = self.options.get('handler_class', FixedHandler)
        server_cls = self.options.get('server_class', WSGIServer)
        if server_cls is WSGIServer:
            def handle_error(self, *args, **kwargs):
                if sys.exc_info()[0] ==  KeyboardInterrupt:
                    raise
                else:
                    return super(WSGIServer, self).handle_error(*args, **kwargs)

            server_cls.handle_error = handle_error

        if ':' in self.host:  # Fix wsgiref for IPv6 addresses.
            if getattr(server_cls, 'address_family') == socket.AF_INET:

                class server_cls(server_cls):
                    address_family = socket.AF_INET6

        self.srv = make_server(self.host, self.port, app, server_cls,
                               handler_cls)
        self.port = self.srv.server_port  # update port actual port (0 means random)
        try:
            self.srv.serve_forever()
        except KeyboardInterrupt:
            self.srv.server_close()  # Prevent ResourceWarning: unclosed socket
            raise
defnull commented 4 years ago

I really hesitate to merge a workaround (#1098) for an obvious bug in a different library. The core problem is that wsgiref catches and ignores KeyboardInterrupt in the main thread. Did you file a bug against python? What did they say?