benoitc / gunicorn

gunicorn 'Green Unicorn' is a WSGI HTTP Server for UNIX, fast clients and sleepy applications.
http://www.gunicorn.org

RuntimeError due to a warning being logged on SIGHUP #2564

Closed belawaeckerlig closed 1 year ago

belawaeckerlig commented 3 years ago

Hi,

gunicorn can crash when a warning is logged while it is handling a SIGHUP. I think this was introduced by this commit: b695b497b9b8e7351808848ab9e342219601742a

Here is the traceback of the error:

Traceback (most recent call last):
  File "/usr/lib/python3.9/logging/__init__.py", line 1082, in emit
    stream.write(msg + self.terminator)
RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/logging/__init__.py", line 1082, in emit
    stream.write(msg + self.terminator)
  File "/usr/local/lib/python3.9/dist-packages/gunicorn/arbiter.py", line 242, in handle_chld
    self.reap_workers()
  File "/usr/local/lib/python3.9/dist-packages/gunicorn/arbiter.py", line 530, in reap_workers
    self.log.warning(
  File "/usr/local/lib/python3.9/dist-packages/gunicorn/glogging.py", line 261, in warning
    self.error_log.warning(msg, *args, **kwargs)
  File "/usr/lib/python3.9/logging/__init__.py", line 1454, in warning
    self._log(WARNING, msg, args, **kwargs)
  File "/usr/lib/python3.9/logging/__init__.py", line 1585, in _log
    self.handle(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 1595, in handle
    self.callHandlers(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 1657, in callHandlers
    hdlr.handle(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 948, in handle
    self.emit(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 1087, in emit
    self.handleError(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 1000, in handleError
    sys.stderr.write('--- Logging error ---\n')
RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.9/logging/__init__.py", line 1082, in emit
    stream.write(msg + self.terminator)
RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/gunicorn/arbiter.py", line 209, in run
    self.sleep()
  File "/usr/local/lib/python3.9/dist-packages/gunicorn/arbiter.py", line 357, in sleep
    ready = select.select([self.PIPE[0]], [], [], 1.0)
  File "/usr/local/lib/python3.9/dist-packages/gunicorn/arbiter.py", line 242, in handle_chld
    self.reap_workers()
  File "/usr/local/lib/python3.9/dist-packages/gunicorn/arbiter.py", line 530, in reap_workers
    self.log.warning(
  File "/usr/local/lib/python3.9/dist-packages/gunicorn/glogging.py", line 261, in warning
    self.error_log.warning(msg, *args, **kwargs)
  File "/usr/lib/python3.9/logging/__init__.py", line 1454, in warning
    self._log(WARNING, msg, args, **kwargs)
  File "/usr/lib/python3.9/logging/__init__.py", line 1585, in _log
    self.handle(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 1595, in handle
    self.callHandlers(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 1657, in callHandlers
    hdlr.handle(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 948, in handle
    self.emit(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 1087, in emit
    self.handleError(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 1000, in handleError
    sys.stderr.write('--- Logging error ---\n')
  File "/usr/local/lib/python3.9/dist-packages/gunicorn/arbiter.py", line 242, in handle_chld
    self.reap_workers()
  File "/usr/local/lib/python3.9/dist-packages/gunicorn/arbiter.py", line 530, in reap_workers
    self.log.warning(
  File "/usr/local/lib/python3.9/dist-packages/gunicorn/glogging.py", line 261, in warning
    self.error_log.warning(msg, *args, **kwargs)
  File "/usr/lib/python3.9/logging/__init__.py", line 1454, in warning
    self._log(WARNING, msg, args, **kwargs)
  File "/usr/lib/python3.9/logging/__init__.py", line 1585, in _log
    self.handle(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 1595, in handle
    self.callHandlers(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 1657, in callHandlers
    hdlr.handle(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 948, in handle
    self.emit(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 1087, in emit
    self.handleError(record)
  File "/usr/lib/python3.9/logging/__init__.py", line 1000, in handleError
    sys.stderr.write('--- Logging error ---\n')
RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'>

This does not happen every time.

derekbekoe commented 3 years ago

I am seeing the same thing with gunicorn==20.1.0 and Python 3.8.2.

Traceback:

--- Logging error ---
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/logging/__init__.py", line 1085, in emit
    self.flush()
  File "/usr/local/lib/python3.8/logging/__init__.py", line 1065, in flush
    self.stream.flush()
RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'>
Call stack:
  File "/opt/.venv/bin/gunicorn", line 11, in <module>
    load_entry_point('gunicorn==20.1.0', 'console_scripts', 'gunicorn')()
  File "/opt/.venv/lib/python3.8/site-packages/gunicorn/app/wsgiapp.py", line 67, in run
    WSGIApplication("%(prog)s [OPTIONS] [APP_MODULE]").run()
  File "/opt/.venv/lib/python3.8/site-packages/gunicorn/app/base.py", line 231, in run
    super().run()
  File "/opt/.venv/lib/python3.8/site-packages/gunicorn/app/base.py", line 72, in run
    Arbiter(self).run()
  File "/opt/.venv/lib/python3.8/site-packages/gunicorn/arbiter.py", line 202, in run
    self.manage_workers()
  File "/opt/.venv/lib/python3.8/site-packages/gunicorn/arbiter.py", line 551, in manage_workers
    self.spawn_workers()
  File "/opt/.venv/lib/python3.8/site-packages/gunicorn/arbiter.py", line 622, in spawn_workers
    self.spawn_worker()
  File "/opt/.venv/lib/python3.8/site-packages/gunicorn/arbiter.py", line 589, in spawn_worker
    worker.init_process()
  File "/opt/.venv/lib/python3.8/site-packages/uvicorn/workers.py", line 61, in init_process
    super(UvicornWorker, self).init_process()
  File "/opt/.venv/lib/python3.8/site-packages/gunicorn/workers/base.py", line 142, in init_process
    self.run()
  File "/opt/.venv/lib/python3.8/site-packages/uvicorn/workers.py", line 70, in run
    loop.run_until_complete(server.serve(sockets=self.sockets))
  File "/opt/.venv/lib/python3.8/site-packages/uvicorn/protocols/http/httptools_impl.py", line 390, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/opt/.venv/lib/python3.8/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
    return await self.app(scope, receive, send)
  File "/opt/.venv/lib/python3.8/site-packages/fastapi/applications.py", line 146, in __call__
    await super().__call__(scope, receive, send)
  File "/opt/.venv/lib/python3.8/site-packages/starlette/applications.py", line 102, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/opt/.venv/lib/python3.8/site-packages/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/opt/.venv/lib/python3.8/site-packages/starlette/middleware/base.py", line 25, in __call__
    response = await self.dispatch_func(request, self.call_next)
  File "/opt/main.py", line 96, in _app_request
    log.msg(
  File "/opt/.venv/lib/python3.8/site-packages/structlog/_log_levels.py", line 118, in meth
    return self._proxy_to_logger(name, event, **kw)
  File "/opt/.venv/lib/python3.8/site-packages/structlog/_base.py", line 199, in _proxy_to_logger
    return getattr(self._logger, method_name)(*args, **kw)
  File "/usr/local/lib/python3.8/logging/__init__.py", line 1434, in info
    self._log(INFO, msg, args, **kwargs)
  File "/usr/local/lib/python3.8/logging/__init__.py", line 1577, in _log
    self.handle(record)
  File "/usr/local/lib/python3.8/logging/__init__.py", line 1587, in handle
    self.callHandlers(record)
  File "/usr/local/lib/python3.8/logging/__init__.py", line 1649, in callHandlers
    hdlr.handle(record)
  File "/usr/local/lib/python3.8/logging/__init__.py", line 950, in handle
    self.emit(record)
  File "/usr/local/lib/python3.8/logging/__init__.py", line 1085, in emit
    self.flush()
  File "/usr/local/lib/python3.8/logging/__init__.py", line 1065, in flush
    self.stream.flush()
  File "/opt/.venv/lib/python3.8/site-packages/gunicorn/arbiter.py", line 242, in handle_chld
    self.reap_workers()
  File "/opt/.venv/lib/python3.8/site-packages/gunicorn/arbiter.py", line 530, in reap_workers
    self.log.warning(
  File "/opt/.venv/lib/python3.8/site-packages/gunicorn/glogging.py", line 261, in warning
    self.error_log.warning(msg, *args, **kwargs)
Message: 'Worker with pid %s was terminated due to signal %s'
Arguments: (1378, 9)
onefor1 commented 3 years ago

Same here.

onefor1 commented 3 years ago

The solution for me was to allocate more memory to the pod.

See this issue: https://github.com/benoitc/gunicorn/issues/2215

belawaeckerlig commented 3 years ago

@onefor1 Are you sure you encountered the same issue? The issue you linked seems to be quite different, and I'm not using Kubernetes. It also seems quite strange that one additional log.warning call would cause an OOM.

onefor1 commented 3 years ago

@onefor1 Are you sure you encountered the same issue? The issue you linked seems to be quite different, and I'm not using Kubernetes. It also seems quite strange that one additional log.warning call would cause an OOM.

At least that's how I solved the problem. You could try it.

adam-sikora commented 3 years ago

I'm getting the same error with gunicorn==20.1.0 and Python 3.8.5.

Increasing the memory for the pod did not help in my case.

shakna-israel commented 2 years ago

I am seeing this with near-identical logs for gunicorn==20.1.0 and Python 3.9.2 on Debian 5.10.46-3. It's far too intermittent to predict when it will hit, but it happens roughly once every 72 hours.

BoasKluiving commented 2 years ago

I am seeing the same error after upgrading gunicorn to 20.1.0 (previously we were running 20.0.4). I believe the issue is that the SIGCHLD signal handler logs a message, and that logging call is not reentrant. The error occurs when the SIGCHLD interrupt arrives while something else is already being logged, which is why it only happens sometimes. I have proposed a fix in the following PR: #2667
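
For context, the failure mode can be illustrated outside gunicorn. Python's buffered streams refuse reentrant use, so if a signal handler logs to the same stderr handler that the interrupted code is already writing to, the nested write raises the RuntimeError seen above. Below is a minimal sketch of that race; the SIGALRM timer and the loop are only stand-ins for gunicorn's SIGCHLD handling, and triggering the error is probabilistic (it may take many runs, or not happen at all):

```python
import logging
import signal
import sys

logging.basicConfig(stream=sys.stderr, level=logging.INFO)
log = logging.getLogger("demo")

def handle_chld_like(signum, frame):
    # Like gunicorn's SIGCHLD handler, this logs from inside a signal handler.
    # If the signal lands while the main flow of control is already inside a
    # buffered write to stderr, this nested write raises
    # "RuntimeError: reentrant call inside <_io.BufferedWriter ...>".
    log.warning("Worker with pid %s was terminated due to signal %s", 1234, 9)

signal.signal(signal.SIGALRM, handle_chld_like)
signal.setitimer(signal.ITIMER_REAL, 0.001, 0.001)  # deliver SIGALRM repeatedly

# Keep the process busy writing through the same handler so a SIGALRM may
# eventually land mid-write. The race is probabilistic and is easiest to
# observe with stderr redirected (i.e. fully buffered).
for i in range(200_000):
    log.info("main loop message %d", i)

signal.setitimer(signal.ITIMER_REAL, 0)  # stop the timer
```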

cb109 commented 2 years ago

We are seeing this issue every now and then and would very much like it to get fixed.

It usually starts with a lot of [CRITICAL] WORKER TIMEOUT (pid:64248) and [WARNING] Worker with pid 64177 was terminated due to signal 9 messages; then we see the logging error:

--- Logging error ---
Traceback (most recent call last):
  File "/opt/pyenv/versions/3.7.9/lib/python3.7/logging/__init__.py", line 1029, in emit
    self.flush()
  File "/opt/pyenv/versions/3.7.9/lib/python3.7/logging/__init__.py", line 1009, in flush
    self.stream.flush()
RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'>
Call stack:
  File "/opt/project/.venv/bin/gunicorn", line 8, in <module>
    sys.exit(run())
  File "/opt/project/.venv/lib/python3.7/site-packages/gunicorn/app/wsgiapp.py", line 67, in run
    WSGIApplication("%(prog)s [OPTIONS] [APP_MODULE]").run()
  File "/opt/project/.venv/lib/python3.7/site-packages/gunicorn/app/base.py", line 231, in run
    super().run()
  File "/opt/project/.venv/lib/python3.7/site-packages/gunicorn/app/base.py", line 72, in run
    Arbiter(self).run()
  File "/opt/project/.venv/lib/python3.7/site-packages/gunicorn/arbiter.py", line 211, in run
    self.manage_workers()
  File "/opt/project/.venv/lib/python3.7/site-packages/gunicorn/arbiter.py", line 551, in manage_workers
    self.spawn_workers()
  File "/opt/project/.venv/lib/python3.7/site-packages/gunicorn/arbiter.py", line 623, in spawn_workers
    time.sleep(0.1 * random.random())
  File "/opt/project/.venv/lib/python3.7/site-packages/gunicorn/arbiter.py", line 242, in handle_chld
    self.reap_workers()
  File "/opt/project/.venv/lib/python3.7/site-packages/gunicorn/arbiter.py", line 533, in reap_workers
    os.WTERMSIG(status)
  File "/opt/project/.venv/lib/python3.7/site-packages/gunicorn/glogging.py", line 261, in warning
    self.error_log.warning(msg, *args, **kwargs)
  File "/opt/pyenv/versions/3.7.9/lib/python3.7/logging/__init__.py", line 1390, in warning
    self._log(WARNING, msg, args, **kwargs)
  File "/opt/pyenv/versions/3.7.9/lib/python3.7/logging/__init__.py", line 1514, in _log
    self.handle(record)
  File "/opt/pyenv/versions/3.7.9/lib/python3.7/logging/__init__.py", line 1524, in handle
    self.callHandlers(record)
  File "/opt/pyenv/versions/3.7.9/lib/python3.7/logging/__init__.py", line 1586, in callHandlers
    hdlr.handle(record)
  File "/opt/pyenv/versions/3.7.9/lib/python3.7/logging/__init__.py", line 894, in handle
    self.emit(record)
  File "/opt/pyenv/versions/3.7.9/lib/python3.7/logging/__init__.py", line 1029, in emit
    self.flush()
  File "/opt/pyenv/versions/3.7.9/lib/python3.7/logging/__init__.py", line 1009, in flush
    self.stream.flush()
  File "/opt/project/.venv/lib/python3.7/site-packages/gunicorn/arbiter.py", line 242, in handle_chld
    self.reap_workers()
  File "/opt/project/.venv/lib/python3.7/site-packages/gunicorn/arbiter.py", line 533, in reap_workers
    os.WTERMSIG(status)
  File "/opt/project/.venv/lib/python3.7/site-packages/gunicorn/glogging.py", line 261, in warning
    self.error_log.warning(msg, *args, **kwargs)
Message: 'Worker with pid %s was terminated due to signal %s'
Arguments: (64247, 9)

At some point gunicorn drops out with [INFO] Unhandled exception in main loop. After that the logs are full of RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'> and the service is essentially dead.

We are on Python 3.7.9, gunicorn 20.1.0, using ~50 workers.

benoitc commented 2 years ago

Commit b695b497b9b8e7351808848ab9e342219601742a has been reverted, which should indeed fix the error. More generally speaking, we should find a way to handle the writing of errors so that concurrent writes are not allowed, imo.

Please test master if you can and let me know if you still reproduce the issue. I am also interested in a test case, as I have not been able to reproduce it myself yet.
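
For applications that configure their own logging, one general way to keep a log call made at signal-handler time from touching the stream directly is the stdlib queue-based pattern: handlers only enqueue records, and a background listener thread performs the writes. This is not gunicorn's fix (the fix above is the revert); it is just a hedged sketch of the "avoid concurrent writes" idea using logging.handlers.QueueHandler and QueueListener (queue.SimpleQueue.put is documented as reentrant, which is why it is suitable here):

```python
import logging
import logging.handlers
import queue
import sys

# Handlers attached to loggers only enqueue records; SimpleQueue.put is
# documented as reentrant, so it is safe even when called from a signal
# handler that interrupted another log call.
log_queue = queue.SimpleQueue()

stream_handler = logging.StreamHandler(sys.stderr)
stream_handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

# The listener thread is the only place that actually writes to stderr.
listener = logging.handlers.QueueListener(log_queue, stream_handler)
listener.start()

root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(logging.handlers.QueueHandler(log_queue))

root.warning("Worker with pid %s was terminated due to signal %s", 1234, 9)

listener.stop()  # flush pending records and join the listener thread
```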

cb109 commented 2 years ago

@benoitc Thank you for the swift reaction! Let me see if I can reproduce it somehow.

cb109 commented 2 years ago

I fiddled around with it a bit, building on existing tests plus threading, but could not reproduce the problem so far. I think it may be worthwhile to set up a more realistic test environment, where the gunicorn process and its workers run much like in production, without mocking out too many things.

I have installed https://github.com/benoitc/gunicorn/commit/76f8da24cbb992d168e01bda811452bcf3b8f5b3 on our server and will observe whether we see the problem again; I'll set a reminder to report back here in a while 👋

cb109 commented 2 years ago

I cannot find any errors like the above in our logs with the "fixed" version, so I think this has solved the issue :+1:

sppaskal commented 2 years ago

We are still seeing the same issue running our Django web app using gunicorn 20.1.0 via Azure.

[2022-01-26 19:41:47 +0000] [39] [INFO] Starting gunicorn 20.1.0
2022-01-26T19:41:47.624891557Z [2022-01-26 19:41:47 +0000] [39] [INFO] Listening at: http://0.0.0.0:8000 (39)
2022-01-26T19:41:47.624901857Z [2022-01-26 19:41:47 +0000] [39] [INFO] Using worker: sync
2022-01-26T19:41:47.624908358Z [2022-01-26 19:41:47 +0000] [41] [INFO] Booting worker with pid: 41
2022-01-26T19:41:47.624914558Z [2022-01-26 19:41:47 +0000] [42] [INFO] Booting worker with pid: 42
2022-01-26T19:41:47.769436625Z [2022-01-26 19:41:47 +0000] [43] [INFO] Booting worker with pid: 43
2022-01-26T19:41:47.817718684Z [2022-01-26 19:41:47 +0000] [44] [INFO] Booting worker with pid: 44
2022-01-26T19:42:48.036818037Z [2022-01-26 19:42:47 +0000] [39] [CRITICAL] WORKER TIMEOUT (pid:41)
2022-01-26T19:42:48.151129815Z [2022-01-26 19:42:48 +0000] [39] [CRITICAL] WORKER TIMEOUT (pid:42)
2022-01-26T19:42:48.151162116Z [2022-01-26 19:42:48 +0000] [39] [CRITICAL] WORKER TIMEOUT (pid:43)
2022-01-26T19:42:48.151168317Z [2022-01-26 19:42:48 +0000] [39] [CRITICAL] WORKER TIMEOUT (pid:44)
2022-01-26T19:42:48.151173617Z [2022-01-26 19:42:48 +0000] [44] [INFO] Worker exiting (pid: 44)
2022-01-26T19:42:48.151178717Z [2022-01-26 19:42:48 +0000] [41] [INFO] Worker exiting (pid: 41)
2022-01-26T19:42:48.236634567Z [2022-01-26 19:42:48 +0000] [43] [INFO] Worker exiting (pid: 43)
2022-01-26T19:42:48.238949941Z [2022-01-26 19:42:48 +0000] [42] [INFO] Worker exiting (pid: 42)
2022-01-26T19:42:49.121452040Z [2022-01-26 19:42:49 +0000] [39] [WARNING] Worker with pid 42 was terminated due to signal 9
2022-01-26T19:42:49.759609777Z --- Logging error ---
2022-01-26T19:42:49.759642078Z Traceback (most recent call last):
2022-01-26T19:42:49.759648778Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1085, in emit
2022-01-26T19:42:49.759654078Z     self.flush()
2022-01-26T19:42:49.759658678Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1065, in flush
2022-01-26T19:42:49.759663579Z     self.stream.flush()
2022-01-26T19:42:49.759668079Z RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'>
2022-01-26T19:42:49.759673779Z
2022-01-26T19:42:49.759678579Z During handling of the above exception, another exception occurred:
2022-01-26T19:42:49.759683179Z
2022-01-26T19:42:49.759687779Z Traceback (most recent call last):
2022-01-26T19:42:49.759692279Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1085, in emit
2022-01-26T19:42:49.759697080Z     self.flush()
2022-01-26T19:42:49.759701680Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1065, in flush
2022-01-26T19:42:49.759706480Z     self.stream.flush()
2022-01-26T19:42:49.759710880Z RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'>
2022-01-26T19:42:49.759715780Z Call stack:
2022-01-26T19:42:49.759720180Z   File "/opt/python/3/bin/gunicorn", line 8, in <module>
2022-01-26T19:42:49.759725381Z     sys.exit(run())
2022-01-26T19:42:49.759730081Z   File "/opt/python/3.8.6/lib/python3.8/site-packages/gunicorn/app/wsgiapp.py", line 67, in run
2022-01-26T19:42:49.759734981Z     WSGIApplication("%(prog)s [OPTIONS] [APP_MODULE]").run()
2022-01-26T19:42:49.759739681Z   File "/opt/python/3.8.6/lib/python3.8/site-packages/gunicorn/app/base.py", line 231, in run
2022-01-26T19:42:49.759744581Z     super().run()
2022-01-26T19:42:49.759761682Z   File "/opt/python/3.8.6/lib/python3.8/site-packages/gunicorn/app/base.py", line 72, in run
2022-01-26T19:42:49.759766382Z     Arbiter(self).run()
2022-01-26T19:42:49.759770482Z   File "/opt/python/3.8.6/lib/python3.8/site-packages/gunicorn/arbiter.py", line 209, in run
2022-01-26T19:42:49.759774882Z     self.sleep()
2022-01-26T19:42:49.759778782Z   File "/opt/python/3.8.6/lib/python3.8/site-packages/gunicorn/arbiter.py", line 357, in sleep
2022-01-26T19:42:49.759783182Z     ready = select.select([self.PIPE[0]], [], [], 1.0)
2022-01-26T19:42:49.759787282Z   File "/opt/python/3.8.6/lib/python3.8/site-packages/gunicorn/arbiter.py", line 242, in handle_chld
2022-01-26T19:42:49.759791583Z     self.reap_workers()
2022-01-26T19:42:49.759795583Z   File "/opt/python/3.8.6/lib/python3.8/site-packages/gunicorn/arbiter.py", line 530, in reap_workers
2022-01-26T19:42:49.759799983Z     self.log.warning(
2022-01-26T19:42:49.759803983Z   File "/opt/python/3.8.6/lib/python3.8/site-packages/gunicorn/glogging.py", line 261, in warning
2022-01-26T19:42:49.759808383Z     self.error_log.warning(msg, *args, **kwargs)
2022-01-26T19:42:49.759812383Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1446, in warning
2022-01-26T19:42:49.759816683Z     self._log(WARNING, msg, args, **kwargs)
2022-01-26T19:42:49.759821684Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1577, in _log
2022-01-26T19:42:49.759826084Z     self.handle(record)
2022-01-26T19:42:49.759830084Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1587, in handle
2022-01-26T19:42:49.759834384Z     self.callHandlers(record)
2022-01-26T19:42:49.759838384Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1649, in callHandlers
2022-01-26T19:42:49.759842684Z     hdlr.handle(record)
2022-01-26T19:42:49.759846584Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 950, in handle
2022-01-26T19:42:49.759850885Z     self.emit(record)
2022-01-26T19:42:49.759854785Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1085, in emit
2022-01-26T19:42:49.759859085Z     self.flush()
2022-01-26T19:42:49.759862985Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1065, in flush
2022-01-26T19:42:49.759867285Z     self.stream.flush()
2022-01-26T19:42:49.759871285Z   File "/opt/python/3.8.6/lib/python3.8/site-packages/gunicorn/arbiter.py", line 242, in handle_chld
2022-01-26T19:42:49.759875585Z     self.reap_workers()
2022-01-26T19:42:49.759879585Z   File "/opt/python/3.8.6/lib/python3.8/site-packages/gunicorn/arbiter.py", line 530, in reap_workers
2022-01-26T19:42:49.759883986Z     self.log.warning(
2022-01-26T19:42:49.759887986Z   File "/opt/python/3.8.6/lib/python3.8/site-packages/gunicorn/glogging.py", line 261, in warning
2022-01-26T19:42:49.759895986Z     self.error_log.warning(msg, *args, **kwargs)
2022-01-26T19:42:49.759900386Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1446, in warning
2022-01-26T19:42:49.759904586Z     self._log(WARNING, msg, args, **kwargs)
2022-01-26T19:42:49.759908786Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1577, in _log
2022-01-26T19:42:49.759912987Z     self.handle(record)
2022-01-26T19:42:49.759916987Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1587, in handle
2022-01-26T19:42:49.759921287Z     self.callHandlers(record)
2022-01-26T19:42:49.759925287Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1649, in callHandlers
2022-01-26T19:42:49.759929587Z     hdlr.handle(record)
2022-01-26T19:42:49.759933587Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 950, in handle
2022-01-26T19:42:49.759937787Z     self.emit(record)
2022-01-26T19:42:49.759941787Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1089, in emit
2022-01-26T19:42:49.759946088Z     self.handleError(record)
2022-01-26T19:42:49.759950088Z   File "/opt/python/3.8.6/lib/python3.8/logging/__init__.py", line 1003, in handleError
2022-01-26T19:42:49.759954388Z     traceback.print_exception(t, v, tb, None, sys.stderr)
2022-01-26T19:42:49.759958488Z   File "/opt/python/3.8.6/lib/python3.8/traceback.py", line 103, in print_exception
2022-01-26T19:42:49.759962788Z     for line in TracebackException(
2022-01-26T19:42:49.759966788Z   File "/opt/python/3.8.6/lib/python3.8/traceback.py", line 509, in __init__
2022-01-26T19:42:49.759971088Z     self.stack = StackSummary.extract(
2022-01-26T19:42:49.759975489Z   File "/opt/python/3.8.6/lib/python3.8/traceback.py", line 366, in extract
2022-01-26T19:42:49.759979889Z     f.line
2022-01-26T19:42:49.759985189Z   File "/opt/python/3.8.6/lib/python3.8/traceback.py", line 288, in line
2022-01-26T19:42:49.759989589Z     self._line = linecache.getline(self.filename, self.lineno).strip()
2022-01-26T19:42:49.759993789Z   File "/opt/python/3.8.6/lib/python3.8/linecache.py", line 16, in getline
2022-01-26T19:42:49.759998089Z     lines = getlines(filename, module_globals)
2022-01-26T19:42:49.760002089Z   File "/opt/python/3.8.6/lib/python3.8/linecache.py", line 47, in getlines
2022-01-26T19:42:49.760006390Z     return updatecache(filename, module_globals)
2022-01-26T19:42:49.760010490Z   File "/opt/python/3.8.6/lib/python3.8/linecache.py", line 136, in updatecache
2022-01-26T19:42:49.760014790Z     with tokenize.open(fullname) as fp:
2022-01-26T19:42:49.760018890Z   File "/opt/python/3.8.6/lib/python3.8/tokenize.py", line 394, in open
2022-01-26T19:42:49.760023190Z     encoding, lines = detect_encoding(buffer.readline)
2022-01-26T19:42:49.760027290Z   File "/opt/python/3.8.6/lib/python3.8/tokenize.py", line 363, in detect_encoding
2022-01-26T19:42:49.760034790Z     first = read_or_stop()
2022-01-26T19:42:49.760038991Z   File "/opt/python/3.8.6/lib/python3.8/tokenize.py", line 321, in read_or_stop
2022-01-26T19:42:49.760043291Z     return readline()
2022-01-26T19:42:49.760047391Z   File "/opt/python/3.8.6/lib/python3.8/site-packages/gunicorn/arbiter.py", line 242, in handle_chld
2022-01-26T19:42:49.760051791Z     self.reap_workers()
2022-01-26T19:42:49.760055791Z   File "/opt/python/3.8.6/lib/python3.8/site-packages/gunicorn/arbiter.py", line 530, in reap_workers
2022-01-26T19:42:49.760060091Z     self.log.warning(
2022-01-26T19:42:49.760064091Z   File "/opt/python/3.8.6/lib/python3.8/site-packages/gunicorn/glogging.py", line 261, in warning
2022-01-26T19:42:49.760068392Z     self.error_log.warning(msg, *args, **kwargs)
2022-01-26T19:42:49.760072492Z Message: 'Worker with pid %s was terminated due to signal %s'
2022-01-26T19:42:49.760076592Z Arguments: (44, 9)

The traceback goes on for a while after what I've pasted here, with each error being RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'>.

As mentioned, this does not occur every time, but when the app is redeployed or reloaded it tends to happen fairly often. Is the solution discussed above live in 2.1.0?

cb109 commented 2 years ago

@sppaskal No, there has been no release yet (https://github.com/benoitc/gunicorn/tags). You can install directly from master, though, or even pin to the exact commit if you need it quickly (https://stackoverflow.com/a/13754517).

sppaskal commented 2 years ago

@cb109 Alright. Thank you for taking the time!

benoitc commented 2 years ago

The release will land tomorrow.

andreas-p commented 2 years ago

My proposal is to log only if the situation is unexpected. After a WORKER TIMEOUT (worker.aborted=True), a SIGKILL is perfectly expected (and already logged) and needs no further logging. This would probably vastly reduce occurrences of reentrant logger calls, but not completely avoid them. Maybe it's enough to take a threading.Lock in reap_workers() when logging, to catch multiple child signals? Or would that create a deadlock?
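
A rough sketch of what that proposal could look like, as a hypothetical, simplified stand-in for the arbiter's reaping logic (the names approximate gunicorn's arbiter, including the worker.aborted flag mentioned above, but this is not the actual source):

```python
import os

def reap_worker(arbiter, wpid, status):
    """Hypothetical, simplified stand-in for the arbiter's reaping logic to
    illustrate the proposal above -- not the actual gunicorn source."""
    worker = arbiter.WORKERS.pop(wpid, None)
    if worker is None:
        return
    if os.WIFSIGNALED(status):
        if getattr(worker, "aborted", False):
            # After a WORKER TIMEOUT the arbiter sets worker.aborted and the
            # following SIGKILL is expected (and already logged as CRITICAL),
            # so skip the warning and avoid another write from inside the
            # SIGCHLD handler.
            return
        arbiter.log.warning(
            "Worker with pid %s was terminated due to signal %s",
            wpid, os.WTERMSIG(status),
        )
```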

TiradoMartin commented 1 year ago

In my case I was able to resolve this problem by increasing the RAM on my DigitalOcean VPS.

reallistic commented 1 year ago

@benoitc any update on when the release will land?

GalBr commented 1 year ago

@benoitc I would love to get an update for when we can expect a new release with the fix for this issue. Thanks!

a-wiseguy commented 1 year ago

For anyone recently dealing with this issue as I did (Python 3.9, Azure App Service):

The issue most of us are having is that the gunicorn workers time out during startup, usually on slower cloud containers, as I experienced on Azure App Service. This is also why it only happens occasionally.

Check your defaults and ensure that your gunicorn timeout is long enough for the workers to start.

In my case (FastAPI/uvicorn), increasing the timeout to 600 seconds using a custom App Service startup script fixed it:

gunicorn -w 4 -k uvicorn.workers.UvicornWorker main:app --timeout 600

This doesn't solve the reentrant logging bug, but for anyone here waiting for a fix, that bug is probably secondary to the real problem anyway.
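
For reference, the same timeout can also be set from a gunicorn config file instead of the command line; a minimal sketch (the worker count and worker class simply mirror the command above):

```python
# gunicorn.conf.py -- roughly equivalent to the command-line flags above.
# Load it with: gunicorn -c gunicorn.conf.py main:app
workers = 4
worker_class = "uvicorn.workers.UvicornWorker"
timeout = 600  # seconds a worker may stay silent before the arbiter kills and restarts it
```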

bengabp commented 7 months ago

Still getting the same error.


karpikpl commented 5 months ago

Same error here, also on Azure App Service, with gunicorn 20.1.0:

2024-02-16T10:58:29.794815796Z    _____                              
2024-02-16T10:58:29.794847597Z   /  _  \ __________ _________   ____  
2024-02-16T10:58:29.794852497Z  /  /_\  \\___   /  |  \_  __ \_/ __ \
2024-02-16T10:58:29.794855697Z /    |    \/    /|  |  /|  | \/\  ___/
2024-02-16T10:58:29.794858797Z \____|__  /_____ \____/ |__|    \___  >
2024-02-16T10:58:29.794861997Z         \/      \/                  \/
2024-02-16T10:58:29.794864997Z A P P   S E R V I C E   O N   L I N U X
2024-02-16T10:58:29.794867897Z
2024-02-16T10:58:29.794870598Z Documentation: http://aka.ms/webapp-linux
2024-02-16T10:58:29.794873498Z Python 3.9.17
2024-02-16T10:58:29.794876398Z Note: Any data outside '/home' is not persisted
2024-02-16T10:58:30.479810451Z Starting OpenBSD Secure Shell server: sshd.
2024-02-16T10:58:30.551552902Z Site's appCommandLine: gunicorn -w 4 -k uvicorn.workers.UvicornWorker -b 0.0.0.0:8081 app.main:app
2024-02-16T10:58:30.552242816Z Launching oryx with: create-script -appPath /home/site/wwwroot -output /opt/startup/startup.sh -virtualEnvName antenv -defaultApp /opt/defaultsite -userStartupCommand 'gunicorn -w 4 -k uvicorn.workers.UvicornWorker -b 0.0.0.0:8081 app.main:app'
2024-02-16T10:58:30.612826042Z Could not find build manifest file at '/home/site/wwwroot/oryx-manifest.toml'
2024-02-16T10:58:30.613268451Z Could not find operation ID in manifest. Generating an operation id...
2024-02-16T10:58:30.615380593Z Build Operation ID: 11cb7e79-3d83-4a87-ba73-eac5761155c5
2024-02-16T10:58:30.938113521Z Oryx Version: 0.2.20230707.1, Commit: 0bd28e69919b5e8beba451e8677e3345f0be8361, ReleaseTagName: 20230707.1
2024-02-16T10:58:30.942658913Z Writing output script to '/opt/startup/startup.sh'
2024-02-16T10:58:31.050100286Z Using packages from virtual environment antenv located at /home/site/wwwroot/antenv.
2024-02-16T10:58:31.050577996Z Updated PYTHONPATH to '/opt/startup/app_logs:/home/site/wwwroot/antenv/lib/python3.9/site-packages'
2024-02-16T10:58:33.569735869Z [2024-02-16 10:58:33 +0000] [66] [INFO] Starting gunicorn 20.1.0
2024-02-16T10:58:33.570650488Z [2024-02-16 10:58:33 +0000] [66] [INFO] Listening at: http://0.0.0.0:8081/ (66)
2024-02-16T10:58:33.571028395Z [2024-02-16 10:58:33 +0000] [66] [INFO] Using worker: uvicorn.workers.UvicornWorker
2024-02-16T10:58:33.575806492Z [2024-02-16 10:58:33 +0000] [70] [INFO] Booting worker with pid: 70
2024-02-16T10:58:33.638912766Z [2024-02-16 10:58:33 +0000] [71] [INFO] Booting worker with pid: 71
2024-02-16T10:58:33.690871815Z [2024-02-16 10:58:33 +0000] [72] [INFO] Booting worker with pid: 72
2024-02-16T10:58:33.755318216Z [2024-02-16 10:58:33 +0000] [73] [INFO] Booting worker with pid: 73
2024-02-16T10:59:03.883512027Z [2024-02-16 10:59:03 +0000] [66] [CRITICAL] WORKER TIMEOUT (pid:70)
2024-02-16T10:59:03.883546828Z [2024-02-16 10:59:03 +0000] [66] [CRITICAL] WORKER TIMEOUT (pid:71)
2024-02-16T10:59:03.883552628Z [2024-02-16 10:59:03 +0000] [66] [CRITICAL] WORKER TIMEOUT (pid:72)
2024-02-16T10:59:03.883565728Z [2024-02-16 10:59:03 +0000] [66] [CRITICAL] WORKER TIMEOUT (pid:73)
2024-02-16T10:59:04.933766403Z [2024-02-16 10:59:04 +0000] [66] [WARNING] Worker with pid 71 was terminated due to signal 9
2024-02-16T10:59:04.937299075Z --- Logging error ---
2024-02-16T10:59:04.945800347Z [2024-02-16 10:59:04 +0000] [66] [INFO] Unhandled exception in main loop
2024-02-16T10:59:04.945820947Z Traceback (most recent call last):
2024-02-16T10:59:04.945826447Z   File "/opt/python/3.9.17/lib/python3.9/logging/__init__.py", line 1086, in emit
2024-02-16T10:59:04.945830547Z     stream.write(msg + self.terminator)
2024-02-16T10:59:04.946037352Z RuntimeError: reentrant call inside <_io.BufferedWriter name='<stderr>'>
Yangeok commented 4 months ago

Is the cause that the machine exceeded its memory allocation?