mopemope / meinheld

Meinheld is a high performance asynchronous WSGI Web Server (based on picoev)
http://meinheld.org

Meinheld coredump #85

Open bulletmark opened 7 years ago

bulletmark commented 7 years ago

My Python 3 app core dumped yesterday on a Raspberry Pi 2 running Arch Arm. Versions and coredump details are below.

pi2:~ uname -a
Linux pi2 4.9.43-1-ARCH #1 SMP Fri Aug 18 01:10:29 UTC 2017 armv7l GNU/Linux

pi2:~ python --version
Python 3.6.2

pi2:~ ~/Data/src/pialarm/env/bin/pip list --format=legacy
Beaker (1.9.0)
bottle (0.12.13)
bottle-cork (0.12.0)
greenlet (0.4.12)
meinheld (0.6.1)
pifaceio (1.26)
pip (9.0.1)
pycrypto (2.6.1)
ruamel.yaml (0.15.32)
setuptools (28.8.0)

pi2:~ coredumpctl gdb
           PID: 266 (pialarm)
           UID: 1000 (pi)
           GID: 1000 (pi)
        Signal: 11 (SEGV)
     Timestamp: Fri 2017-08-25 07:56:07 AEST (24h ago)
  Command Line: env/bin/python -u /home/pi/Data/src/pialarm/pialarm
    Executable: /usr/bin/python3.6
 Control Group: /system.slice/pialarm.service
          Unit: pialarm.service
         Slice: system.slice
       Boot ID: c1da2280d845469d8f3d91fe7252c78d
    Machine ID: d248d73f28ea419a864101496b8fce31
      Hostname: pi2
       Storage: /var/lib/systemd/coredump/core.pialarm.1000.c1da2280d845469d8f3d91fe7252c78d.266.1503611767000000.lz4
       Message: Process 266 (pialarm) of user 1000 dumped core.

                Stack trace of thread 266:
                #0  0x0000000074864d84 n/a (/home/pi/Data/src/pialarm/env/lib/python3.6/site-packages/meinheld/server.cpython-36m-arm-linux-gnueabihf.so)

GNU gdb (GDB) 8.0
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "armv7l-unknown-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/python3.6...(no debugging symbols found)...done.

warning: core file may not match specified executable file.
[New LWP 266]
[New LWP 350]
[New LWP 352]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `env/bin/python -u /home/pi/Data/src/pialarm/pialarm'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  import_greenlet () at meinheld/server/greensupport.c:15
15  meinheld/server/greensupport.c: No such file or directory.
[Current thread is 1 (Thread 0x76f4d010 (LWP 266))]
(gdb) where
#0  import_greenlet () at meinheld/server/greensupport.c:15
#1  greenlet_new (o=0x748d10d0, parent=0x0)
    at meinheld/server/greensupport.c:31
#2  0x7485875c in call_wsgi_handler (client=<optimized out>)
    at meinheld/server/server.c:716
#3  0x7485b61c in accept_callback (loop=0x181bf00, fd=4, 
    events=<optimized out>, cb_arg=<optimized out>)
    at meinheld/server/server.c:1272
#4  0x74865b30 in picoev_poll_once_internal (_loop=_loop@entry=0x181bf00, 
    max_wait=<optimized out>) at meinheld/server/picoev_epoll.c:174
#5  0x7485c500 in picoev_loop_once (max_wait=<optimized out>, loop=0x181bf00)
    at meinheld/server/picoev.h:390
#6  meinheld_run_loop (self=<optimized out>, args=<optimized out>, 
    kwds=<optimized out>) at meinheld/server/server.c:1853
#7  0x76cbdb18 in _PyCFunction_FastCallDict ()
   from /usr/lib/libpython3.6m.so.1.0
#8  0x76d881e4 in ?? () from /usr/lib/libpython3.6m.so.1.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) 

The crash occurred after updating the system and rebooting, but unfortunately I cannot reproduce it, so it seems to be a one-off at the moment. I am raising this issue so that it is known. This app has been running for many years, although I frequently update system and PyPI packages.

bulletmark commented 4 years ago

I just came to report a bug and found that one already exists, created 3 years ago by me(!).

I have had 3 more of these Python core dumps over the last couple of months. All occurred on a different Raspberry Pi, with a completely different Python app and much later versions, but it is fundamentally the same problem. Here are the details:

pi2b: $ uname -a
Linux pi2b 5.4.42-1-ARCH #1 SMP PREEMPT Tue May 26 01:48:52 UTC 2020 armv7l GNU/Linux

pi2b: $ python --version
Python 3.8.3

pi2b: $ venv/bin/pip list
Package          Version
---------------- -------
bottle           0.12.18
greenlet         0.4.15
meinheld         1.0.1
pip              20.1
RPi.GPIO         0.7.0
ruamel.yaml      0.16.10
ruamel.yaml.clib 0.2.0
setuptools       41.2.0
timesched        1.5
wccontrol        1.13
wheel            0.34.2

pi2b: $ gdb ./venv/bin/python ~/core.dump
GNU gdb (GDB) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "armv7l-unknown-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./venv/bin/python...
(No debugging symbols found in ./venv/bin/python)
[New LWP 330]
[New LWP 340]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `venv/bin/python -u /home/pi/Data/src/wcscheduler/wcscheduler'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x756c5138 in import_greenlet () at meinheld/server/greensupport.c:14
14  meinheld/server/greensupport.c: No such file or directory.
[Current thread is 1 (Thread 0x76f57010 (LWP 330))]
(gdb) where
#0  0x756c5138 in import_greenlet () at meinheld/server/greensupport.c:14
#1  import_greenlet () at meinheld/server/greensupport.c:11
#2  greenlet_new (o=0x74fae460, parent=0x0) at meinheld/server/greensupport.c:31
#3  0x756b7640 in call_wsgi_handler (client=<optimized out>) at meinheld/server/server.c:717
#4  0x756bc1b8 in accept_callback (loop=0x135b950, fd=4, events=<optimized out>, cb_arg=<optimized out>)
    at meinheld/server/server.c:1274
#5  0x756c5ff8 in picoev_poll_once_internal (_loop=_loop@entry=0x135b950, max_wait=<optimized out>)
    at meinheld/server/picoev_epoll.c:173
#6  0x756bb30c in picoev_loop_once (max_wait=<optimized out>, loop=0x135b950)
    at meinheld/server/picoev.h:390
#7  meinheld_run_loop (self=<optimized out>, args=<optimized out>, kwds=<optimized out>)
    at meinheld/server/server.c:1857
#8  0x76c939c8 in ?? () from /usr/lib/libpython3.8.so.1.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

bulletmark commented 4 years ago

Looking back at my git history, I noticed that in Sep 2017 I changed my pialarm app (i.e. the app that core dumped in the first post above) from meinheld to bjoern after experiencing another core dump, and it has been fine ever since. So it seems I cannot use meinheld reliably on any ARM Raspberry Pi due to this bug. Note that I am using meinheld without issue in other apps on other platforms, e.g. stock x86_64.
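For anyone wanting to apply the same workaround in a Bottle app, the per-platform choice described above could be sketched roughly as follows. This is an illustration under assumptions, not the actual pialarm code: the helper names `pick_server` and `serve` are hypothetical, and treating `aarch64` like the `armv7l` boards in this report is a guess. `"bjoern"` and `"meinheld"` are server adapter names that Bottle's `run()` accepts.

```python
import platform


def pick_server(machine=None):
    """Return a Bottle server adapter name for the given CPU architecture.

    ARM machines get bjoern (the crashes above were on armv7l);
    including aarch64 here is an assumption. Everything else keeps meinheld.
    """
    machine = (machine or platform.machine()).lower()
    if machine.startswith(("arm", "aarch64")):
        return "bjoern"
    return "meinheld"


def serve(app, host="127.0.0.1", port=8080):
    """Run a Bottle app with the server chosen for this machine."""
    import bottle  # third-party; imported lazily so pick_server() works without it

    bottle.run(app, server=pick_server(), host=host, port=port)
```

Keeping the decision in one small function means the server can be swapped again in one place if the other backend also proves unreliable.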

bulletmark commented 2 years ago

As I say above, I have seen occasional core dumps from meinheld on ARM processors, but never one on x86_64, until today! I just got the following core dump, which seems similar to the ones above:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Core was generated by `venv/bin/python -u /home/mark/Data/src/pvoproxy/pvoproxy'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fc0899b97f8 in import_greenlet () at meinheld/server/greensupport.c:13
13  meinheld/server/greensupport.c: No such file or directory.
[Current thread is 1 (Thread 0x7fc08a17f740 (LWP 198))]
(gdb) where
#0  0x00007fc0899b97f8 in import_greenlet () at meinheld/server/greensupport.c:13
#1  import_greenlet () at meinheld/server/greensupport.c:10
#2  greenlet_new (o=0x7fc088cc9b30, parent=0x0) at meinheld/server/greensupport.c:25
#3  0x00007fc0899c24ef in call_wsgi_handler (client=<optimized out>) at meinheld/server/server.c:679
#4  0x00007fc0899c75ea in accept_callback (loop=0x55bbb8078540, fd=22, events=<optimized out>, cb_arg=<optimized out>)
    at meinheld/server/server.c:1205
#5  0x00007fc0899bf89f in picoev_poll_once_internal (_loop=_loop@entry=0x55bbb8078540, max_wait=<optimized out>)
    at meinheld/server/picoev_epoll.c:172
#6  0x00007fc0899c6838 in picoev_loop_once (max_wait=<optimized out>, loop=0x55bbb8078540) at meinheld/server/picoev.h:387
#7  meinheld_run_loop (self=<optimized out>, args=<optimized out>, kwds=<optimized out>) at meinheld/server/server.c:1739
#8  0x00007fc08a609ede in ?? () from /usr/lib/libpython3.9.so.1.0
#9  0x00007fc08a5f2333 in _PyObject_MakeTpCall () from /usr/lib/libpython3.9.so.1.0
#10 0x00007fc08a5ee218 in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.9.so.1.0
#11 0x00007fc08a5f996b in _PyFunction_Vectorcall () from /usr/lib/libpython3.9.so.1.0
#12 0x00007fc08a5e958e in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.9.so.1.0
#13 0x00007fc08a5e7fd9 in ?? () from /usr/lib/libpython3.9.so.1.0
#14 0x00007fc08a5f9b8e in _PyFunction_Vectorcall () from /usr/lib/libpython3.9.so.1.0
#15 0x00007fc08a608f19 in PyObject_Call () from /usr/lib/libpython3.9.so.1.0
#16 0x00007fc08a5ebec9 in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.9.so.1.0
#17 0x00007fc08a5e7fd9 in ?? () from /usr/lib/libpython3.9.so.1.0
#18 0x00007fc08a5f9b8e in _PyFunction_Vectorcall () from /usr/lib/libpython3.9.so.1.0
#19 0x00007fc08a6088a4 in ?? () from /usr/lib/libpython3.9.so.1.0
#20 0x00007fc08a5ea18b in _PyEval_EvalFrameDefault () from /usr/lib/libpython3.9.so.1.0
#21 0x00007fc08a5e7fd9 in ?? () from /usr/lib/libpython3.9.so.1.0
#22 0x00007fc08a5e7c41 in _PyEval_EvalCodeWithName () from /usr/lib/libpython3.9.so.1.0
#23 0x00007fc08a69eae3 in PyEval_EvalCode () from /usr/lib/libpython3.9.so.1.0
#24 0x00007fc08a6ae9f4 in ?? () from /usr/lib/libpython3.9.so.1.0
#25 0x00007fc08a6aa6cb in ?? () from /usr/lib/libpython3.9.so.1.0
#26 0x00007fc08a5582d3 in ?? () from /usr/lib/libpython3.9.so.1.0
#27 0x00007fc08a557761 in PyRun_SimpleFileExFlags () from /usr/lib/libpython3.9.so.1.0
#28 0x00007fc08a6c04a2 in Py_RunMain () from /usr/lib/libpython3.9.so.1.0
#29 0x00007fc08a691009 in Py_BytesMain () from /usr/lib/libpython3.9.so.1.0
#30 0x00007fc08a31cb25 in __libc_start_main () from /usr/lib/libc.so.6
#31 0x000055bbb75db04e in _start ()

This is obviously an extremely rare issue, as I have been running meinheld for many of my 24x7 apps on regular x86_64 machines and virtual hosts for many years and have never seen it before. I stopped using meinheld for my apps on ARM (e.g. RPi) after my last comment above, but this new crash dents my confidence in meinheld overall. :(

bulletmark commented 2 years ago

As I say above, I gave up on meinheld on ARM platforms long ago because of this issue, but I have continued to use it on x86_64 machines. However, since Python 3.10, meinheld crashes frequently:

$ coredumpctl
TIME                          PID  UID  GID SIG     COREFILE EXE                 SIZE
Fri 2021-12-17 22:56:08 AEST  219 1000 1000 SIGSEGV missing  /usr/bin/python3.10  n/a
Wed 2021-12-22 04:06:41 AEST  219 1000 1000 SIGSEGV missing  /usr/bin/python3.10  n/a
Sat 2021-12-25 09:54:31 AEST  220 1000 1000 SIGSEGV missing  /usr/bin/python3.10  n/a
Sat 2021-12-25 13:44:01 AEST 2187 1000 1000 SIGSEGV missing  /usr/bin/python3.10  n/a
Sat 2021-12-25 22:44:10 AEST 2472 1000 1000 SIGSEGV missing  /usr/bin/python3.10  n/a
Sun 2021-12-26 03:52:14 AEST 3065 1000 1000 SIGSEGV missing  /usr/bin/python3.10  n/a
Sun 2021-12-26 19:56:21 AEST 3604 1000 1000 SIGSEGV missing  /usr/bin/python3.10  n/a
Mon 2021-12-27 16:29:39 AEST  219 1000 1000 SIGSEGV missing  /usr/bin/python3.10  n/a
Wed 2021-12-29 10:59:45 AEST  228 1000 1000 SIGSEGV present  /usr/bin/python3.10 3.6M
Wed 2021-12-29 12:50:31 AEST  626 1000 1000 SIGSEGV present  /usr/bin/python3.10 3.6M
Thu 2021-12-30 06:38:22 AEST  807 1000 1000 SIGSEGV present  /usr/bin/python3.10 3.7M

I should raise a separate bug about this because the crash is different from the one described here, but it seems meinheld is not really supported anymore. Note that I do recall seeing 2 similar core dumps on x86_64 over the last few years (of 24x7 operation), but they were extremely rare. Python 3.10 seems to expose that bug much more often. I am leaving this note for others, but I will migrate away from meinheld everywhere.