rumpkernel / rumprun-packages

Ready-made packages of software for running on the Rumprun unikernel

Kernel panic caused by diagnostic assertion in PHP #90

Open JelteF opened 8 years ago

JelteF commented 8 years ago

I'm getting a kernel panic after putting some load on my PHP kernel for a while. Can you give me some help in debugging this issue?

=== calling "../bin/php-cgi.bin" main() ===

rumprun: call to ``_sys___sigprocmask14'' ignored
rumprun: call to ``sigaction'' ignored
panic: kernel diagnostic assertion "ph == NULL || ((pp->pr_roflags & PR_PHINPAGE) != 0) || ((char *)ph->ph_page <= (char *)v && (char *)v < (char *)ph->ph_page + pp->pr_alloc->pa_pagesz)" failed: file "/root/rumprun/src-netbsd/sys/rump/librump/rumpkern/../../../
rump kernel halting...
PANIC: rumpuser panic
port 4 still bound!
port 5 still bound!
minios: halting, reason=0

anttikantee commented 8 years ago

Smells like memory corruption.

But for starters, can you post the backtrace? Instructions are here: http://wiki.rumpkernel.org/Howto%3A-Debugging-Rumprun-with-gdb

Incognito commented 8 years ago

Are you running PHP as a standalone server? I'm curious if this issue is in PHP.

JelteF commented 8 years ago

I'm running PHP behind Nginx, and it connects to MySQL using mysql_pconnect.

JelteF commented 8 years ago

@anttikantee When I try the instructions there I get:

+ rumprun xen -D 2222 -di -M 128 -I net1,xenif -W net1,inet,static,145.100.105.229/24 -b ../images/data.iso,/data -e PHP_FCGI_MAX_REQUESTS=0 -- ../bin/php-cgi.bin -b 8000

!!!
!!! NOTE: rumprun is experimental. syntax may change in the future
!!!

/root/rumprun/rumprun/bin/rumprun: 508: /root/rumprun/rumprun/bin/rumprun: gdbsx: parameter not set

Any clue on how to fix that?

anttikantee commented 8 years ago

@JelteF yes, see https://github.com/rumpkernel/rumprun/blob/master/app-tools/rumprun#L435-L442

So essentially you must teach the script where to find gdbsx. If you want to submit a patch to fix that, maybe it should also be possible for the user to override the search by setting $GDBSX.
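
A minimal sketch of what such a $GDBSX override could look like in POSIX sh. This is not the code in the rumprun script: the function name `find_gdbsx` and the fallback directories are assumptions for illustration only, and the real search list is the one at the link above. Reading the variable as `${GDBSX:-}` also sidesteps the "parameter not set" failure, which is the kind of error a shell running with `set -u` reports for a variable that was never assigned.

```sh
#!/bin/sh
# Hypothetical helper (not the actual rumprun code): print the path to
# gdbsx, letting the user override the search by setting $GDBSX.
find_gdbsx() {
    # 1. An explicit override wins, provided it points at an executable.
    if [ -n "${GDBSX:-}" ]; then
        if [ -x "${GDBSX}" ]; then
            printf '%s\n' "${GDBSX}"
            return 0
        fi
        echo "GDBSX is set to '${GDBSX}', which is not executable" >&2
        return 1
    fi

    # 2. Otherwise look in $PATH.
    if command -v gdbsx >/dev/null 2>&1; then
        command -v gdbsx
        return 0
    fi

    # 3. Finally probe a few directories. These locations are assumptions;
    #    adjust them for your distribution.
    for dir in /usr/lib/xen/bin /usr/lib/xen-*/bin /usr/local/lib/xen/bin; do
        if [ -x "${dir}/gdbsx" ]; then
            printf '%s\n' "${dir}/gdbsx"
            return 0
        fi
    done

    echo "gdbsx not found; set GDBSX=/path/to/gdbsx" >&2
    return 1
}
```

With something like this in place, `GDBSX=/path/to/gdbsx rumprun xen -D 2222 ...` would skip the search entirely.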

btw, in my experience the Xen debugger backend (gdbsx) does not work very well. I'd rather suggest trying to debug the problem under qemu/kvm, if possible -- you may get a better backtrace that way.

JelteF commented 8 years ago

I finally got a backtrace for this:

GNU gdb (Ubuntu 7.10-1ubuntu2) 7.10
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ../bin/php-cgi.bin...done.
Remote debugging using :2012
0x0000000000102669 in hlt () at include/arch/x86/inline.h:59
59              __asm__ __volatile__("hlt");
(gdb) bt
#0  0x0000000000102669 in hlt () at include/arch/x86/inline.h:59
#1  bmk_platform_halt (panicstring=panicstring@entry=0xb02030 "rumpuser panic") at kernel.c:59
#2  0x000000000010cfba in rumpuser_exit (value=value@entry=-1) at /root/rumprun/lib/libbmk_rumpuser/rumpuser_base.c:137
#3  0x0000000000861b27 in cpu_reboot (howto=<optimized out>, bootstr=<optimized out>) at /root/rumprun/src-netbsd/sys/rump/librump/rumpkern/emul.c:393
#4  0x0000000000829118 in vpanic (fmt=0xb7db08 "kernel %sassertion \"%s\" failed: file \"%s\", line %d ", ap=0x7f17dc8) at /root/rumprun/src-netbsd/sys/rump/librump/rumpkern/../../../kern/subr_prf.c:342
#5  0x0000000000812f82 in kern_assert (fmt=<optimized out>) at /root/rumprun/src-netbsd/sys/rump/librump/rumpkern/../../../lib/libkern/kern_assert.c:51
#6  0x000000000082c294 in pr_find_pagehead (v=0x7520000, pp=0xe72240) at /root/rumprun/src-netbsd/sys/rump/librump/rumpkern/../../../kern/subr_pool.c:352
#7  pool_do_put (pq=0x7f17e28, v=0x7520000, pp=0xe72240) at /root/rumprun/src-netbsd/sys/rump/librump/rumpkern/../../../kern/subr_pool.c:980
#8  pool_put (pp=pp@entry=0xe72240, v=v@entry=0x7520000) at /root/rumprun/src-netbsd/sys/rump/librump/rumpkern/../../../kern/subr_pool.c:1077
#9  0x000000000082c95d in pool_cache_destruct_object1 (object=0x7520000, pc=0xe72240) at /root/rumprun/src-netbsd/sys/rump/librump/rumpkern/../../../kern/subr_pool.c:1954
#10 pool_cache_invalidate_groups (pc=pc@entry=0xe72240, pcg=pcg@entry=0xe7eb00) at /root/rumprun/src-netbsd/sys/rump/librump/rumpkern/../../../kern/subr_pool.c:1989
#11 0x000000000082d76d in pool_cache_invalidate (pc=0xe72240) at /root/rumprun/src-netbsd/sys/rump/librump/rumpkern/../../../kern/subr_pool.c:2055
#12 0x000000000082d857 in pool_reclaim (pp=pp@entry=0xe72240) at /root/rumprun/src-netbsd/sys/rump/librump/rumpkern/../../../kern/subr_pool.c:1360
#13 0x000000000082db1d in pool_drain (ppp=0x7f17f88) at /root/rumprun/src-netbsd/sys/rump/librump/rumpkern/../../../kern/subr_pool.c:1445
#14 0x000000000085d614 in uvm_pageout (arg=<optimized out>) at /root/rumprun/src-netbsd/sys/rump/librump/rumpkern/vm.c:1161
#15 0x000000000085e23e in threadbouncer (arg=0xe7a2d0) at /root/rumprun/src-netbsd/sys/rump/librump/rumpkern/threads.c:90
#16 0x000000000010d7c8 in bmk_cpu_sched_bouncer ()
#17 0x0000000000000000 in ?? ()
(gdb)

anttikantee commented 8 years ago

Ok, so with kvm (or qemu). In case it's easy to repeat, do you always see uvm_pageout() in the stack trace?

JelteF commented 8 years ago

Yes, with kvm. I will try to reproduce it (since I'm also trying to get mysql to crash with kvm). I'm guessing the problem is the same though, as it causes the same output as the previous crash.

JelteF commented 8 years ago

I got the crash once more and the backtrace was exactly the same. One interesting thing is that we load balance our tests (using an nginx unikernel) to 5 php unikernels and they all crash at the same moment.

Also we found out that the default memory limit of php is 128M (using phpinfo.php) and the unikernel is assigned 128M in the default script (which we use). So that might be the issue here :).

JelteF commented 8 years ago

Setting the memory -M option to 160 did not fix the crash. We will try tomorrow with 256.

Incognito commented 8 years ago

For the record, in PHP (when running on Linux or something) I've seen "PHP Fatal error: Out of memory" when physical memory is exhausted, and a different error when it's a config thing ("Fatal error: Allowed memory size of 12582912 bytes exhausted (tried to allocate 23456789 bytes) ...").
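
To tell the two cases apart in a log, one could match on the message text. A rough sketch in sh: the function name `classify_php_memory_error` is made up here, and the patterns are just the two messages quoted above, so other PHP versions may word them differently.

```sh
#!/bin/sh
# Hypothetical helper: classify one PHP log line by which kind of memory
# error it reports. Patterns are taken from the messages quoted above.
classify_php_memory_error() {
    case "$1" in
        # The configured memory_limit was hit (a config issue).
        *"Allowed memory size of "*" bytes exhausted"*) echo "memory_limit" ;;
        # The allocator itself failed (physical memory exhausted).
        *"Out of memory"*) echo "physical" ;;
        *) echo "other" ;;
    esac
}
```

For example, `classify_php_memory_error "PHP Fatal error: Out of memory"` prints `physical`.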

anttikantee commented 8 years ago

I'd try with something like -M 512. There are various degrees of overhead, so having 256MB of memory does not provide php with 128MB. (Some of that overhead could be optimized away, but that requires finishing the work which allows memory to be shared between the application and the system.)
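
For reference, a sketch of the earlier invocation with the memory size made a parameter. The flags are copied from the command posted earlier in the thread, minus `-D 2222` (the gdb port). The wrapper function `run_php_unikernel` and the `RUMPRUN` variable are assumptions added here so the memory value is easy to vary; they are not part of rumprun.

```sh
#!/bin/sh
# Hypothetical wrapper: launch the PHP unikernel with a given amount of
# memory in MB (default 512, as suggested above).
# RUMPRUN may point at a specific rumprun binary; it defaults to "rumprun".
run_php_unikernel() {
    mem_mb=${1:-512}
    "${RUMPRUN:-rumprun}" xen -di -M "${mem_mb}" \
        -I net1,xenif -W net1,inet,static,145.100.105.229/24 \
        -b ../images/data.iso,/data \
        -e PHP_FCGI_MAX_REQUESTS=0 \
        -- ../bin/php-cgi.bin -b 8000
}
```

Calling `run_php_unikernel 256` and then `run_php_unikernel 512` makes it straightforward to check whether the panic goes away as memory grows.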