openresty / luajit2

OpenResty's Branch of LuaJIT 2
https://luajit.org/luajit.html

segment fault core dump on mips64 #103

Open honghaier250 opened 4 years ago

honghaier250 commented 4 years ago

Version: openresty-1.17.8.2

root@loongson:/# lscpu
Architecture:          mips64
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             2
NUMA node(s):          4
Model name:            Loongson-3A R3 (Loongson-3A3000) @ 1386MHz
CPU max MHz:           1385.9800
CPU min MHz:           692.9900
BogoMIPS:              2770.79
Hypervisor vendor:     vertical
Virtualization type:   full
L1d cache:             64K
L1i cache:             64K
L2 cache:              256K
L3 cache:              2048K
NUMA node0 CPU(s):     0-3
NUMA node1 CPU(s):     4-7
NUMA node2 CPU(s):
NUMA node3 CPU(s):

nginx.conf

user  root;
worker_processes  1;
error_log  logs/error.log  debug;

events {
    worker_connections  1024;
}

http {
    include       mime.types;
    default_type  application/octet-stream;

    sendfile        on;
    keepalive_timeout  65;

    server {
        listen       80;
        server_name  localhost;
        resolver 8.8.8.8;

        location / {

            content_by_lua_block {

              local http = require "resty.http"
              local httpc = http.new()
              local res, err = httpc:request_uri("http://example.com/helloworld", {
                method = "POST",
                body = "a=1&b=2",
                headers = {
                  ["Content-Type"] = "application/x-www-form-urlencoded",
                },
                keepalive_timeout = 60000,
                keepalive_pool = 10
              })

              if not res then
                ngx.say("failed to request: ", err)
                return
              end

              ngx.status = res.status
              ngx.say(res.body)
            }
        }
    }
}

Use the following command to reproduce the problem:

curl -vkl http://127.0.0.1/

honghaier250 commented 4 years ago

@lukego @leafo @ZoomQuiet @kindy @agentzh @meathill @bungle @siddhesh

agentzh commented 4 years ago

@274914765 You should at least use gdb to obtain a full backtrace (via the gdb command bt full) from your core dump file.
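
As a minimal sketch of that request, the helper below builds a non-interactive gdb invocation that loads a binary together with its core file and runs `bt full`. The function name `gdb_batch_command` and both paths are hypothetical placeholders; `--batch` and `-ex` are standard gdb options.

```python
import shlex


def gdb_batch_command(binary, core, commands=("bt full",)):
    """Build an argv list that runs the given gdb commands against a core file."""
    argv = ["gdb", "--batch"]
    for cmd in commands:
        # Each -ex option runs one gdb command after the core file is loaded.
        argv += ["-ex", cmd]
    argv += [binary, core]
    return argv


if __name__ == "__main__":
    # Placeholder paths: substitute the real nginx binary and core dump.
    print(shlex.join(gdb_batch_command("/path/to/nginx", "/path/to/core")))
```

Passing `commands=("bt full", "info reg")` would collect the register state in the same run.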

honghaier250 commented 4 years ago

@agentzh

root@localhost:/openresty-1.17.8.2/bundle/LuaJIT-2.1-20200102# gdb --args /kssl/TRP/nginx/sbin/nginx -p /kssl/TRP/nginx/ -c conf/nginx.conf -g "daemon off; master_process off;"
GNU gdb (Debian 7.12-6) 7.12.0.20161007-git
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "mips64el-linux-gnuabi64".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /kssl/TRP/nginx/sbin/nginx...done.
(gdb) r
Starting program: /kssl/TRP/nginx/sbin/nginx -p /kssl/TRP/nginx/ -c conf/nginx.conf -g daemon\ off\;\ master_process\ off\;
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/mips64el-linux-gnuabi64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
ngx_http_lua_pcre_malloc_done (old_pool=0x0) at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb) bt
#0  ngx_http_lua_pcre_malloc_done (old_pool=0x0) at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
#1  0x000000aaaac5cb04 in ngx_http_lua_run_thread (L=0xfff76d0378, r=0xaaaadf16f0, ctx=0xaaaadf23e0,
    nrets=1) at ../ngx_lua-0.10.17/src/ngx_http_lua_util.c:1071
#2  0x000000aaaac7eb44 in ngx_http_lua_socket_tcp_resume_helper (r=0xaaaadf16f0, socket_op=1)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_socket_tcp.c:5955
#3  0x000000aaaac7e874 in ngx_http_lua_socket_tcp_read_resume (r=0xaaaadf16f0)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_socket_tcp.c:5868
#4  0x000000aaaac656e8 in ngx_http_lua_content_wev_handler (r=0xaaaadf16f0)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_contentby.c:150
#5  0x000000aaaac78ad0 in ngx_http_lua_socket_handle_read_success (r=0xaaaadf16f0, u=0xfff7699170)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_socket_tcp.c:3457
#6  0x000000aaaac760a4 in ngx_http_lua_socket_tcp_read (r=0xaaaadf16f0, u=0xfff7699170)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_socket_tcp.c:2471
#7  0x000000aaaac78410 in ngx_http_lua_socket_read_handler (r=0xaaaadf16f0, u=0xfff7699170)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_socket_tcp.c:3260
#8  0x000000aaaac78248 in ngx_http_lua_socket_tcp_handler (ev=0xaaaae04a70)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_socket_tcp.c:3211
#9  0x000000aaaab3ac6c in ngx_epoll_process_events (cycle=0xaaaadce3c0, timer=60000, flags=1)
    at src/event/modules/ngx_epoll_module.c:901
#10 0x000000aaaab248cc in ngx_process_events_and_timers (cycle=0xaaaadce3c0) at src/event/ngx_event.c:257
#11 0x000000aaaab35d3c in ngx_single_process_cycle (cycle=0xaaaadce3c0)
    at src/os/unix/ngx_process_cycle.c:333
#12 0x000000aaaaade278 in main (argc=7, argv=0xffffff3b68) at src/core/nginx.c:382

agentzh commented 4 years ago

@274914765 The crash site looks weird. Will you try the gdb command disas too and provide the output here?

agentzh commented 4 years ago

Oh, and the gdb command info reg too.

Let's see what's happening on the machine instruction and register level.
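
For readers following along: on MIPS, bits 6..2 of the `cause` register hold the exception code, which says why the CPU trapped. The sketch below decodes the two `cause` values that appear in the `info reg` dumps posted in the next comment (0x10000024 at the breakpoint, 0x10000008 at the SIGSEGV). The function name and the small lookup table are illustrative, and only the two codes needed here are listed.

```python
# Only the exception codes relevant to this thread; the architecture defines more.
EXC_CODES = {
    2: "TLBL (TLB miss on load or instruction fetch)",
    9: "Bp (breakpoint)",
}


def exc_code(cause):
    """Extract the ExcCode field (bits 6..2) from a MIPS cause register value."""
    return (cause >> 2) & 0x1F


if __name__ == "__main__":
    for label, cause in (("breakpoint stop", 0x10000024), ("SIGSEGV stop", 0x10000008)):
        code = exc_code(cause)
        print(label, code, EXC_CODES.get(code, "not listed in this sketch"))
```

Read this way, the SIGSEGV stop is a failed load, and `badvaddr` in the same dump (0x130f58) is the address that load tried to read.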

honghaier250 commented 4 years ago

[root@localhost openresty-1.17.8.2]# gdb --args /kssl/TRP/bin/openresty -p /kssl/TRP/nginx/ -c conf/nginx.conf -g "daemon off; master_process off;"
GNU gdb (GDB) Fedora 7.8.1-31.fc21.loongson
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "mips64el-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /kssl/TRP/bin/openresty...done.
(gdb) b ngx_http_lua_pcre_malloc_done
Breakpoint 1 at 0x1201bff54: file ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c, line 94.
(gdb) r
Starting program: /kssl/TRP/bin/openresty -p /kssl/TRP/nginx/ -c conf/nginx.conf -g daemon\ off\;\ master_process\ off\;
Missing separate debuginfos, use: debuginfo-install glibc-2.20-15.fc21.loongson.10.mips64el
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x120372320)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
Missing separate debuginfos, use: debuginfo-install libgcc-4.9.3-11.fc21.loongson.11.mips64el nss-softokn-freebl-3.23.0-1.0.fc21.loongson.mips64el pcre-8.35-7.fc21.loongson.mips64el zlib-1.2.8-9.fc21.loongson.2.mips64el
(gdb) disas
Dump of assembler code for function ngx_http_lua_pcre_malloc_done:
   0x00000001201bff38 <+0>:     daddiu  sp,sp,-32
   0x00000001201bff3c <+4>:     sd      s8,24(sp)
   0x00000001201bff40 <+8>:     move    s8,sp
   0x00000001201bff44 <+12>:    lui     a1,0x14
   0x00000001201bff48 <+16>:    daddu   a1,a1,t9
   0x00000001201bff4c <+20>:    daddiu  a1,a1,-31016
   0x00000001201bff50 <+24>:    sd      a0,0(s8)
=> 0x00000001201bff54 <+28>:    ld      v0,-30592(a1)
   0x00000001201bff58 <+32>:    ld      v1,0(s8)
   0x00000001201bff5c <+36>:    sd      v1,28096(v0)
   0x00000001201bff60 <+40>:    ld      v0,0(s8)
   0x00000001201bff64 <+44>:    bnez    v0,0x1201bff8c <ngx_http_lua_pcre_malloc_done+84>
   0x00000001201bff68 <+48>:    nop
   0x00000001201bff6c <+52>:    ld      v0,-30592(a1)
   0x00000001201bff70 <+56>:    ld      v1,28104(v0)
   0x00000001201bff74 <+60>:    ld      v0,-21944(a1)
   0x00000001201bff78 <+64>:    sd      v1,0(v0)
   0x00000001201bff7c <+68>:    ld      v0,-30592(a1)
   0x00000001201bff80 <+72>:    ld      v1,28112(v0)
   0x00000001201bff84 <+76>:    ld      v0,-21592(a1)
   0x00000001201bff88 <+80>:    sd      v1,0(v0)
   0x00000001201bff8c <+84>:    move    sp,s8
   0x00000001201bff90 <+88>:    ld      s8,24(sp)
   0x00000001201bff94 <+92>:    daddiu  sp,sp,32
   0x00000001201bff98 <+96>:    jr      ra
   0x00000001201bff9c <+100>:   nop
End of assembler dump.
(gdb) p $a1
$1 = 4834952720
(gdb) p *($a1-30592)
$2 = 540016640
(gdb) p $gp
$3 = 4834952720
(gdb) info reg
                  zero               at               v0               v1
 R0   0000000000000000 00000000ffffffff 00000001201bff38 0000000000000076
                    a0               a1               a2               a3
 R4   0000000120372320 00000001202f8610 0000000000000004 0000000000000076
                    a4               a5               a6               a7
 R8   0000000000000000 0000000000000004 0000000000000077 0000000000000076
                    t0               t1               t2               t3
 R12  000000fff7a1c020 000000ffffff8f20 000000000000000f 000000000000003f
                    s0               s1               s2               s3
 R16  000000ffffffa6b0 000000fff75f1820 0000000000000007 ffffffffffffffff
                    s4               s5               s6               s7
 R20  000000fff763d5d0 000000fff7593af0 0000000000000000 0000000000000000
                    t8               t9               k0               k1
 R24  000000000000003f 00000001201bff38 000000fff7613e70 0000000000000000
                    gp               sp               s8               ra
 R28  00000001202f8610 000000ffffffa520 000000ffffffa520 00000001201a4264
                status               lo               hi         badvaddr
      ffffffff8400ccf3 000000000000007a 0000000000000000 000000fff79f33fb
                 cause               pc
      0000000010000024 00000001201bff54
                  fcsr              fir          restart
      0000000002800044 0000000000770501 0000000000000000
(gdb) c
Continuing.

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x120343220)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb)
Continuing.

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x120343220)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb)
Continuing.

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x120343220)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb)
Continuing.

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x0)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb)
Continuing.

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x0)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb)
Continuing.

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x1203777c0)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb)
Continuing.

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x120343220)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb)
Continuing.

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x120343220)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb)
Continuing.

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x120343220)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb)
Continuing.

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x120377be0)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb)
Continuing.

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x120343220)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb)
Continuing.

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x120343220)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb)
Continuing.

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x120343220)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb)
Continuing.

Breakpoint 1, ngx_http_lua_pcre_malloc_done (old_pool=0x0)
    at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb)
Continuing.

Program received signal SIGSEGV, Segmentation fault.
ngx_http_lua_pcre_malloc_done (old_pool=0x0) at ../ngx_lua-0.10.17/src/ngx_http_lua_pcrefix.c:94
94          ngx_http_lua_pcre_pool = old_pool;
(gdb) disas
Dump of assembler code for function ngx_http_lua_pcre_malloc_done:
   0x00000001201bff38 <+0>:     daddiu  sp,sp,-32
   0x00000001201bff3c <+4>:     sd      s8,24(sp)
   0x00000001201bff40 <+8>:     move    s8,sp
   0x00000001201bff44 <+12>:    lui     a1,0x14
   0x00000001201bff48 <+16>:    daddu   a1,a1,t9
   0x00000001201bff4c <+20>:    daddiu  a1,a1,-31016
   0x00000001201bff50 <+24>:    sd      a0,0(s8)
=> 0x00000001201bff54 <+28>:    ld      v0,-30592(a1)
   0x00000001201bff58 <+32>:    ld      v1,0(s8)
   0x00000001201bff5c <+36>:    sd      v1,28096(v0)
   0x00000001201bff60 <+40>:    ld      v0,0(s8)
   0x00000001201bff64 <+44>:    bnez    v0,0x1201bff8c <ngx_http_lua_pcre_malloc_done+84>
   0x00000001201bff68 <+48>:    nop
   0x00000001201bff6c <+52>:    ld      v0,-30592(a1)
   0x00000001201bff70 <+56>:    ld      v1,28104(v0)
   0x00000001201bff74 <+60>:    ld      v0,-21944(a1)
   0x00000001201bff78 <+64>:    sd      v1,0(v0)
   0x00000001201bff7c <+68>:    ld      v0,-30592(a1)
   0x00000001201bff80 <+72>:    ld      v1,28112(v0)
   0x00000001201bff84 <+76>:    ld      v0,-21592(a1)
   0x00000001201bff88 <+80>:    sd      v1,0(v0)
   0x00000001201bff8c <+84>:    move    sp,s8
   0x00000001201bff90 <+88>:    ld      s8,24(sp)
   0x00000001201bff94 <+92>:    daddiu  sp,sp,32
   0x00000001201bff98 <+96>:    jr      ra
   0x00000001201bff9c <+100>:   nop
End of assembler dump.
(gdb) p $a1
$4 = 1279704
(gdb) p *($a1-30592)
Cannot access memory at address 0x130f58
(gdb) p $gp
$5 = 1099373047808
(gdb) info reg
                  zero               at               v0               v1
 R0   0000000000000000 0000000000000000 0000000000000000 0000000000000000
                    a0               a1               a2               a3
 R4   0000000000000000 00000000001386d8 000000fff762c3e0 000000fff75f4988
                    a4               a5               a6               a7
 R8   000000fff75f4978 000000fff75fee40 0000000000000000 000000000000145f
                    t0               t1               t2               t3
 R12  fffffffffffffffe 000000fff762c3e0 000000fff75f4f48 0000000000000000
                    s0               s1               s2               s3
 R16  0000000000000001 00000001202849d0 0000000000000000 0000000000000000
                    s4               s5               s6               s7
 R20  00000001201673e0 0000000120157f60 0000000000000000 0000000000000000
                    t8               t9               k0               k1
 R24  0000000000000000 0000000000000000 ffffffffffff0000 0000000000000000
                    gp               sp               s8               ra
 R28  000000fff7bd7000 000000ffffffab00 000000ffffffab00 00000001201b2170
                status               lo               hi         badvaddr
      ffffffff8400ccf3 000000000000000b 0000000002d929e9 0000000000130f58
                 cause               pc
      0000000010000008 00000001201bff54
                  fcsr              fir          restart
      0000000000800044 0000000000770501 0000000000000000

agentzh commented 4 years ago

@siddhesh Are you familiar with mips64? Will you please shed some light on this? Many thanks!
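
One thing the two register dumps above do allow is a quick arithmetic check. The prologue computes its global-pointer base in `a1` from `t9` (`lui a1,0x14; daddu a1,a1,t9; daddiu a1,a1,-31016`), and under the MIPS PIC calling convention the caller is expected to put the callee's address in `t9`. The sketch below replays those three instructions for the `t9` value seen at the first breakpoint (0x1201bff38) and at the crash (0). It is an observation drawn from the posted dumps, not a diagnosis confirmed anywhere in this thread, and the helper names are made up for illustration.

```python
MASK64 = (1 << 64) - 1


def prologue_a1(t9):
    """Replay `lui a1,0x14; daddu a1,a1,t9; daddiu a1,a1,-31016` from the disassembly."""
    a1 = 0x14 << 16              # lui: load the immediate into the upper half
    a1 = (a1 + t9) & MASK64      # daddu a1,a1,t9
    a1 = (a1 - 31016) & MASK64   # daddiu a1,a1,-31016
    return a1


def first_load_address(a1):
    """Address read by the faulting `ld v0,-30592(a1)`."""
    return (a1 - 30592) & MASK64


if __name__ == "__main__":
    good = prologue_a1(0x1201BFF38)  # t9 at the first breakpoint hit
    bad = prologue_a1(0)             # t9 at the crash
    print(hex(good))                     # 0x1202f8610, equal to a1 and gp in the first dump
    print(hex(bad))                      # 0x1386d8, equal to a1 in the crash dump
    print(hex(first_load_address(bad)))  # 0x130f58, equal to badvaddr in the crash dump
```

Both dumps are reproduced exactly by this arithmetic, so the bad `a1` in the crashing call follows directly from `t9` being zero on entry. Why the caller left `t9` at zero is not established by the data posted here.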

siddhesh commented 4 years ago

I'm afraid I've never looked at MIPS. The backtrace doesn't look like it has anything to do with luajit unless there is code in there that affects the ngx_http_lua_pcre_malloc_done call.

In any case, if there is reason to believe that luajit2 is at fault, it may be worthwhile to test against the latest merged code and see whether the problem still reproduces.

Siddhesh

agentzh commented 4 years ago

@siddhesh Okay, thanks for your reply!

@274914765 Will you try the latest v2.1-agentzh branch of this luajit2 repo on your side?

honghaier250 commented 4 years ago

@agentzh I have already tried the latest v2.1-agentzh branch of this luajit2 repo; it also segfaults.

honghaier250 commented 4 years ago

Is there any way to solve this problem? @agentzh