arut / nginx-rtmp-module

NGINX-based Media Streaming Server
http://nginx-rtmp.blogspot.com
BSD 2-Clause "Simplified" License

nginx 1.7.7 - nginx-rtmp-module crashes. #519

Open DrFlash99 opened 9 years ago

DrFlash99 commented 9 years ago

nginx-1.7.7

nginx-rtmp-module-git (latest as of Nov 17)

nginx-patches-git (latest as of Nov 17)

CentOS 6.5 (latest patches) and CentOS 7.0 (latest patches)

My setup:

4 client servers, 16 workers, 1024 connections per worker.

I am testing a large nginx-rtmp deployment.

I patched nginx with the per-worker-listener patch. Within a few minutes of startup, different workers crash: they get stuck at 100% CPU and won't stop with a normal kill. Without the per-worker-listener we lose the ability to control or stat the servers.

The following message is shown in the message log at this point:

Nov 17 09:59:00 vidclient4 kernel: nginx[11299]: segfault at d1 ip 0000000000473827 sp 00007fff6a68cad8 error 4 in nginx[400000+bc000]

DrFlash99 commented 9 years ago

Update: this is not related to the per-worker-listener module.

A build without the module still segfaults.

Nov 17 11:05:12 vidclient4 kernel: nginx[14874] general protection ip:473690 sp:7ffffad7e788 error:0 in nginx[400000+bc000]
Nov 17 11:07:38 vidclient4 kernel: nginx[14861]: segfault at 208 ip 0000000000473690 sp 00007ffffad7e788 error 4 in nginx[400000+bc000]
Nov 17 11:09:26 vidclient4 kernel: nginx[14877] general protection ip:473697 sp:7ffffad7e788 error:0 in nginx[400000+bc000]
Nov 17 11:09:53 vidclient4 kernel: nginx[14682]: segfault at d1 ip 0000000000473697 sp 00007ffffad7e728 error 4 in nginx[400000+bc000]
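For anyone reading the kernel lines above, here is a minimal sketch of how the fields can be decoded. It assumes the standard x86 page-fault error-code bits (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode). The helper names `describe_fault` and `mapping_offset` are hypothetical and exist only for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Standard x86 page-fault error-code bits, as printed in "error N". */
#define PF_PROT  0x1u  /* 0: page not present, 1: protection violation */
#define PF_WRITE 0x2u  /* 0: read access,      1: write access */
#define PF_USER  0x4u  /* 0: kernel mode,      1: user mode */

/* Hypothetical helper: describe an error code in words. */
static void describe_fault(unsigned err, char *buf, size_t len)
{
    snprintf(buf, len, "%s-mode %s of a %s page",
             (err & PF_USER)  ? "user"  : "kernel",
             (err & PF_WRITE) ? "write" : "read",
             (err & PF_PROT)  ? "protected" : "not-present");
}

/* Hypothetical helper: offset of the faulting ip inside the mapping
 * reported as nginx[base+size]; returns -1 if ip lies outside it. */
static int64_t mapping_offset(uint64_t ip, uint64_t base, uint64_t size)
{
    if (ip < base || ip - base >= size) {
        return -1;
    }
    return (int64_t) (ip - base);
}
```

Applied to the line "segfault at 208 ip 0000000000473690 ... error 4 in nginx[400000+bc000]", this reads as a user-mode read of a not-present page, at offset 0x73690 into the nginx binary. Fault addresses as small as 0x208 or 0xd1 are consistent with reading a struct member through a NULL or near-NULL pointer, although the log lines alone do not prove that.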

arut commented 9 years ago

Could you post gdb backtrace here?

DrFlash99 commented 9 years ago

and 1 more for good measure :-)

Core was generated by `nginx: worker process '.

Program terminated with signal 11, Segmentation fault.

0 0x0000000000496bc9 in ngx_rtmp_relay_close (s=0x1a54288) at ../nginx-rtmp-module/ngx_rtmp_relay_module.c:1367

1367 for (cctx = &ctx->publish->play; *cctx; cctx = &(*cctx)->next, ++n);

Missing separate debuginfos, use: debuginfo-install nginx-rtmp-1.2.8-4.el6.x86_64

(gdb) backtrace full

0 0x0000000000496bc9 in ngx_rtmp_relay_close (s=0x1a54288) at ../nginx-rtmp-module/ngx_rtmp_relay_module.c:1367

    n = 2
    racf = 0x11273f0
    ctx = 0x1a55700
    cctx = 0xaf266bfdfa411e67
    hash = <value optimized out>

1 0x0000000000496e42 in ngx_rtmp_relay_close_stream (s=0x1a54288, v=0x6ed420) at ../nginx-rtmp-module/ngx_rtmp_relay_module.c:1421

    racf = <value optimized out>

2 0x0000000000498fac in ngx_rtmp_exec_close_stream (s=0x1a54288, v=0x6ed420) at ../nginx-rtmp-module/ngx_rtmp_exec_module.c:1301

    n = <value optimized out>
    e = <value optimized out>
    ctx = <value optimized out>
    pctx = <value optimized out>
    ppctx = <value optimized out>
    eacf = <value optimized out>

3 0x000000000049c8dc in ngx_rtmp_notify_close_stream (s=0x1a54288, v=0x6ed420) at ../nginx-rtmp-module/ngx_rtmp_notify_module.c:1480

    ctx = <value optimized out>
    nacf = <value optimized out>

4 0x00000000004a00fc in ngx_rtmp_hls_close_stream (s=0x1a54288, v=0x6ed420) at ../nginx-rtmp-module/hls/ngx_rtmp_hls_module.c:1497

    hacf = <value optimized out>
    ctx = <value optimized out>

5 0x00000000004a40d1 in ngx_rtmp_dash_close_stream (s=0x1a54288, v=0x6ed420) at ../nginx-rtmp-module/dash/ngx_rtmp_dash_module.c:986

    ctx = <value optimized out>
    dacf = <value optimized out>

6 0x000000000048489f in ngx_rtmp_cmd_close_stream_init (s=0x1a54288, h=, in=) at ../nginx-rtmp-module/ngx_rtmp_cmd_module.c:414

    v = {stream = 0}
    in_elts = {{type = 0, name = {len = 0, data = 0x0}, data = 0x6ed420, len = 0}}

7 0x000000000048304b in ngx_rtmp_amf_message_handler (s=0x1a54288, h=0x1a55340, in=0x1a56240) at ../nginx-rtmp-module/ngx_rtmp_receive.c:437

    act = {link = 0x1a56240, first = 0x0, offset = 14, alloc = 0, arg = 0x0, log = 0x1a54230}
    cmcf = <value optimized out>
    ch = 0x1128678
    ph = <value optimized out>
    len = <value optimized out>
    n = <value optimized out>
    func = "closestream", '\000' <repeats 116 times>
    elts = {{type = 2, name = {len = 0, data = 0x0}, data = 0x6ecd20, len = 128}}

8 0x000000000047f0d8 in ngx_rtmp_receive_message (s=0x1a54288, h=0x1a55340, in=0x1a56240) at ../nginx-rtmp-module/ngx_rtmp_handler.c:799

    cmcf = 0x1124328
    evhs = <value optimized out>
    n = <value optimized out>
    evh = <value optimized out>

9 0x000000000047fe39 in ngx_rtmp_recv (rev=) at ../nginx-rtmp-module/ngx_rtmp_handler.c:464

    n = <value optimized out>
    c = 0x7f0046650100
    s = 0x1a54288
    cscf = 0x1126a20
    h = 0x1a55340
    st = 0x1a55340
    st0 = <value optimized out>
    in = 0x19
    head = 0x1a56240
    b = 0x1a56250
    p = <value optimized out>
    pp = <value optimized out>
    old_pos = 0x1a562c1 "382779:79dac8185ce61813c28bcfba9593e809H"
    size = 25
    fsize = 25
    old_size = 0
    fmt = <value optimized out>
    ext = <value optimized out>
    csid = <value optimized out>
    timestamp = 9891

10 0x0000000000427747 in ngx_epoll_process_events (cycle=0x10fd6d0, timer=, flags=) at src/event/modules/ngx_epoll_module.c:685

    events = 2
    revents = 1
    instance = <value optimized out>
    i = <value optimized out>
    level = <value optimized out>
    err = <value optimized out>
    rev = 0x7f00466197a0
    wev =
    queue =
    c = 0x7f0046650100

11 0x000000000041ef13 in ngx_process_events_and_timers (cycle=0x10fd6d0) at src/event/ngx_event.c:248

    flags = 1
    timer = 990
    delta = 1416222340316

12 0x0000000000425fa0 in ngx_worker_process_cycle (cycle=0x10fd6d0, data=) at src/os/unix/ngx_process_cycle.c:822

    worker = <value optimized out>
    i = <value optimized out>
    c = <value optimized out>

13 0x000000000042465c in ngx_spawn_process (cycle=0x10fd6d0, proc=0x425eaa , data=0x6, name=0x4ab00b "worker process", respawn=6) at src/os/unix/ngx_process.c:198

    on = 1
    pid = 0
    s = 6

14 0x0000000000426a31 in ngx_reap_children (cycle=0x10fd6d0) at src/os/unix/ngx_process_cycle.c:631

    i = <value optimized out>
    live = <value optimized out>
    n = <value optimized out>
    ch = {command = 2, pid = 18510, slot = 6, fd = -1}
    ccf = <value optimized out>

15 ngx_master_process_cycle (cycle=0x10fd6d0) at src/os/unix/ngx_process_cycle.c:184

    title = <value optimized out>
    p = <value optimized out>
    size = <value optimized out>
    i = <value optimized out>
    n = <value optimized out>
    sigio = 0
    set = {__val = {0 <repeats 16 times>}}
    itv = {it_interval = {tv_sec = 17820345, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}}
    live = <value optimized out>
    delay = 0
    ls = <value optimized out>
    ccf = 0x10fe7a0

16 0x000000000040822b in main (argc=, argv=) at src/core/nginx.c:407

    i =
    log = 0x6e9fe0
    cycle = 0x10fd6d0
    init_cycle = {conf_ctx = 0x0, pool = 0x10fd110, log = 0x6e9fe0, new_log = {log_level = 0, file = 0x0, connection = 0, handler = 0, data = 0x0, writer = 0, wdata = 0x0, action = 0x0, next = 0x0}, log_use_stderr = 0, files = 0x0, free_connections = 0x0, free_connection_n = 0, reusable_connections_queue = {prev = 0x0, next = 0x0}, listening = {elts = 0x0, nelts = 0, size = 0, nalloc = 0, pool = 0x0}, paths = {elts = 0x0, nelts = 0, size = 0, nalloc = 0, pool = 0x0}, open_files = {last = 0x0, part = {elts = 0x0, nelts = 0, next = 0x0}, size = 0, nalloc = 0, pool = 0x0}, shared_memory = {last = 0x0, part = {elts = 0x0, nelts = 0, next = 0x0}, size = 0, nalloc = 0, pool = 0x0}, connection_n = 0, files_n = 0, connections = 0x0, read_events = 0x0, write_events = 0x0, old_cycle = 0x0, conf_file = {len = 21, data = 0x7ffff101ff73 "ss"}, conf_param = {len = 0, data = 0x0}, conf_prefix = {len = 11, data = 0x7ffff101ff73 "ss"}, prefix = {len = 5, data = 0x4a69d9 "/usr/"}, lock_file = {len = 0, data = 0x0}, hostname = {len = 0, data = 0x0}}
    ccf =
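In frame 0 of the backtrace above, the loop at ngx_rtmp_relay_module.c:1367 walks the publisher's player list through a pointer-to-pointer, and `cctx = 0xaf266bfdfa411e67` is not a canonical x86-64 address. That suggests the walk passed through memory that had already been freed or overwritten, which would also fit the "general protection" kernel messages. The following is a minimal sketch of that list pattern and of the invariant it relies on. It is not the module's code and not a fix; `relay_ctx_t`, `relay_unlink_player` and `relay_detach_all` are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Hypothetical stand-in for a relay context: each publisher keeps a
 * singly linked list of players in `play`, chained through `next`. */
typedef struct relay_ctx relay_ctx_t;
struct relay_ctx {
    relay_ctx_t *publish;  /* owning publisher (itself for a publisher) */
    relay_ctx_t *play;     /* head of the player list (publisher only) */
    relay_ctx_t *next;     /* next player attached to the same publisher */
};

/* Unlink a player from its publisher's list using a pointer-to-pointer
 * walk. Returns 1 if the player was found and removed, 0 otherwise. */
static int relay_unlink_player(relay_ctx_t *ctx)
{
    relay_ctx_t **cctx;

    if (ctx == NULL || ctx->publish == NULL || ctx->publish == ctx) {
        return 0;
    }

    for (cctx = &ctx->publish->play; *cctx; cctx = &(*cctx)->next) {
        if (*cctx == ctx) {
            *cctx = ctx->next;    /* splice the node out */
            ctx->next = NULL;
            ctx->publish = NULL;  /* drop the back-reference as well */
            return 1;
        }
    }
    return 0;
}

/* When a publisher goes away, detach every player so that none of them
 * keeps a `publish` pointer into memory that is about to be released.
 * Returns the number of players detached. */
static size_t relay_detach_all(relay_ctx_t *pub)
{
    size_t n = 0;
    relay_ctx_t *p, *next;

    for (p = pub->play; p; p = next) {
        next = p->next;
        p->publish = NULL;
        p->next = NULL;
        n++;
    }
    pub->play = NULL;
    return n;
}
```

The walk is only safe while every `publish` and `next` pointer refers to live memory. If a publisher's context is released while a player still points at it, a later unlink dereferences stale data, which is one plausible reading of the trace.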

developer222 commented 9 years ago

Hi,

I have the same problem; the gdb backtrace is in the last message here:

https://github.com/arut/nginx-rtmp-module/issues/470

nginx 1.7.5

arut commented 9 years ago

Do you guys have the latest code?

developer222 commented 9 years ago

Yes, latest from master, also tried compiling with nginx 1.7.7.

DrFlash99 commented 9 years ago

Yes same for me, latest git code + nginx 1.7.7

wdjwxh commented 9 years ago

same for me ...

mknwebsolutions commented 9 years ago

This issue has also occurred on my master server (1 master, 3 relays).

wdjwxh commented 9 years ago

I found it only occurred when rtmp_auto_push and pull_static were used at the same time...

mknwebsolutions commented 9 years ago

@wdjwxh I have neither of those settings, and mine was crashing until today, when I disabled idle_streams. So far, no crashes.

DrFlash99 commented 9 years ago

@wdjwxh @mknwebsolutions I found that this problem occurs with any use of the PULL mechanism, static or dynamic: at some point during use it will crash when a pulling stream disappears.

jiakai1000 commented 9 years ago

Do you know how to reproduce the crash quickly? Please share the detailed steps (config file, operations, etc.).

wdjwxh commented 9 years ago

As above, I found it only crashes when rtmp_auto_push and pull_static are used at the same time, so I removed rtmp_auto_push and it no longer crashes. However, the bandwidth used increased by 100% or even more...

SharkyRawr commented 8 years ago

I believe I also encountered this bug:

Program received signal SIGSEGV, Segmentation fault.
0x00007feabe2bff28 in ngx_rtmp_auto_push_publish (s=0x7feabf74eca0, v=0x7feabe545700 <v>)
    at ../nginx-rtmp-module/ngx_rtmp_auto_push_module.c:477
477         ctx->push_evt.log = s->connection->log;
(gdb) bt
#0  0x00007feabe2bff28 in ngx_rtmp_auto_push_publish (s=0x7feabf74eca0, v=0x7feabe545700 <v>)
    at ../nginx-rtmp-module/ngx_rtmp_auto_push_module.c:477
#1  0x00007feabe2aad66 in ngx_rtmp_amf_message_handler (s=0x7feabf74eca0, h=0x7feabf812830, in=0x7feabfad59d8)
    at ../nginx-rtmp-module/ngx_rtmp_receive.c:437
#2  0x00007feabe2a6631 in ngx_rtmp_receive_message (s=s@entry=0x7feabf74eca0, h=h@entry=0x7feabf812830,
    in=in@entry=0x7feabf8147e0) at ../nginx-rtmp-module/ngx_rtmp_handler.c:799
#3  0x00007feabe2a6bc1 in ngx_rtmp_recv (rev=<optimized out>) at ../nginx-rtmp-module/ngx_rtmp_handler.c:464
#4  0x00007feabe223daf in ngx_epoll_process_events (cycle=0x7feabf73fb60, timer=0, flags=1)
    at src/event/modules/ngx_epoll_module.c:822
#5  0x00007feabe21942a in ngx_process_events_and_timers (cycle=cycle@entry=0x7feabf73fb60) at src/event/ngx_event.c:242
#6  0x00007feabe221475 in ngx_worker_process_cycle (cycle=cycle@entry=0x7feabf73fb60, data=data@entry=0x0)
    at src/os/unix/ngx_process_cycle.c:753
#7  0x00007feabe21fdca in ngx_spawn_process (cycle=0x7feabf73fb60, proc=0x7feabe221420 <ngx_worker_process_cycle>, data=0x0,
    name=0x7feabe2d4ed2 "worker process", respawn=0) at src/os/unix/ngx_process.c:198
#8  0x00007feabe222b92 in ngx_reap_children (cycle=<optimized out>) at src/os/unix/ngx_process_cycle.c:621
#9  ngx_master_process_cycle (cycle=0x7feabf73fb60) at src/os/unix/ngx_process_cycle.c:174
#10 0x00007feabe1f9c36 in main (argc=<optimized out>, argv=<optimized out>) at src/core/nginx.c:367
(gdb) print ctx
$1 = (ngx_rtmp_auto_push_ctx_t *) 0x7feabf74eca0
(gdb) print ctx->push_evt
$2 = {data = 0x7feabf74eca0, write = 0, accept = 0, instance = 0, active = 0, disabled = 0, ready = 0, oneshot = 0,
  complete = 0, eof = 0, error = 0, timedout = 0, timer_set = 0, delayed = 0, deferred_accept = 0, pending_eof = 0,
  posted = 0, closed = 0, channel = 0, resolver = 0, cancelable = 0, available = 0, handler = 0x0, index = 0, log = 0x0,
  timer = {key = 0, left = 0x0, right = 0x0, parent = 0x0, color = 0 '\000', data = 0 '\000'}, queue = {prev = 0x0,
    next = 0x0}}
(gdb) print ctx->push_evt.log
$3 = (ngx_log_t *) 0x0
(gdb) print s
$4 = (ngx_rtmp_session_t *) 0x7feabf74eca0
(gdb) print s->connection
$5 = (ngx_connection_t *) 0x0 <- !!

s->connection appears to be NULL; trying to access its member log causes the segfault.

Running on Debian/jessie:

nginx version: nginx/1.9.15
built with OpenSSL 1.0.1k 8 Jan 2015

nginx-rtmp-module# git branch -v
* master e089592 support for dynamic build

Crash does NOT happen if rtmp_auto_push is set to off.
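The NULL dereference described above can be illustrated with a minimal sketch. The types below are hypothetical, heavily simplified stand-ins for the nginx structures seen in the gdb session, and `set_push_event_log` is an invented helper. This shows only the defensive pattern, not a patch for ngx_rtmp_auto_push_module.c, and it does not explain why the session had no connection in the first place.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the nginx types seen in the
 * gdb session; the real structures have many more fields. */
typedef struct { int level; } log_t;
typedef struct { log_t *log; } connection_t;
typedef struct { connection_t *connection; } session_t;
typedef struct { log_t *log; } event_t;
typedef struct { event_t push_evt; } auto_push_ctx_t;

/* Copy the connection log into the push event only when the session
 * still has a connection. Returns 0 on success, or -1 when there is
 * nothing to copy (the situation that crashed in the trace). */
static int set_push_event_log(auto_push_ctx_t *ctx, const session_t *s)
{
    if (ctx == NULL || s == NULL || s->connection == NULL) {
        return -1;
    }
    ctx->push_evt.log = s->connection->log;
    return 0;
}
```

A caller would treat the -1 result as "skip the auto-push setup for this session" instead of continuing with an uninitialized event.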

misiek08 commented 8 years ago

The worker count needs to be 1, and auto_push needs to be disabled. Both are experimental features and, as you can see, they don't work right now. If you really need that many connected clients (I doubt it), you can set up multiple instances and manually configure push directives to distribute streams across them.

DrFlash99 commented 7 years ago

It's been a while since I was here, but yes, I do need that many clients :)

I have hit over 20k on my live environment (currently Wowza).

misiek08 commented 7 years ago

@DrFlash99 Then you need to build the cluster yourself via pull or push (which one is up to you; it depends on your use case). I scaled a single server to 8 Gbps, around 15k clients, without any problems. I used many separate nginx instances, a load balancer, and pulls.

mix1music commented 3 years ago

> @DrFlash99 Then you need to build the cluster yourself via pull or push (which one is up to you; it depends on your use case). I scaled a single server to 8 Gbps, around 15k clients, without any problems. I used many separate nginx instances, a load balancer, and pulls.

Can you explain in simple terms how the load balancing works and what is installed on the back-end servers?