Open gmarcz opened 10 years ago
I made some progress with gdb:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000484aab in ngx_rtmp_relay_create (s=s@entry=0x18bacb8, name=name@entry=0x7fff044956a0, target=target@entry=0x7fff04495870,
create_publish_ctx=create_publish_ctx@entry=0x484473
create_publish_ctx=create_publish_ctx@entry=0x484473 <ngx_rtmp_relay_create_local_ctx>,
create_play_ctx=create_play_ctx@entry=0x4849e9 <ngx_rtmp_relay_create_remote_ctx>) at nginx-rtmp-module/ngx_rtmp_relay_module.c:593
at nginx-rtmp-module/ngx_rtmp_relay_module.c:645
respawn=respawn@entry=0) at src/os/unix/ngx_process.c:198
(gdb) l 588 in nginx-rtmp-module/ngx_rtmp_relay_module.c
which nginx version ?
/usr/local/nginx/sbin/nginx -V nginx version: nginx/1.6.0 built by gcc 4.8.2 (Ubuntu 4.8.2-19ubuntu1) TLS SNI support enabled configure arguments: --prefix=/usr/local/nginx --add-module=nginx-rtmp-module --with-http_ssl_module
Got the same shit on 1.7.5
I am using 1.7.5, everything seems OK!
Same problem on 1.7.5 , OS: Centos 6
The problem seems to occur with random static cam like these: pull rtmp://video2.earthcam.com/fecnetwork/ name=earthcam5868 swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf?20120504_a playPath=5868.flv static; pull rtmp://video2.earthcam.com/fecnetwork/ name=earthcam4516 swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf?20120504_a playPath=4516.flv static;
Just FYI. See also comments in #472.
I have recompiled nginx with debug option to get some more info:
2014/09/19 12:31:13 [debug] 2503#0: posted event 0000000000000000 2014/09/19 12:31:13 [debug] 2503#0: worker cycle 2014/09/19 12:31:13 [debug] 2503#0: accept mutex locked 2014/09/19 12:31:13 [debug] 2503#0: epoll timer: 52663 2014/09/19 12:31:13 [debug] 2503#0: epoll: fd:130 ev:0001 d:00007FB656F29979 2014/09/19 12:31:13 [debug] 2503#0: 953 post event 0000000001046A38 2014/09/19 12:31:13 [debug] 2503#0: timer delta: 1 2014/09/19 12:31:13 [debug] 2503#0: posted events 0000000001046A38 2014/09/19 12:31:13 [debug] 2503#0: posted event 0000000001046A38 2014/09/19 12:31:13 [debug] 2503#0: 953 delete posted event 0000000001046A38 2014/09/19 12:31:13 [debug] 2503#0: 953 recv: fd:130 41 of 530 2014/09/19 12:31:13 [debug] 2503#0: 953 RTMP bheader fmt=0 csid=3 2014/09/19 12:31:13 [debug] 2503#0: 953 RTMP mheader fmt=0 amf_cmd (20) time=0+0 mlen=29 len=0 msid=0 2014/09/19 12:31:13 [debug] 2503#0: 953 RTMP recv amf_cmd (20) csid=3 timestamp=0 mlen=29 msid=0 nbufs=1 2014/09/19 12:31:13 [debug] 2503#0: 953 nhandlers: 1 2014/09/19 12:31:13 [debug] 2503#0: 953 calling handler 0 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF read (1) 02 '?' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF read (2) 00 07 '??' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF read (7) 5F 72 65 73 75 6C 74 '_result' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF func '_result' passed to handler 0/1 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF read (1) 00 '?' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF read (8) 40 00 00 00 00 00 00 00 '@???????' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF read (1) 05 '?' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF read (1) 00 '?' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF read (8) 3F F0 00 00 00 00 00 00 '????????' 2014/09/19 12:31:13 [debug] 2503#0: 953 relay: _result: level='' code='' description='' 2014/09/19 12:31:13 [debug] 2503#0: 953 create: amf nelts=6 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF write (1) 02 '?' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF write (2) 00 04 '??' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF write (4) 70 6C 61 79 'play' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF write (1) 00 '?' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF write (8) 00 00 00 00 00 00 00 00 '????????' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF write (1) 05 '?' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF write (1) 02 '?' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF write (2) 00 13 '??' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF write (19) 34 30 31 37 74 69 6D 65 73 73 71 75 61 72 65 2E '4017timessquare.' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF write (1) 00 '?' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF write (8) C0 9F 40 00 00 00 00 00 '??@?????' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF write (1) 00 '?' 2014/09/19 12:31:13 [debug] 2503#0: 953 AMF write (8) C0 8F 40 00 00 00 00 00 '??@?????' 2014/09/19 12:31:13 [debug] 2503#0: 953 RTMP prep amf_cmd (20) fmt=0 csid=5 timestamp=0 mlen=57 msid=1 nbufs=1 2014/09/19 12:31:13 [debug] 2503#0: 953 RTMP send nmsg=1, priority=0 #5 2014/09/19 12:31:13 [debug] 2503#0: 953 send: fd:130 69 of 69 2014/09/19 12:31:13 [debug] 2503#0: posted event 0000000000000000 2014/09/19 12:31:13 [debug] 2503#0: 953 create: set_buflen msid=1 buflen=5000 2014/09/19 12:31:13 [debug] 2503#0: 953 RTMP prep user (4) fmt=0 csid=2 timestamp=0 mlen=10 msid=0 nbufs=1 2014/09/19 12:31:13 [debug] 2503#0: 953 RTMP send nmsg=1, priority=0 #6 2014/09/19 12:31:13 [debug] 2503#0: 953 send: fd:130 22 of 22 2014/09/19 12:31:13 [debug] 2503#0: posted event 0000000000000000 2014/09/19 12:31:13 [debug] 2503#0: malloc: 00000000013C6BA0:8192 2014/09/19 12:31:13 [debug] 2503#0: 953 auto_push: reconnect 2014/09/19 12:31:13 [debug] 2503#0: 953 auto_push: connect slot=1 pid=26 socket='unix:/tmp/nginx-rtmp.1' name='09rlDGOUzS7DkeX' 2014/09/19 12:31:13 [info] 2503#0: 953 relay: create push name='09rlDGOUzS7DkeX' app='' playpath='' url='unix:/tmp/nginx-rtmp.1', client: video2.earthcam.com/fecnetwork/, server: ngx-relay 2014/09/19 12:31:13 [debug] 2503#0: relay: create remote context 2014/09/19 12:31:13 [debug] 2503#0: posix_memalign: 00000000013C4B80:4096 @16 2014/09/19 12:31:13 [debug] 2503#0: socket 133 2014/09/19 12:31:13 [debug] 2503#0: epoll add connection: fd:133 ev:80002005 2014/09/19 12:31:13 [debug] 2503#0: connect to unix:/tmp/nginx-rtmp.1, fd:133 #955 2014/09/19 12:31:13 [debug] 2503#0: connected 2014/09/19 12:31:13 [debug] 2503#0: posix_memalign: 00000000013C8BB0:4096 @16 2014/09/19 12:31:13 [debug] 2503#0: 955 setting chunk_size=128 2014/09/19 12:31:13 [debug] 2503#0: 955 posix_memalign: 00000000013C9BC0:4096 @16 2014/09/19 12:31:13 [debug] 2503#0: 955 handshake: start client handshake 2014/09/19 12:31:13 [debug] 2503#0: 955 handshake: allocating buffer 2014/09/19 12:31:13 [debug] 2503#0: *955 event timer add: 133: 60000:1411129933565
Please update the code from master. It should be fixed now.
I rebuild the server with nginx 1.6.2 and git://github.com/arut/nginx-rtmp-module.git But I keep getting these errors: 2014/09/21 12:12:45 [alert] 25#0: worker process 22872 exited on signal 11 2014/09/21 12:12:49 [alert] 25#0: worker process 23490 exited on signal 11 2014/09/21 12:12:52 [alert] 25#0: worker process 24065 exited on signal 11 2014/09/21 12:12:56 [alert] 25#0: worker process 24666 exited on signal 11 2014/09/21 12:12:59 [alert] 25#0: worker process 25271 exited on signal 11 2014/09/21 12:13:02 [alert] 25#0: worker process 25888 exited on signal 11 2014/09/21 12:13:06 [alert] 25#0: worker process 26506 exited on signal 11 2014/09/21 12:13:09 [alert] 25#0: worker process 27117 exited on signal 11 2014/09/21 12:13:12 [alert] 25#0: worker process 27697 exited on signal 11 2014/09/21 12:13:16 [alert] 25#0: worker process 28313 exited on signal 11 2014/09/21 12:13:19 [alert] 25#0: worker process 28931 exited on signal 11 2014/09/21 12:13:22 [alert] 25#0: worker process 29548 exited on signal 11
@arut Awesome! Just tested, works like a charm in both scenarios described above (HTTP only and with RTMP config). Thanks again a ton for fixing this so quickly.
The config what I use for testing and, still get these errors:
worker_processes 2;
error_log logs/error.log;
pid logs/nginx.pid;
rtmp_auto_push on;
events {
worker_connections 1024;
}
rtmp {
server {
listen 1935;
application cam {
live on;
drop_idle_publisher 120s;
wait_video on;
wait_key on;
pull rtmp://video2.earthcam.com/fecnetwork/ name=09rlDGOUzS7DkeX swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=4017timessquare.flv static;
pull rtmp://srv13.arkasis.nl:80/480/default.stream name=14RZ9n7F8fyNrOe static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=1bqZDOjhySLkRya swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=3847fairmont.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=1z4d63JubSYW7V2 swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=5097.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=504kEWburFgXp2w swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=miamivishd5.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=5zpZ1gyI3UX9EQj swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=5204.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=67X4lXmuotgwzvV swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=5019.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=8W7PjZvt4U3j5Lq swfUrl=http://earthcam.com/swf/cam_player_v2/abbeyroad_player.swf playPath=AbbeyRoadHD1.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=8eVmoxNU2Id5NA6 swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=6649.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=Eoel4rNSnHR4yvX swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=4798.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=EvwVlDWi0tr5mD8 swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=5868.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=GznNkaRT9tj3AeX swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=4098.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=N06eAEWuZIBpMDO swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=5096.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=NY9mv5oFLUxeAPW swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=moscowHD1.flv static;
pull rtmp://149.3.141.18/mila/ name=OkNVQ3yIvuGzP1Z playPath=bardarbunga2.stream static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=Q0aDWx2C9iq3nlw swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=4559.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=YD9r2bOipF9Q53J swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=5023.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=Zmq8E4zsACg0zak swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=4516.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=a85eQWltrilxz2X swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=4931.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=dq6w2LdsYcQ2EzY swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=4831.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=eD1YbZ4u0idnqVv swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=4369.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=nwAEVd3SaU3MakN swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=4282.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=o9V1AzOSbINxrZn swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=lacitytours1.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=ogewqKbCdTG95j6 swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=3724trailnet2.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=pDeKa1yTDFMaMNB swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=eclasvegas.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=rPo8ZxGU4H5kjO4 swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=4717.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=rpkld7jTZf2KP0Q swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=tokyo1.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=vGzAKnEULFNpeR9 swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=hotelvictoria2.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=vwVanbas1fblZ17 swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=4135.flv static;
pull rtmp://video2.earthcam.com/fecnetwork/ name=w1mRNp4tYcPNDVP swfUrl=http://www.earthcam.com/swf/cam_player_v2/ecnPlayer.swf playPath=4054.flv static;
}
}
}
http {
server_tokens off;
include mime.types;
default_type application/octet-stream;
access_log logs/access.log;
sendfile on;
#tcp_nopush on;
keepalive_timeout 65s;
gzip on;
server {
listen 80;
access_log logs/host.access.log;
# rtmp statistics
location /stat {
rtmp_stat all;
rtmp_stat_stylesheet stat.xsl;
}
location /stat.xsl {
# you can move stat.xsl to a different location.
root /stat;
}
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root html;
}
}
}
Without the pull section I don't get these errors, but with the static cams the worker process stops approximately in every 10 min.
@gmarcz what if you recompile with nginx 1.7.5?
@gmarcz @arut I quickly tested the above configuration with nginx 1.7.5 and nginx-rtmp-module from master. No errors here.
Strange because I got the same errors with 1.7.5
nginx version: nginx/1.7.5 built by gcc 4.8.2 (Ubuntu 4.8.2-19ubuntu1) TLS SNI support enabled configure arguments: --prefix=/usr/local/nginx --add-module=nginx-rtmp-module --with-http_ssl_module
2014/09/22 13:23:19 [alert] 24#0: worker process 3841 exited on signal 11 2014/09/22 13:26:55 [alert] 24#0: worker process 7830 exited on signal 11 2014/09/22 13:37:00 [alert] 24#0: worker process 7836 exited on signal 11 2014/09/22 13:53:19 [alert] 24#0: worker process 7844 exited on signal 11
Program received signal SIGSEGV, Segmentation fault.
0x00000000004883ee in ngx_rtmp_relay_create (s=s@entry=0x16953d0, name=name@entry=0x7fff69c81bb0, target=target@entry=0x7fff69c81d80,
create_publish_ctx=create_publish_ctx@entry=0x487db1
create_publish_ctx=create_publish_ctx@entry=0x487db1 <ngx_rtmp_relay_create_local_ctx>,
create_play_ctx=create_play_ctx@entry=0x48832c <ngx_rtmp_relay_create_remote_ctx>) at nginx-rtmp-module/ngx_rtmp_relay_module.c:593
at nginx-rtmp-module/ngx_rtmp_relay_module.c:645
respawn=respawn@entry=0) at src/os/unix/ngx_process.c:198
(gdb) l 588 } 589 590 hash = ngx_hash_key(name->data, name->len); 591 cctx = &racf->ctx[hash % racf->nbuckets]; 592 for (; *cctx; cctx = &(_cctx)->next) { 593 if ((_cctx)->name.len == name->len 594 && !ngx_memcmp(name->data, (_cctx)->name.data, 595 name->len)) 596 { 597 break; (gdb)
@gmarcz please set up debug logging and post more lines from error.log
@gmarcz Didn't wait long enough during my tests... I get the signal 11 error too after about 20min.
These are the last few lines from the exiting process:
2014/09/22 15:43:00 [debug] 353#0: 15 nhandlers: 5 2014/09/22 15:43:00 [debug] 353#0: 15 calling handler 0 2014/09/22 15:43:00 [debug] 353#0: 15 calling handler 1 2014/09/22 15:43:00 [debug] 353#0: 15 calling handler 2 2014/09/22 15:43:00 [debug] 353#0: 15 event timer: 32, old: 1411400700169, new: 1411400700175 2014/09/22 15:43:00 [debug] 353#0: 15 live: audio packet timestamp=608510 2014/09/22 15:43:00 [debug] 353#0: 15 RTMP prep audio (8) fmt=1 csid=6 timestamp=64 mlen=507 msid=1 nbufs=1 2014/09/22 15:43:00 [debug] 353#0: 44 live: rel audio packet delta=64 2014/09/22 15:43:00 [debug] 353#0: 44 RTMP send nmsg=1, priority=0 #41 2014/09/22 15:43:00 [debug] 353#0: 44 send: fd:55 515 of 515 2014/09/22 15:43:00 [debug] 353#0: 15 calling handler 3 2014/09/22 15:43:00 [debug] 353#0: 15 calling handler 4 2014/09/22 15:43:00 [debug] 353#0: 15 reusing formerly read data: 18 2014/09/22 15:43:00 [debug] 353#0: 15 RTMP bheader fmt=1 csid=7 2014/09/22 15:43:00 [debug] 353#0: 15 RTMP mheader fmt=1 video (9) time=608351+68 mlen=724 len=0 msid=1 2014/09/22 15:43:00 [debug] 353#0: 15 recv: fd:32 -1 of 512 2014/09/22 15:43:00 [debug] 353#0: 15 recv() not ready (11: Resource temporarily unavailable) 2014/09/22 15:43:00 [debug] 353#0: timer delta: 1 2014/09/22 15:43:00 [debug] 353#0: worker cycle 2014/09/22 15:43:00 [debug] 353#0: accept mutex lock failed: 0 2014/09/22 15:43:00 [debug] 353#0: epoll timer: 500 2014/09/22 15:43:00 [debug] 353#0: epoll: fd:44 ev:0001 d:00007F6DEA077888 2014/09/22 15:43:00 [debug] 353#0: 27 recv: fd:44 403 of 403 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP recv audio (8) csid=6 timestamp=608456 mlen=507 msid=1 nbufs=1 2014/09/22 15:43:00 [debug] 353#0: 27 nhandlers: 5 2014/09/22 15:43:00 [debug] 353#0: 27 calling handler 0 2014/09/22 15:43:00 [debug] 353#0: 27 calling handler 1 2014/09/22 15:43:00 [debug] 353#0: 27 calling handler 2 2014/09/22 15:43:00 [debug] 353#0: 27 event timer: 44, old: 1411400700172, new: 1411400700176 2014/09/22 15:43:00 [debug] 353#0: 27 live: audio packet timestamp=608456 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP prep audio (8) fmt=1 csid=6 timestamp=66 mlen=507 msid=1 nbufs=1 2014/09/22 15:43:00 [debug] 353#0: 32 live: rel audio packet delta=66 2014/09/22 15:43:00 [debug] 353#0: 32 RTMP send nmsg=1, priority=0 #222 2014/09/22 15:43:00 [debug] 353#0: 32 send: fd:17 515 of 515 2014/09/22 15:43:00 [debug] 353#0: 27 calling handler 3 2014/09/22 15:43:00 [debug] 353#0: 27 calling handler 4 2014/09/22 15:43:00 [debug] 353#0: 27 reusing formerly read data: 19 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP bheader fmt=1 csid=7 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP mheader fmt=1 video (9) time=608311+68 mlen=2233 len=0 msid=1 2014/09/22 15:43:00 [debug] 353#0: 27 recv: fd:44 511 of 511 2014/09/22 15:43:00 [debug] 353#0: 27 reusing formerly read data: 10 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP bheader fmt=3 csid=7 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP mheader fmt=3 video (9) time=608311+68 mlen=2233 len=512 msid=1 2014/09/22 15:43:00 [debug] 353#0: 27 recv: fd:44 520 of 520 2014/09/22 15:43:00 [debug] 353#0: 27 reusing formerly read data: 17 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP bheader fmt=3 csid=7 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP mheader fmt=3 video (9) time=608311+68 mlen=2233 len=1024 msid=1 2014/09/22 15:43:00 [debug] 353#0: 27 recv: fd:44 513 of 513 2014/09/22 15:43:00 [debug] 353#0: 27 reusing formerly read data: 17 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP bheader fmt=3 csid=7 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP mheader fmt=3 video (9) time=608311+68 mlen=2233 len=1536 msid=1 2014/09/22 15:43:00 [debug] 353#0: 27 recv: fd:44 513 of 513 2014/09/22 15:43:00 [debug] 353#0: 27 reusing formerly read data: 17 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP bheader fmt=3 csid=7 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP mheader fmt=3 video (9) time=608311+68 mlen=2233 len=2048 msid=1 2014/09/22 15:43:00 [debug] 353#0: 27 recv: fd:44 436 of 513 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP recv video (9) csid=7 timestamp=608379 mlen=2233 msid=1 nbufs=5 2014/09/22 15:43:00 [debug] 353#0: 27 nhandlers: 5 2014/09/22 15:43:00 [debug] 353#0: 27 calling handler 0 2014/09/22 15:43:00 [debug] 353#0: 27 calling handler 1 2014/09/22 15:43:00 [debug] 353#0: 27 calling handler 2 2014/09/22 15:43:00 [debug] 353#0: 27 event timer: 44, old: 1411400700172, new: 1411400700176 2014/09/22 15:43:00 [debug] 353#0: 27 live: video packet timestamp=608379 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP prep video (9) fmt=1 csid=7 timestamp=68 mlen=2233 msid=1 nbufs=1 2014/09/22 15:43:00 [debug] 353#0: 32 live: rel video packet delta=68 2014/09/22 15:43:00 [debug] 353#0: 32 RTMP send nmsg=1, priority=2 #223 2014/09/22 15:43:00 [debug] 353#0: 32 send: fd:17 2241 of 2241 2014/09/22 15:43:00 [debug] 353#0: 27 calling handler 3 2014/09/22 15:43:00 [debug] 353#0: 27 calling handler 4 2014/09/22 15:43:00 [debug] 353#0: 27 reusing formerly read data: 267 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP bheader fmt=2 csid=6 2014/09/22 15:43:00 [debug] 353#0: 27 RTMP mheader fmt=2 audio (8) time=608456+65 mlen=507 len=0 msid=1 2014/09/22 15:43:00 [debug] 353#0: 27 recv: fd:44 -1 of 263 2014/09/22 15:43:00 [debug] 353#0: 27 recv() not ready (11: Resource temporarily unavailable) 2014/09/22 15:43:00 [debug] 353#0: timer delta: 1 2014/09/22 15:43:00 [debug] 353#0: worker cycle 2014/09/22 15:43:00 [debug] 353#0: accept mutex lock failed: 0 2014/09/22 15:43:00 [debug] 353#0: epoll timer: 500 2014/09/22 15:43:00 [debug] 353#0: epoll: fd:29 ev:0001 d:00007F6DEA076BE1 2014/09/22 15:43:00 [debug] 353#0: 91 recv: fd:29 41 of 530 2014/09/22 15:43:00 [debug] 353#0: 91 RTMP bheader fmt=0 csid=3 2014/09/22 15:43:00 [debug] 353#0: 91 RTMP mheader fmt=0 amf_cmd (20) time=0+0 mlen=29 len=0 msid=0 2014/09/22 15:43:00 [debug] 353#0: 91 RTMP recv amf_cmd (20) csid=3 timestamp=0 mlen=29 msid=0 nbufs=1 2014/09/22 15:43:00 [debug] 353#0: 91 nhandlers: 1 2014/09/22 15:43:00 [debug] 353#0: 91 calling handler 0 2014/09/22 15:43:00 [debug] 353#0: 91 AMF read (1) 02 '?' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF read (2) 00 07 '??' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF read (7) 5F 72 65 73 75 6C 74 '_result' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF func '_result' passed to handler 0/1 2014/09/22 15:43:00 [debug] 353#0: 91 AMF read (1) 00 '?' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF read (8) 40 00 00 00 00 00 00 00 '@???????' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF read (1) 05 '?' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF read (1) 00 '?' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF read (8) 3F F0 00 00 00 00 00 00 '????????' 2014/09/22 15:43:00 [debug] 353#0: 91 relay: _result: level='' code='' description='' 2014/09/22 15:43:00 [debug] 353#0: 91 create: amf nelts=6 2014/09/22 15:43:00 [debug] 353#0: 91 AMF write (1) 02 '?' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF write (2) 00 04 '??' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF write (4) 70 6C 61 79 'play' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF write (1) 00 '?' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF write (8) 00 00 00 00 00 00 00 00 '????????' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF write (1) 05 '?' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF write (1) 02 '?' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF write (2) 00 08 '??' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF write (8) 34 32 38 32 2E 66 6C 76 '4282.flv' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF write (1) 00 '?' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF write (8) C0 9F 40 00 00 00 00 00 '??@?????' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF write (1) 00 '?' 2014/09/22 15:43:00 [debug] 353#0: 91 AMF write (8) C0 8F 40 00 00 00 00 00 '??@?????' 2014/09/22 15:43:00 [debug] 353#0: 91 RTMP prep amf_cmd (20) fmt=0 csid=5 timestamp=0 mlen=46 msid=1 nbufs=1 2014/09/22 15:43:00 [debug] 353#0: 91 RTMP send nmsg=1, priority=0 #5 2014/09/22 15:43:00 [debug] 353#0: 91 send: fd:29 58 of 58 2014/09/22 15:43:00 [debug] 353#0: 91 create: set_buflen msid=1 buflen=5000 2014/09/22 15:43:00 [debug] 353#0: 91 RTMP prep user (4) fmt=0 csid=2 timestamp=0 mlen=10 msid=0 nbufs=1 2014/09/22 15:43:00 [debug] 353#0: 91 RTMP send nmsg=1, priority=0 #6 2014/09/22 15:43:00 [debug] 353#0: 91 send: fd:29 22 of 22 2014/09/22 15:43:00 [debug] 353#0: malloc: 0000000002AA43A0:8192 2014/09/22 15:43:00 [debug] 353#0: 91 auto_push: reconnect 2014/09/22 15:43:00 [debug] 353#0: 91 auto_push: connect slot=1 pid=354 socket='unix:/tmp/nginx-rtmp.1' name='nwAEVd3SaU3MakN' 2014/09/22 15:43:00 [info] 353#0: 91 relay: create push name='nwAEVd3SaU3MakN' app='' playpath='' url='unix:/tmp/nginx-rtmp.1', client: video2.earthcam.com/fecnetwork/, server: ngx-relay 2014/09/22 15:43:00 [debug] 353#0: relay: create remote context 2014/09/22 15:43:00 [debug] 353#0: posix_memalign: 0000000002A4CE70:4096 @16 2014/09/22 15:43:00 [debug] 353#0: socket 68 2014/09/22 15:43:00 [debug] 353#0: epoll add connection: fd:68 ev:80002005 2014/09/22 15:43:00 [debug] 353#0: connect to unix:/tmp/nginx-rtmp.1, fd:68 #93 2014/09/22 15:43:00 [debug] 353#0: connected 2014/09/22 15:43:00 [debug] 353#0: posix_memalign: 0000000002C35330:4096 @16 2014/09/22 15:43:00 [debug] 353#0: 93 setting chunk_size=128 2014/09/22 15:43:00 [debug] 353#0: 93 posix_memalign: 0000000002C36340:4096 @16 2014/09/22 15:43:00 [debug] 353#0: 93 handshake: start client handshake 2014/09/22 15:43:00 [debug] 353#0: 93 handshake: allocating buffer 2014/09/22 15:43:00 [debug] 353#0: 93 event timer add: 68: 60000:1411400640176
Same problem in 1.7.5
Please try static-relay-cleanup
branch. The problem is in static pull + auto_push combination. On my Linux server it seems to be fixed now.
Thanks everyone in this thread for the reports. I'm waiting for feedback.
I've tested your fix playing 2 random streams from the configuration for 20min+; videos are still playing, no entries in error log. I'll leave them running for longer, but it looks good to me.
My compile configuration (with branch static-relay-cleanup):
nginx version: nginx/1.7.5 built by gcc 4.8.2 (Ubuntu 4.8.2-19ubuntu1) TLS SNI support enabled configure arguments: --add-module=../nginx-rtmp-module --add-module=../ngx_http_geoip2_module --add-module=../ngx_pagespeed --with-http_stub_status_module --with-http_addition_module --with-http_sub_module --with-http_auth_request_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_realip_module --with-http_spdy_module --with-http_flv_module --with-http_mp4_module --with-openssl=../openssl --with-http_ssl_module --with-google_perftools_module --with-http_xslt_module --with-http_random_index_module --with-http_secure_link_module --with-pcre-jit --with-file-aio --with-ipv6 --with-cc-opt='-I ../libmaxminddb/include -O2 -g -pipe -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' --with-ld-opt='-L ../libmaxminddb/src/.libs'
Update: 1h40m and the videos are still streaming with above configuration. I guess this works.
@arut thank you for your effort, I will test it with 1.6.2 and let you know if I found anything.
Server is running about 2h 49m seems to be stable for now.
We have the same problem with pulling, but our pulls aren't static.
2014/09/24 04:13:59 [alert] 15065#0: worker process 29253 exited on signal 11 2014/09/24 04:13:59 [alert] 15065#0: worker process 29274 exited on signal 11 2014/09/24 04:13:59 [alert] 15065#0: worker process 16387 exited on signal 11 2014/09/24 15:44:41 [alert] 15065#0: worker process 16840 exited on signal 11 2014/09/24 15:44:41 [alert] 15065#0: worker process 18055 exited on signal 11 2014/09/24 17:36:54 [alert] 15065#0: worker process 24840 exited on signal 11 2014/09/24 22:20:08 [alert] 15065#0: worker process 29246 exited on signal 11 2014/09/24 22:20:08 [alert] 15065#0: worker process 16841 exited on signal 11
nginx 1.7.5, latest nginx-rtmp version. The static-relay-cleanup didn't help.
It's hard to debug because of heavy traffic. Will the debug log be enough, or I should also run gdb?
@developer222 yes, please report gdb backtrace
Hello,
Core was generated by `nginx: worker process '. Program terminated with signal 11, Segmentation fault.
at ../nginx-rtmp-module-static-relay-cleanup/ngx_rtmp_relay_module.c:1367
1367 for (cctx = &ctx->publish->play; _cctx; cctx = &(_cctx)->next, ++n); (gdb) bt full
at ../nginx-rtmp-module-static-relay-cleanup/ngx_rtmp_relay_module.c:1367
n = 46
racf = 0x2266ff8
ctx = 0x59058c0
cctx = 0x8776500cbf9e626e
hash = 35911760
at ../nginx-rtmp-module-static-relay-cleanup/ngx_rtmp_relay_module.c:1431
No locals.
at ../nginx-rtmp-module-static-relay-cleanup/ngx_rtmp_cmd_module.c:705
No locals.
at ../nginx-rtmp-module-static-relay-cleanup/ngx_rtmp_notify_module.c:1350
nscf = 0x22628e0
ci = {url = 0x7fe2e609cb10, create = 0x3366a80, filter = 0x2262348, sink = 0x2cb84d0,
handle = 0x2cb84d0, arg = 0x2cb83e0, argsize = 53964160}
url = 0x0
at ../nginx-rtmp-module-static-relay-cleanup/ngx_rtmp_cmd_module.c:698
No locals.
at ../nginx-rtmp-module-static-relay-cleanup/ngx_rtmp.c:813
cmcf = 0x2260b60
ch = 0x2260f70
hh = 0x226cdd0
n = 0
at ../nginx-rtmp-module-static-relay-cleanup/ngx_rtmp_init.c:283
s = 0x5904b38
c = 0x7fe2e60b6f40
cscf = 0x2262348
at src/event/ngx_event_posted.c:40
ev = 0x5904b40
flags = 1
timer = 1
delta = 1
at src/os/unix/ngx_process_cycle.c:816
worker = 0
i = 140736937806336
c = 0x0
data=0x0, name=0x52da73 "worker process", respawn=-3) at src/os/unix/ngx_process.c:198
on = 1
pid = 0
s = 0
at src/os/unix/ngx_process_cycle.c:364
i = 0
ch = {command = 1, pid = 0, slot = 0, fd = 0}
title = 0x226d910 "master process /usr/sbin/nginx -c /etc/nginx/nginx.conf"
p = 0x226d947 ""
size = 56
i = 3
n = 0
sigio = 35869112
set = {__val = {0 <repeats 16 times>}}
itv = {it_interval = {tv_sec = 0, tv_usec = 1}, it_value = {tv_sec = 11, tv_usec = 18}}
live = 0
delay = 4233362
ls = 0x5
ccf = 0x2236258
i = 88
log = 0x778f20
cycle = 0x22351a0
init_cycle = {conf_ctx = 0x0, pool = 0x2234be0, log = 0x778f20, new_log = {log_level = 0, file = 0x0,
connection = 0, handler = 0, data = 0x0, action = 0x0, next = 0x0}, log_use_stderr = 0,
files = 0x0, free_connections = 0x0, free_connection_n = 0, reusable_connections_queue = {prev = 0x0,
next = 0x0}, listening = {elts = 0x0, nelts = 0, size = 0, nalloc = 0, pool = 0x0}, paths = {
elts = 0x0, nelts = 0, size = 0, nalloc = 0, pool = 0x0}, open_files = {last = 0x0, part = {
elts = 0x0, nelts = 0, next = 0x0}, size = 0, nalloc = 0, pool = 0x0}, shared_memory = {
last = 0x0, part = {elts = 0x0, nelts = 0, next = 0x0}, size = 0, nalloc = 0, pool = 0x0},
connection_n = 0, files_n = 0, connections = 0x0, read_events = 0x0, write_events = 0x0,
old_cycle = 0x0, conf_file = {len = 21, data = 0x7fffdf2f5f73 "ss"}, conf_param = {len = 0,
data = 0x0}, conf_prefix = {len = 11, data = 0x7fffdf2f5f73 "ss"}, prefix = {len = 11,
data = 0x529bf0 "/etc/nginx/"}, lock_file = {len = 0, data = 0x0}, hostname = {len = 0, data = 0x0}}
nginx -V nginx version: nginx/1.6.2 built by gcc 4.4.7 20120313 (Red Hat 4.4.7-4) (GCC) TLS SNI support enabled configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --user=nginx --group=nginx --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_stub_status_module --with-mail --with-mail_ssl_module --with-file-aio --with-ipv6 --with-cc-opt='-O0 -g' --with-debug --add-module=../nginx-rtmp-module-static-relay-cleanup
nginx config:
user nginx; worker_processes 1; worker_rlimit_nofile 200000;
worker_rlimit_core 2000M; working_directory /tmp/cores;
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;
events { worker_connections 51200; use epoll; multi_accept on; accept_mutex off; # for per worker listeners }
rtmp { server { listen x.x.x.x:1935; notify_relay_redirect on; max_streams 3200; meta off;
application cid/aid {
live on;
pull rtmp://wowza/live;
}
...
Do you have any insights how to mitigate this issue?
Thanks!
@arut , sorry for the long delay. Here's the gdb output with backtrace. What else may I provide to clarify the problem? Thanks.
gdb /usr/sbin/nginx.debug core GNU gdb (GDB) 7.4.1-debian Copyright (C) 2012 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "x86_64-linux-gnu". For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>... Reading symbols from /usr/sbin/nginx.debug...done. [New LWP 30422]
warning: Can't read pathname for load map: Input/output error. [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". Core was generated by `nginx: worker process '. Program terminated with signal 11, Segmentation fault.
1367 for (cctx = &ctx->publish->play; _cctx; cctx = &(_cctx)->next, ++n); (gdb) backtrace full
n = <optimized out>
racf = 0x27a5718
ctx = 0x286d980
cctx = 0x7023580aec7bedff
hash = <optimized out>
No locals.
nscf = <optimized out>
ci = {url = 0x7ffd75f03cd0, create = 0xe00000000, filter = 0x27982a0, sink = 0x348f8d8, handle = 0x2799df0, arg = 0x27a6a38, argsize = 55113944}
url = <optimized out>
ch = 0x2799df0
hh = <optimized out>
n = <optimized out>
s = 0x348f8d8
c = 0x7ffd75f03cd0
cscf = 0x279ad78
q = 0x348f930
ev = 0x348f8e0
flags = <optimized out>
timer = <optimized out>
delta = <optimized out>
worker = 0
i = <optimized out>
c = <optimized out>
on = 1
pid = 0
s = <optimized out>
i = <optimized out>
live = <optimized out>
n = <optimized out>
ch = {command = 2, pid = 24602, slot = 0, fd = -1}
ccf = <optimized out>
title = <optimized out>
p = <optimized out>
size = <optimized out>
i = <optimized out>
n = <optimized out>
sigio = 0
set = {__val = {0 <repeats 16 times>}}
itv = {it_interval = {tv_sec = 5015002, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}}
live = <optimized out>
delay = 0
ls = <optimized out>
ccf = 0x27993c0
i = <optimized out>
log = 0x714ea0
cycle = 0x27982a0
init_cycle = {conf_ctx = 0x0, pool = 0x2797b70, log = 0x714ea0, new_log = {log_level = 0, file = 0x0, connection = 0, handler = 0, data = 0x0, writer = 0, wdata = 0x0, action = 0x0, next = 0x0}, log_use_stderr = 0, files = 0x0, free_connections = 0x0,
free_connection_n = 0, reusable_connections_queue = {prev = 0x0, next = 0x0}, listening = {elts = 0x0, nelts = 0, size = 0, nalloc = 0, pool = 0x0}, paths = {elts = 0x0, nelts = 0, size = 0, nalloc = 0, pool = 0x0}, open_files = {last = 0x0, part = {elts = 0x0,
nelts = 0, next = 0x0}, size = 0, nalloc = 0, pool = 0x0}, shared_memory = {last = 0x0, part = {elts = 0x0, nelts = 0, next = 0x0}, size = 0, nalloc = 0, pool = 0x0}, connection_n = 0, files_n = 0, connections = 0x0, read_events = 0x0, write_events = 0x0,
old_cycle = 0x0, conf_file = {len = 28, data = 0x7fffc3ce6e2e "rocess"}, conf_param = {len = 0, data = 0x0}, conf_prefix = {len = 11, data = 0x7fffc3ce6e2e "rocess"}, prefix = {len = 11, data = 0x4c8701 "/etc/nginx/"}, lock_file = {len = 0, data = 0x0},
hostname = {len = 0, data = 0x0}}
ccf = <optimized out>
Hi,
I have came same error in nginx 1.10 version. worker process 8912 exited on signal 11
Can you explain where is exact problem.
Still there today on Nginx 1.11.1
try add
ssl_session_cache shared:SSL:10m;
in your nginx.conf and reload nginx
Hi All!
From time to time I see this error in error.log
2014/09/15 11:34:24 [alert] 27#0: worker process 28 exited on signal 11 2014/09/15 11:44:28 [alert] 27#0: worker process 21539 exited on signal 11 2014/09/15 12:34:48 [alert] 27#0: worker process 9652 exited on signal 11 2014/09/15 12:44:53 [alert] 27#0: worker process 15638 exited on signal 11 2014/09/15 12:49:20 [alert] 27#0: worker process 32588 exited on signal 11 2014/09/15 12:53:49 [alert] 27#0: worker process 4974 exited on signal 11 2014/09/15 13:12:06 [alert] 27#0: worker process 10578 exited on signal 11 2014/09/15 13:28:17 [alert] 27#0: worker process 2807 exited on signal 11 2014/09/15 13:35:06 [alert] 27#0: worker process 30753 exited on signal 11
I don't have to much experience to debug such problem, what I tried was to attache strace to the worker process, the last few lines from the log:
kill(26701, SIGKILL) = 0 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=26701, si_status=SIGKILL, si_utime=1, si_stime=0} --- gettimeofday({1410787697, 694141}, NULL) = 0 wait4(-1, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL}], WNOHANG, NULL) = 26701 write(4, "2014/09/15 13:28:17 [alert] 2807"..., 77) = 77 wait4(-1, 0x7fff044965fc, WNOHANG, NULL) = 0 rt_sigreturn() = 0 sendto(63, "\2\0\0\0\0\0\6\4\0\0\0\0\0\1\0\0\0\1", 18, 0, NULL, 0) = -1 EPIPE (Broken pipe) --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=2807, si_uid=65534} --- close(16) = 0 close(63) = 0 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x10} --- +++ killed by SIGSEGV +++
Can someone help to identify the root case of this problem ?
Thank you for your time.
Regards Gabor