Closed streamingsystems closed 1 year ago
More info. The stack trace:
I am wondering if it has to do with this fix in nginx:
https://github.com/nginx/nginx/commit/d52e5684437aedd6bb74d6b2b602b7306f4212ba
Or this fix:
https://github.com/nginx/nginx/commit/593bb543873ac57efade09dbcc2f93ab6b4d5ab0
Your nginx.conf file?
My nginx.conf:
worker_processes auto;
...
rtmp_auto_push on;
Nginx version:
root@debian:~# /usr/local/nginx/sbin/nginx -V
nginx version: nginx/1.24.0
built by gcc 10.2.1 20210110 (Debian 10.2.1-6)
built with OpenSSL 1.1.1n 15 Mar 2022
TLS SNI support enabled
configure arguments: --with-debug --add-module=../nginx-http-flv-module
Executed "reload" but didn't reproduce the bug:
2023/05/14 16:14:14 [notice] 14873#0: signal process started
2023/05/14 16:15:28 [notice] 14976#0: signal process started
2023/05/14 16:18:08 [notice] 15098#0: signal process started
2023/05/14 16:28:53 [notice] 15472#0: signal process started
I fixed this bug.
This is what I found:
In the nginx_rtmp_auto_push_module.c on line 323:
There is this code:
old = ls->previous->connection;
if (ngx_del_event(old->read, NGX_READ_EVENT, NGX_CLOSE_EVENT)
== NGX_ERROR)
{
return NGX_ERROR;
}
old->fd = (ngx_socket_t) -1;
What I am seeing if that "old" is null, so when this code is executing it's crashing because it's trying to do old->read on null.
It happens on every server we have when we issue a "reload".
To fix it I made this change:
old = ls->previous->connection;
if (old) {
if (ngx_del_event(old->read, NGX_READ_EVENT, NGX_CLOSE_EVENT)
== NGX_ERROR)
{
return NGX_ERROR;
}
old->fd = (ngx_socket_t) -1;
}
It now reloads perfectly in our entire environment.
I am wondering what conditions in the new 1.24 would cause ls->previous->connection to be null, which would mean that the old variable is null, which would crash the old->read.
/sbin/nginx -V nginx version: nginx/1.24.0 built by gcc 8.5.0 20210514 (Red Hat 8.5.0-16) (GCC) built with OpenSSL 1.1.1k FIPS 25 Mar 2021 TLS SNI support enabled configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib64/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-openssl-opt=enable-ktls --with-cc-opt='-O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -fPIC' --with-ld-opt='-Wl,-z,relro -Wl,-z,now -pie'
We have our configs broken up into different files but this is the various configs:
rtmp_auto_push on;
rtmp {
out_queue 4096;
out_cork 8;
max_streams 128;
timeout 10s;
drop_idle_publisher 10s;
respawn off;
server {
.....
}
}
user nginx; worker_processes auto; worker_rlimit_nofile 32768; pid /run/nginx.pid;
include /usr/share/nginx/modules/*.conf;
events { worker_connections 2048; use epoll; multi_accept on; }
I found the reason: if the module is compiled as a dynamic module, the bug will be 100% reproduced. What you mentioned above doesn't perfectly resolve the issue and will lead to socket leak. I will fix the issue soon. Thank you for reporting the bug!
Great thanks for the information. While you’re in there fixing that bug do you want to also look at the proxy protocol and why it’s not working? I can show you the code changes that I made.On May 15, 2023, at 6:05 AM, winshining @.***> wrote: I found the reason: if the module is compiled as a dynamic module, the bug will be 100% reproduced. What you mentioned above doesn't perfectly resolve the issue and will lead to socket leak. I will fix the issue soon. Thank you for reporting the bug!
—Reply to this email directly, view it on GitHub, or unsubscribe.You are receiving this because you authored the thread.Message ID: @.***>
Fixed, please fetch the latest code to test. I will take a look at proxy protocol issue you have posted to me in my spare time.
Hi All,
We recently upgraded to nginx 1.24 and compiled this module properly (as we have always done). The module works great still, however after a "reload" of nginx we are seeing this in our nginx error.log and it keeps repeating forever and causes the module to be inoperable. When we do a nginx "restart" everything goes back to working fine.
When I turn auto_push "off" the problem does not happen, so it's something specific to the auto push.
This has been working fine for years up until a recent upgrade to nginx 1.24.
Any help would be appreciated.
2023/05/13 10:27:04 [notice] 2190648#2190648: start worker process 3310276 2023/05/13 10:27:04 [debug] 3310275#3310275: auto_push: creating sockets 2023/05/13 10:27:04 [debug] 3310275#3310275: auto_push: create socket '/tmp/nginx-http-flv.9' 2023/05/13 10:27:04 [debug] 3310275#3310275: auto_push: delete existing socket '/tmp/nginx-http-flv.9' 2023/05/13 10:27:04 [debug] 3310276#3310276: auto_push: creating sockets 2023/05/13 10:27:04 [debug] 3310276#3310276: auto_push: create socket '/tmp/nginx-http-flv.15' 2023/05/13 10:27:04 [debug] 3310276#3310276: auto_push: delete existing socket '/tmp/nginx-http-flv.15' 2023/05/13 10:27:04 [notice] 2190648#2190648: signal 17 (SIGCHLD) received from 3310270 2023/05/13 10:27:04 [alert] 2190648#2190648: worker process 3310270 exited on signal 11 2023/05/13 10:27:04 [alert] 2190648#2190648: worker process 3310272 exited on signal 11 2023/05/13 10:27:04 [notice] 2190648#2190648: start worker process 3310277 2023/05/13 10:27:04 [notice] 2190648#2190648: start worker process 3310278 2023/05/13 10:27:04 [notice] 2190648#2190648: signal 29 (SIGIO) received 2023/05/13 10:27:04 [notice] 2190648#2190648: signal 17 (SIGCHLD) received from 3310272 2023/05/13 10:27:04 [alert] 2190648#2190648: worker process 3310269 exited on signal 11 2023/05/13 10:27:04 [notice] 2190648#2190648: start worker process 3310279 2023/05/13 10:27:04 [debug] 3310277#3310277: auto_push: creating sockets 2023/05/13 10:27:04 [debug] 3310277#3310277: auto_push: create socket '/tmp/nginx-http-flv.10' 2023/05/13 10:27:04 [debug] 3310277#3310277: auto_push: delete existing socket '/tmp/nginx-http-flv.10' 2023/05/13 10:27:04 [debug] 3310278#3310278: auto_push: creating sockets 2023/05/13 10:27:04 [debug] 3310278#3310278: auto_push: create socket '/tmp/nginx-http-flv.11' 2023/05/13 10:27:04 [debug] 3310278#3310278: auto_push: delete existing socket '/tmp/nginx-http-flv.11' 2023/05/13 10:27:04 [debug] 3310279#3310279: auto_push: creating sockets 2023/05/13 10:27:04 [debug] 3310279#3310279: auto_push: create socket '/tmp/nginx-http-flv.12' 2023/05/13 10:27:04 [debug] 3310279#3310279: auto_push: delete existing socket '/tmp/nginx-http-flv.12' 2023/05/13 10:27:04 [notice] 2190648#2190648: signal 29 (SIGIO) received 2023/05/13 10:27:04 [notice] 2190648#2190648: signal 29 (SIGIO) received 2023/05/13 10:27:04 [notice] 2190648#2190648: signal 17 (SIGCHLD) received from 3310271 2023/05/13 10:27:04 [alert] 2190648#2190648: worker process 3310271 exited on signal 11 2023/05/13 10:27:04 [notice] 2190648#2190648: start worker process 3310280 2023/05/13 10:27:04 [debug] 3310280#3310280: auto_push: creating sockets 2023/05/13 10:27:04 [debug] 3310280#3310280: auto_push: create socket '/tmp/nginx-http-flv.14' 2023/05/13 10:27:04 [debug] 3310280#3310280: auto_push: delete existing socket '/tmp/nginx-http-flv.14' 2023/05/13 10:27:04 [notice] 2190648#2190648: signal 17 (SIGCHLD) received from 3310276 2023/05/13 10:27:04 [alert] 2190648#2190648: worker process 3310276 exited on signal 11 2023/05/13 10:27:04 [notice] 2190648#2190648: start worker process 3310281 2023/05/13 10:27:04 [notice] 2190648#2190648: signal 29 (SIGIO) received 2023/05/13 10:27:04 [notice] 2190648#2190648: signal 29 (SIGIO) received 2023/05/13 10:27:04 [notice] 2190648#2190648: signal 17 (SIGCHLD) received from 3310273 2023/05/13 10:27:04 [alert] 2190648#2190648: worker process 3310273 exited on signal 11 2023/05/13 10:27:04 [alert] 2190648#2190648: worker process 3310274 exited on signal 11 2023/05/13 10:27:04 [notice] 2190648#2190648: start worker process 3310282 2023/05/13 10:27:04 [notice] 2190648#2190648: start worker process 3310283 2023/05/13 10:27:04 [notice] 2190648#2190648: signal 17 (SIGCHLD) received from 3310274 2023/05/13 10:27:04 [debug] 3310281#3310281: auto_push: creating sockets 2023/05/13 10:27:04 [debug] 3310281#3310281: auto_push: create socket '/tmp/nginx-http-flv.15' 2023/05/13 10:27:04 [debug] 3310281#3310281: auto_push: delete existing socket '/tmp/nginx-http-flv.15' 2023/05/13 10:27:04 [debug] 3310282#3310282: auto_push: creating sockets 2023/05/13 10:27:04 [debug] 3310282#3310282: auto_push: create socket '/tmp/nginx-http-flv.8' 2023/05/13 10:27:04 [debug] 3310282#3310282: auto_push: delete existing socket '/tmp/nginx-http-flv.8' 2023/05/13 10:27:04 [debug] 3310283#3310283: auto_push: creating sockets 2023/05/13 10:27:04 [debug] 3310283#3310283: auto_push: create socket '/tmp/nginx-http-flv.13' 2023/05/13 10:27:04 [debug] 3310283#3310283: auto_push: delete existing socket '/tmp/nginx-http-flv.13'