Closed indrekj closed 6 years ago
Also:
2018/02/01 08:33:45 [error] 13#13: SUB:WEBSOCKET:fsub->tmp_pool_use_count > 0 when deleting tmp_pool. this may cause trouble...
172.24.57.0 - - [01/Feb/2018:08:33:45 +0000] "GET /?channel_id=ca5aab76-84b8-4ab6-bb02-0b282a963bcc,2711ef27-32e4-463e-91e8-21690a186182 HTTP/1.1" 499 680 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.84 Safari/537.36"
2018/02/01 08:33:59 [notice] 7#7: signal 17 (SIGCHLD) received from 13
2018/02/01 08:33:59 [alert] 7#7: worker process 13 exited on signal 11 (core dumped)
2018/02/01 08:33:59 [notice] 7#7: start worker process 16
2018/02/01 08:33:59 [notice] 7#7: signal 29 (SIGIO) received
2018/02/01 08:34:00 [notice] 16#16: nchan: Received 1 interprocess alert delayed by 9 sec.
2018/02/01 08:35:22 [error] 14#14: MEMSTORE:01: multimsg 00005564E1964DA0 timeout!!
MEMSTORE:01 part seems new
That's definitely a bug. How easy is it for you to replicate it? Could you provide me the coredump of the process that crashed right after the SUB:WEBSOCKET:fsub->tmp_pool_use_count > 0
error?
I also need:
ulimit -a
uname -a
your nginx.conf (edit out whatever you need to keep private)
nginx -V
If you'd like, it may be faster to debug this directly on your server -- email me at leo@nchan.io to set that up.
ulimit -a:
/ # ulimit -a
-f: file size (blocks) unlimited
-t: cpu time (seconds) unlimited
-d: data seg size (kb) unlimited
-s: stack size (kb) 8192
-c: core file size (blocks) unlimited
-m: resident set size (kb) unlimited
-l: locked memory (kb) 64
-p: processes unlimited
-n: file descriptors 1048576
-v: address space (kb) unlimited
-w: locks unlimited
-e: scheduling priority 0
-r: real-time priority 0
uname -a
/ # uname -a
Linux nchan-3317647813-dhv4j 4.14.11-coreos #1 SMP Fri Jan 5 11:00:14 UTC 2018 x86_64 Linux
nginx -V
/ # nginx -V
nginx version: nginx/1.13.8
built by gcc 6.4.0 (Alpine 6.4.0)
built with OpenSSL 1.0.2n 7 Dec 2017
TLS SNI support enabled
configure arguments: --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-http_ssl_module --with-http_realip_module --with-http_addition_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_random_index_module --with-http_secure_link_module --with-http_stub_status_module --with-http_auth_request_module --with-http_xslt_module=dynamic --with-http_image_filter_module=dynamic --with-http_geoip_module=dynamic --with-http_perl_module=dynamic --with-threads --with-stream --with-stream_ssl_module --with-http_slice_module --with-mail --with-mail_ssl_module --with-file-aio --with-http_v2_module --with-ipv6 --add-dynamic-module=/usr/src/nchan-1.1.14
nginx config:
daemon off;
user nginx;
worker_processes 1;
error_log /dev/stderr info;
pid /var/run/nginx.pid;
load_module "modules/ngx_nchan_module.so";
events {
worker_connections 1024;
}
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
access_log /dev/stdout;
server {
listen 80;
proxy_read_timeout 1d;
nchan_websocket_ping_interval 5;
location = /ack {
return 204;
}
location = /publish {
nchan_publisher;
nchan_channel_id $arg_channel_id;
nchan_message_buffer_length 50;
nchan_message_timeout 1h;
}
location = / {
nchan_pubsub;
nchan_subscriber_channel_id $arg_channel_id;
nchan_channel_id_split_delimiter ",";
nchan_subscriber_last_message_id $arg_last_message_id;
nchan_publisher_channel_id ackchannel;
nchan_publisher_upstream_request /ack;
}
}
}
I'll try to reproduce it locally.
Last week I removed nchan_publisher_upstream_request
from the configuration and it seemed to get stable. Will let you know if there are any more incidents.
Thanks -- I'll let you know when i find it.
Can you retest this in master to see if this issue persists? I haven't addressed it directly, but I refactors a whole bunch of possibly related code, and may have incidentally fixed this issue.
This error is now impossible in version 1.2.0, as the relevant code has been removed and refactored.
nchan: v1.1.14 nginx: v1.13.8
This happens time to time. The server currently has quite low traffic.
This usually ends up with: