fancy-rabbit opened this issue 13 years ago
Thanks. What version of nginx are you using?
0.8.53, built with your modified upstream hash module. But even when I'm not using the upstream hash module, reloading causes segfaults.
nginx version: 0.8.53
built by gcc 4.1.2 20080704 (Red Hat 4.1.2-48)
configure arguments: --prefix=/opt/xxx/nginx --with-http_stub_status_module --with-http_realip_module --pid-path=/var/run/nginx.pid --add-module=/usr/src/redhat/SOURCES/ngx_cache_purge-1.2 --add-module=/usr/src/redhat/SOURCES/nginx_upstream_hash_with_healthcheck-0.3.1 --add-module=/usr/src/redhat/SOURCES/nginx_healthcheck_for_upstreams
I'll look at it when I can. I've heard of this behavior before from other people, but I ignored it since nginx still worked before and after (the old process was segfaulting, not the new one). Is that the case for you? If you build with debug symbols, what stack trace do you see in a coredump?
1350        if (cycle[i]->connections[n].fd != (ngx_socket_t) -1) {
(gdb) bt
I've encountered a similar problem, but it looks like I solved it (at least so far) by adding four lines of checking code to ngx_http_healthcheck_clear_events.

--------code diff--------
```diff
 void ngx_http_healthcheck_clear_events(ngx_log_t *log) {
     ngx_uint_t i;

     ngx_log_debug0(NGX_LOG_DEBUG_HTTP, log, 0,
             "healthcheck: Clearing events");

     // Note: From what I can tell it is safe to ngx_del_timer events
     // that are not in the event tree
     for (i=0; i<ngx_http_healthchecks_arr->nelts; i++) {
+        if (ngx_http_healthchecks[i].conf->healthcheck_enabled) {
+            if (ngx_http_healthchecks[i].health_ev.timer_set)
                 ngx_del_timer(&ngx_http_healthchecks[i].health_ev);
+            if (ngx_http_healthchecks[i].ownership_ev.timer_set)
                 ngx_del_timer(&ngx_http_healthchecks[i].ownership_ev);
+        }
     }
 }
```
--------error log--------
```
May 28 17:54:48 tc_69_88 kernel: nginx[18763]: segfault at 8 ip 00000000004117c6 sp 00007fffe61be000 error 4 in nginx[400000+b0000]
May 28 17:54:48 tc_69_88 abrt[18772]: saved core dump of pid 18763 (/tmp/nginx/sbin/nginx) to /var/spool/abrt/ccpp-1338198888-18763.new/coredump (1273856 bytes)
May 28 17:54:48 tc_69_88 abrtd: Directory 'ccpp-1338198888-18763' creation detected
May 28 17:54:48 tc_69_88 abrtd: Executable '/tmp/nginx/sbin/nginx' doesn't belong to any package
May 28 17:54:48 tc_69_88 abrtd: Corrupted or bad crash /var/spool/abrt/ccpp-1338198888-18763 (res:4), deleting
```
--------GDB backtrace--------
In /var/log/messages there are lots of `kernel: nginx[28164]: segfault at 0000000000000018 rip 0000000000410a8f rsp 00007fffce7ff7d0 error 4` messages. When I reload nginx, pid 28164 belongs to the shutting-down nginx worker process. Every time I reload nginx there are segfaults, unless I delete all the healthcheck directives from the configuration file.
The core dump shows this:
```
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000410a8f in time ()
(gdb) bt
#0  0x0000000000410a8f in time ()
#1  0x0000000000417879 in time ()
#2  0x00000000004177a5 in time ()
#3  0x000000000041c49e in time ()
#4  0x000000000040424b in time ()
#5  0x0000003abac1d994 in __libc_start_main () from /lib64/libc.so.6
#6  0x0000000000402a59 in time ()
#7  0x00007fffce7ffb38 in ?? ()
#8  0x0000000000000000 in ?? ()
```