kvspb / nginx-auth-ldap

LDAP authentication module for nginx
BSD 2-Clause "Simplified" License

Worker Processes Get Stuck in an Infinite Loop Eating 100% CPU #62

Open bhudgens opened 10 years ago

bhudgens commented 10 years ago

Summary

We have seen a condition where nginx-auth-ldap gets stuck in an infinite loop, consuming 100% CPU. We are unable to reliably reproduce this issue and do not know what condition causes the worker processes to get stuck in this spin. The server is heavily loaded, and it is possible this condition occurs due to a temporary loss of connection to our LDAP servers. This is purely conjecture.

Symptom

NGINX Worker Process uses 100% CPU indefinitely.

BackTrace

(gdb) bt
#0  ngx_http_auth_ldap_close_connection (c=c@entry=0x1146128) at /var/starphleet/nginx/nginx-auth-ldap/ngx_http_auth_ldap_module.c:985
#1  0x0000000000485138 in ngx_http_auth_ldap_read_handler (rev=0x7f12d6827c08) at /var/starphleet/nginx/nginx-auth-ldap/ngx_http_auth_ldap_module.c:1288
#2  0x00000000004253f2 in ngx_epoll_process_events (cycle=<optimized out>, timer=<optimized out>, flags=<optimized out>) at src/event/modules/ngx_epoll_module.c:685
#3  0x000000000041dffd in ngx_process_events_and_timers (cycle=cycle@entry=0x140acb0) at src/event/ngx_event.c:248
#4  0x0000000000423c01 in ngx_worker_process_cycle (cycle=0x140acb0, data=<optimized out>) at src/os/unix/ngx_process_cycle.c:822
#5  0x0000000000422696 in ngx_spawn_process (cycle=cycle@entry=0x140acb0, proc=proc@entry=0x423b35 <ngx_worker_process_cycle>, data=data@entry=0x0, name=name@entry=0x48b5c6 "worker process", respawn=respawn@entry=-4) at src/os/unix/ngx_process.c:198
#6  0x0000000000423d5a in ngx_start_worker_processes (cycle=cycle@entry=0x140acb0, n=1, type=type@entry=-4) at src/os/unix/ngx_process_cycle.c:368
#7  0x0000000000424bd1 in ngx_master_process_cycle (cycle=0x140acb0, cycle@entry=0xc28540) at src/os/unix/ngx_process_cycle.c:253
#8  0x0000000000408253 in main (argc=<optimized out>, argv=<optimized out>) at src/core/nginx.c:407
(gdb) n
990         q = ngx_queue_next(q);
(gdb) n
985     while (q != ngx_queue_sentinel(&c->server->free_connections)) {
(gdb) n
990         q = ngx_queue_next(q);
(gdb) n
985     while (q != ngx_queue_sentinel(&c->server->free_connections)) {
(gdb) n
990         q = ngx_queue_next(q);
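
The stepping above alternates between the scan condition on source line 985 and the advance on source line 990, which points at the walk over the server's free_connections queue in ngx_http_auth_ldap_close_connection never reaching the sentinel. A minimal sketch of that kind of scan, reconstructed only from the identifiers and line numbers in the trace (the loop body is an assumption, not the module's exact source):

    /* Hedged reconstruction of the free_connections scan in
     * ngx_http_auth_ldap_close_connection(); names come from the
     * backtrace, the body is assumed. */
    ngx_queue_t *q;

    q = ngx_queue_head(&c->server->free_connections);
    while (q != ngx_queue_sentinel(&c->server->free_connections)) {  /* line 985 */
        if (q == &c->queue) {
            ngx_queue_remove(q);        /* found this connection: unlink it */
            break;
        }
        q = ngx_queue_next(q);          /* line 990 */
    }

ngx_queue_t is nginx's intrusive doubly linked list, so this loop can only terminate by finding the element or reaching the sentinel. If the queue links are ever corrupted, for example a connection inserted into free_connections twice or a freed node still linked in, the next pointers can form a cycle that excludes the sentinel, and the scan spins forever, which matches the 100% CPU symptom.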
jbq commented 9 years ago

This just happened to me as well with connections 100, after running for a few days. I will revert to connections 1, which has proven to be safer.
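
For anyone looking for the workaround: the connections directive lives in the module's ldap_server block. A minimal sketch, where the server name, URL, and credentials are placeholders and only the connections line is the relevant part:

    ldap_server example {
        url ldap://ldap.example.com/dc=example,dc=com?uid?sub?(objectClass=person);
        binddn "cn=admin,dc=example,dc=com";
        binddn_passwd secret;
        require valid_user;
        connections 1;   # keep a single persistent LDAP connection per worker
    }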

coxley commented 9 years ago

We are experiencing the same issue with connections 10. We will try connections 1, but this has been a serious problem for us.