qaulau / modwsgi

Automatically exported from code.google.com/p/modwsgi
0 stars 0 forks source link

inactivity_timeout consumes 100% CPU #46

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
After setting inactivity_timeout apache consumes 100% CPU protracted 

OS: Fedora8
mod_wsgi: 2.0c3

after downgrade to 2.0c2 all runs normal

Original issue reported on code.google.com by kro...@gmail.com on 3 Dec 2007 at 3:04

GoogleCodeExporter commented 9 years ago
forgot to say, the same "things" with trunk

Original comment by kro...@gmail.com on 3 Dec 2007 at 3:09

GoogleCodeExporter commented 9 years ago
Can you supply the exact Apache configuration section you are using to configure
mod_wsgi? The whole Apache configuration is not needed, mainly just the
WSGIDaemonProcess directives.

Are you loading into Apache any other notable Apache modules such as 
mod_python, PHP,
mod_perl etc?

Does Fedora 8 finally provide a properly installed Python which includes a 
shared
library for Python and is mod_wsgi using that shared library? Run 'ldd' on
mod_wsgi.so Apache module file and supply output.

Is the problem occuring immediately at startup of Apache, or only after some 
period?

Apache with mod_wsgi daemon processes involves many processes, which process is 
using
100%, just the mod_wsgi daemon process? Consult the Apache error logs to find 
out
daemon process pid and monitor it. Problem may actually be that daemon process 
is
crashing causing Apache parent process to tight loop continually creating 
replacement
daemon process.

Can you set Apache directive LogLevel to at least 'info' (using 'debug' even 
better)
and supply the batch of mod_wsgi error log messages output? This will help 
validate
whether daemon process is in fact crashing or not. Will also help in 
understanding
where in initialisation steps daemon process gets to. If you are using other 
Apache
modules such as mod_python etc, ensure you provide any messages they output as 
well.

Finally, what version of Python are you using?

Original comment by Graham.Dumpleton@gmail.com on 3 Dec 2007 at 9:10

GoogleCodeExporter commented 9 years ago
WSGIDaemonProcess db1 processes=2 user=kron group=kron threads=5
maximum-requests=1100 home=/home/kron/hosts/sites_dir/links 
inactivity-timeout=300

mod_php, mod_dav_svn (no mod_python, mod_perl)

# ldd /usr/lib/httpd/modules/mod_wsgi.so
        linux-gate.so.1 =>  (0x00110000)
        libpython2.5.so.1.0 => /usr/lib/libpython2.5.so.1.0 (0x00129000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x00270000)
        libdl.so.2 => /lib/libdl.so.2 (0x00289000)
        libutil.so.1 => /lib/libutil.so.1 (0x0028e000)
        libc.so.6 => /lib/libc.so.6 (0x0037d000)
        libm.so.6 => /lib/libm.so.6 (0x00292000)
        /lib/ld-linux.so.2 (0x00360000)

apache begun to "east" CPU immediately at startup
in htop I saw that only my two processes consumes CPU

$ python
Python 2.5.1 (r251:54863, Oct 30 2007, 13:54:11) 
[GCC 4.1.2 20070925 (Red Hat 4.1.2-33)] on linux2

Original comment by kro...@gmail.com on 4 Dec 2007 at 5:11

GoogleCodeExporter commented 9 years ago
ldd /usr/lib/httpd/modules/mod_wsgi.so
        linux-gate.so.1 =>  (0x00110000)
        libpython2.5.so.1.0 => /usr/lib/libpython2.5.so.1.0 (0x0012a000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x00271000)
        libdl.so.2 => /lib/libdl.so.2 (0x0028a000)
        libutil.so.1 => /lib/libutil.so.1 (0x0028f000)
        libc.so.6 => /lib/libc.so.6 (0x0037d000)
        libm.so.6 => /lib/libm.so.6 (0x00293000)
        /lib/ld-linux.so.2 (0x00360000)

previews ldd output was for 2.0c2, this one for 2.0c3

this is LogLevel debug output for my Vhost wihch uses inactivity-timeout
[Tue Dec 04 19:14:49 2007] [info] mod_wsgi (pid=12266): Enable monitor thread in
process 'db1'.
[Tue Dec 04 19:14:49 2007] [info] mod_wsgi (pid=12268): Enable monitor thread in
process 'db1'.
[Tue Dec 04 19:14:49 2007] [debug] mod_wsgi.c(7277): mod_wsgi (pid=12266): 
Starting 5
threads in daemon process 'db1'.
[Tue Dec 04 19:14:49 2007] [debug] mod_wsgi.c(7283): mod_wsgi (pid=12266): 
Starting
thread 1 in daemon process 'db1'.
[Tue Dec 04 19:14:49 2007] [debug] mod_wsgi.c(7283): mod_wsgi (pid=12266): 
Starting
thread 2 in daemon process 'db1'.
[Tue Dec 04 19:14:49 2007] [debug] mod_wsgi.c(7283): mod_wsgi (pid=12266): 
Starting
thread 3 in daemon process 'db1'.
[Tue Dec 04 19:14:49 2007] [debug] mod_wsgi.c(7283): mod_wsgi (pid=12266): 
Starting
thread 4 in daemon process 'db1'.
[Tue Dec 04 19:14:49 2007] [debug] mod_wsgi.c(7283): mod_wsgi (pid=12266): 
Starting
thread 5 in daemon process 'db1'.
[Tue Dec 04 19:14:49 2007] [info] mod_wsgi (pid=12266): Enable deadlock thread 
in
process 'db1'.
[Tue Dec 04 19:14:49 2007] [info] mod_wsgi (pid=12268): Enable deadlock thread 
in
process 'db1'.
[Tue Dec 04 19:14:49 2007] [debug] mod_wsgi.c(7277): mod_wsgi (pid=12268): 
Starting 5
threads in daemon process 'db1'.
[Tue Dec 04 19:14:49 2007] [debug] mod_wsgi.c(7283): mod_wsgi (pid=12268): 
Starting
thread 1 in daemon process 'db1'.
[Tue Dec 04 19:14:49 2007] [debug] mod_wsgi.c(7283): mod_wsgi (pid=12268): 
Starting
thread 2 in daemon process 'db1'.
[Tue Dec 04 19:14:49 2007] [debug] mod_wsgi.c(7283): mod_wsgi (pid=12268): 
Starting
thread 3 in daemon process 'db1'.
[Tue Dec 04 19:14:49 2007] [debug] mod_wsgi.c(7283): mod_wsgi (pid=12268): 
Starting
thread 4 in daemon process 'db1'.
[Tue Dec 04 19:14:49 2007] [debug] mod_wsgi.c(7283): mod_wsgi (pid=12268): 
Starting
thread 5 in daemon process 'db1'.

nothing else but CPU at 100%

Original comment by kro...@gmail.com on 4 Dec 2007 at 5:16

GoogleCodeExporter commented 9 years ago
[deleted comment]
GoogleCodeExporter commented 9 years ago
Can you look for the code:

    entry->deadlock_timeout = apr_time_from_sec(deadlock_timeout);
    entry->inactivity_timeout = apr_time_from_sec(inactivity_timeout);

and add immediately after this:

                ap_log_error(APLOG_MARK, WSGI_LOG_INFO(0),
                             wsgi_server, "mod_wsgi (pid=%d): "
                             "deadlock = %" APR_TIME_T_FMT,
                             getpid(), entry->deadlock_timeout);
                ap_log_error(APLOG_MARK, WSGI_LOG_INFO(0),
                             wsgi_server, "mod_wsgi (pid=%d): "
                             "inactivity = %" APR_TIME_T_FMT,
                             getpid(), entry->inactivity_timeout);

To avoid it being an issue, set the WSGIDaemonProcess option:

  deadlock-timeout=0

Also look for code:

        apr_sleep(period);

Before this line, add:

        if (period < apr_time_from_sec(1.0))
            period = apr_time_from_sec(1.0);

Tell me if that makes any difference, plus send output from the extra logging.

If that helps, you might add extra logging where that apr_sleep() is and before
modifying period, log what its value is.

Original comment by Graham.Dumpleton@gmail.com on 4 Dec 2007 at 11:07

GoogleCodeExporter commented 9 years ago
This what I got after your instructions

Index: mod_wsgi.c
===================================================================
--- mod_wsgi.c  (revision 707)
+++ mod_wsgi.c  (working copy)
@@ -6543,7 +6543,16 @@
     entry->shutdown_timeout = shutdown_timeout;
     entry->deadlock_timeout = apr_time_from_sec(deadlock_timeout);
     entry->inactivity_timeout = apr_time_from_sec(inactivity_timeout);
-
+    
+    ap_log_error(APLOG_MARK, WSGI_LOG_INFO(0),
+        wsgi_server, "mod_wsgi (pid=%d): "
+        "deadlock = %" APR_TIME_T_FMT,
+        getpid(), entry->deadlock_timeout);
+    ap_log_error(APLOG_MARK, WSGI_LOG_INFO(0),
+        wsgi_server, "mod_wsgi (pid=%d): "
+        "inactivity = %" APR_TIME_T_FMT,
+        getpid(), entry->inactivity_timeout);
+    
     return NULL;
 }

@@ -7318,6 +7327,9 @@
             wsgi_daemon_shutdown++;
             kill(getpid(), SIGINT);
         }
+        
+        if (period < apr_time_from_sec(1.0))
+            period = apr_time_from_sec(1.0);

         apr_sleep(period);
     }

After I've added 
if (period < apr_time_from_sec(1.0))
    period = apr_time_from_sec(1.0);
apache stoped eat CPU

WSGIDaemonProcess db1 processes=2 user=kron group=kron threads=5
maximum-requests=1100 home=/home/kron/hosts/site_dir/links inactivity-timeout=20
deadlock-timeout=0

Here is log
[Wed Dec 05 09:48:31 2007] [info] mod_wsgi (pid=18689): Enable monitor thread in
process 'db1'.
[Wed Dec 05 09:48:31 2007] [debug] mod_wsgi.c(7414): mod_wsgi (pid=18689): 
Starting 5
threads in daemon process 'db1'.
[Wed Dec 05 09:48:31 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18689): 
Starting
thread 1 in daemon process 'db1'.
[Wed Dec 05 09:48:31 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18689): 
Starting
thread 2 in daemon process 'db1'.
[Wed Dec 05 09:48:31 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18689): 
Starting
thread 3 in daemon process 'db1'.
[Wed Dec 05 09:48:31 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18689): 
Starting
thread 4 in daemon process 'db1'.
[Wed Dec 05 09:48:31 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18689): 
Starting
thread 5 in daemon process 'db1'.
[Wed Dec 05 09:48:31 2007] [info] mod_wsgi (pid=18688): Enable monitor thread in
process 'db1'.
[Wed Dec 05 09:48:31 2007] [debug] mod_wsgi.c(7414): mod_wsgi (pid=18688): 
Starting 5
threads in daemon process 'db1'.
[Wed Dec 05 09:48:31 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18688): 
Starting
thread 1 in daemon process 'db1'.
[Wed Dec 05 09:48:31 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18688): 
Starting
thread 2 in daemon process 'db1'.
[Wed Dec 05 09:48:31 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18688): 
Starting
thread 3 in daemon process 'db1'.
[Wed Dec 05 09:48:31 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18688): 
Starting
thread 4 in daemon process 'db1'.
[Wed Dec 05 09:48:31 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18688): 
Starting
thread 5 in daemon process 'db1'.
[Wed Dec 05 09:48:41 2007] [info] mod_wsgi (pid=18689): Create interpreter
'db.designcom.ath.cx|'.
[Wed Dec 05 09:49:02 2007] [info] mod_wsgi (pid=18689): Daemon process 
inactivity
timer expired, stopping process 'db1'.
[Wed Dec 05 09:49:02 2007] [info] mod_wsgi (pid=18689): Shutdown requested 
'db1'.
[Wed Dec 05 09:49:02 2007] [info] mod_wsgi (pid=18689): Stopping process 'db1'.
[Wed Dec 05 09:49:02 2007] [info] mod_wsgi (pid=18689): Destroy interpreter
'db.designcom.ath.cx|'.
[Wed Dec 05 09:49:02 2007] [info] mod_wsgi (pid=18689): Cleanup interpreter ''.
[Wed Dec 05 09:49:02 2007] [info] mod_wsgi (pid=18689): Terminating Python.
[Wed Dec 05 09:49:03 2007] [info] mod_wsgi (pid=18715): Enable monitor thread in
process 'db1'.
[Wed Dec 05 09:49:03 2007] [debug] mod_wsgi.c(7414): mod_wsgi (pid=18715): 
Starting 5
threads in daemon process 'db1'.
[Wed Dec 05 09:49:03 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18715): 
Starting
thread 1 in daemon process 'db1'.
[Wed Dec 05 09:49:03 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18715): 
Starting
thread 2 in daemon process 'db1'.
[Wed Dec 05 09:49:03 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18715): 
Starting
thread 3 in daemon process 'db1'.
[Wed Dec 05 09:49:03 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18715): 
Starting
thread 4 in daemon process 'db1'.
[Wed Dec 05 09:49:03 2007] [debug] mod_wsgi.c(7420): mod_wsgi (pid=18715): 
Starting
thread 5 in daemon process 'db1'.

Original comment by kro...@gmail.com on 5 Dec 2007 at 7:52

GoogleCodeExporter commented 9 years ago
Can you find the log lines for:

  mod_wsgi (pid=?):  deadlock = ?
  mod_wsgi (pid=?):  inactivity = ?

If you have configured virtual host specific error log files, you will need to 
look in the main Apache error log file 
and not that for the virtual host.

Original comment by Graham.Dumpleton@gmail.com on 5 Dec 2007 at 8:30

GoogleCodeExporter commented 9 years ago
unfortunately I don't see those lines in main error_log or vhost_error_log
here is mail error_log

[Wed Dec 05 10:34:41 2007] [info] mod_wsgi: Initializing Python.
[Wed Dec 05 10:34:41 2007] [debug] mod_wsgi.c(6870): mod_wsgi (pid=3141): 
Socket for
'internal-trac' is '/etc/httpd/run.3141.0.1.sock'.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3143): Starting process
'internal-trac' with uid=48, gid=48 and threads=5.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3143): Attach interpreter ''.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3151): Starting process
'internal-trac' with uid=48, gid=48 and threads=5.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3151): Attach interpreter ''.
[Wed Dec 05 10:34:41 2007] [debug] mod_wsgi.c(6870): mod_wsgi (pid=3141): 
Socket for
'db1' is '/etc/httpd/run.3141.0.2.sock'.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3159): Starting process 'db1' 
with
uid=501, gid=501 and threads=5.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3159): Attach interpreter ''.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3167): Starting process 'db1' 
with
uid=501, gid=501 and threads=5.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3167): Attach interpreter ''.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3175): Attach interpreter ''.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3176): Attach interpreter ''.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3177): Attach interpreter ''.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3178): Attach interpreter ''.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3179): Attach interpreter ''.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3180): Attach interpreter ''.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3181): Attach interpreter ''.
[Wed Dec 05 10:34:41 2007] [info] mod_wsgi (pid=3182): Attach interpreter ''.
[Wed Dec 05 10:34:41 2007] [notice] Apache/2.2.6 (Unix) DAV/2 SVN/1.4.4 
PHP/5.2.4
mod_wsgi/2.0c4-TRUNK Python/2.5.1 configured -- resuming normal operations
[Wed Dec 05 10:34:41 2007] [info] Server built: Sep 18 2007 03:54:41
[Wed Dec 05 10:34:41 2007] [debug] prefork.c(991): AcceptMutex: sysvsem 
(default:
sysvsem)

Original comment by kro...@gmail.com on 5 Dec 2007 at 8:41

GoogleCodeExporter commented 9 years ago
Change WSGI_LOG_INFO to WSGI_LOG_ALERT in those extra debug code lines.

At that point the log level is probably the default compiled in one which only 
shows up to ERR I think.

Original comment by Graham.Dumpleton@gmail.com on 5 Dec 2007 at 8:56

GoogleCodeExporter commented 9 years ago
this outputs direct into console
[root@coldfire mod_wsgi]# /etc/init.d/httpd start
Запускается httpd: [Wed Dec 05 11:01:34 2007] [alert] mod_wsgi 
(pid=4310): deadlock =
300000000
[Wed Dec 05 11:01:34 2007] [alert] mod_wsgi (pid=4310): inactivity = 300000000
[Wed Dec 05 11:01:34 2007] [alert] mod_wsgi (pid=4310): deadlock = 300000000
[Wed Dec 05 11:01:34 2007] [alert] mod_wsgi (pid=4310): inactivity = 300000000
httpd: Could not reliably determine the server's fully qualified domain name, 
using
127.0.0.1 for ServerName
[Wed Dec 05 11:01:34 2007] [warn] NameVirtualHost *:80 has no VirtualHosts
[Wed Dec 05 11:01:34 2007] [warn] NameVirtualHost *:80 has no VirtualHosts
[Wed Dec 05 11:01:34 2007] [warn] NameVirtualHost *:80 has no VirtualHosts
[Wed Dec 05 11:01:34 2007] [warn] NameVirtualHost *:80 has no VirtualHosts
[Wed Dec 05 11:01:34 2007] [warn] NameVirtualHost *:80 has no VirtualHosts
[Wed Dec 05 11:01:34 2007] [warn] NameVirtualHost *:80 has no VirtualHosts
                                                           [  OK  ]

WSGIDaemonProcess ... inactivity-timeout=300 in Vhost's config

Original comment by kro...@gmail.com on 5 Dec 2007 at 9:03

GoogleCodeExporter commented 9 years ago
Caused by race condition between threads. Trying to work out fix now. :-)

Original comment by Graham.Dumpleton@gmail.com on 5 Dec 2007 at 9:27

GoogleCodeExporter commented 9 years ago
thank you!

Original comment by kro...@gmail.com on 5 Dec 2007 at 9:29

GoogleCodeExporter commented 9 years ago
Fixed in revision 708 of mod_wsgi trunk.

Diff file also attached.

What was occurring was that period determined for first sleep could have been 
calculated before deadlock 
thread had initialised a time marker. This resulted in period for sleep being 
zero. Depending on fairness in 
thread scheduler and mutex locking the monitor thread may monopolise CPU time, 
continually looping with a 
sleep period of 0 and causing spike in CPU load. Technically, the deadlock 
thread should have eventually done 
its initialisation and the looping with period of 0 should have stopped, but 
anyone guess as to how thread 
scheduler for any particular OS coped with it.

Setting deadlock-timeout option for WSGIDaemonProcess to 0 would avoid problem 
if fix to code can't be 
applied.

Original comment by Graham.Dumpleton@gmail.com on 5 Dec 2007 at 10:05

Attachments:

GoogleCodeExporter commented 9 years ago

Original comment by Graham.Dumpleton@gmail.com on 5 Dec 2007 at 10:05

GoogleCodeExporter commented 9 years ago
Just tested, works fine. Thank you for your wonderful mod_wsgi!

Original comment by kro...@gmail.com on 5 Dec 2007 at 10:20

GoogleCodeExporter commented 9 years ago
Fix for this in 2.0c4.

Original comment by Graham.Dumpleton@gmail.com on 14 Dec 2007 at 11:45

GoogleCodeExporter commented 9 years ago
Version 2.0 of mod_wsgi now released with this fix.

Original comment by Graham.Dumpleton@gmail.com on 20 Mar 2008 at 4:51