Closed GoogleCodeExporter closed 9 years ago
Apache and other modules for some versions of Apache have been known to leak memory on graceful restart.
Can you post similar output for the case where mod_wsgi isn't loaded at all, so that a comparison can be drawn with when mod_wsgi is not being used?
Original comment by Graham.Dumpleton@gmail.com
on 19 Aug 2008 at 1:03
OK, I'll post it this evening.
Original comment by d.lex...@gmail.com
on 19 Aug 2008 at 9:58
Perhaps not relevant, but mod_python had an issue with memory leaks on restart and graceful restart:
https://issues.apache.org/jira/browse/MODPYTHON-235
It was suggested this only seemed to occur when certain auth modules were loaded, which was odd.
It is known, however, that how mod_python handled Python initialisation and destruction wasn't correct, whereas mod_wsgi aimed at doing it properly.
Anyway, this does highlight that for the mod_wsgi case one should also look at whether the memory leak occurs when doing a normal 'restart' in addition to a 'graceful restart'.
Original comment by Graham.Dumpleton@gmail.com
on 19 Aug 2008 at 11:49
-----------------
Apache without all modules (no leak)
-----------------
root@ad-desktop:~/www# ps aux | grep apache2
root 6609 0.0 0.0 9556 1932 ? Ss 23:29 0:00 /usr/sbin/apache2 -k start
www-data 6610 0.0 0.0 230892 1852 ? Sl 23:29 0:00 /usr/sbin/apache2 -k start
www-data 6613 0.0 0.0 230892 1856 ? Sl 23:29 0:00 /usr/sbin/apache2 -k start
root 6684 0.0 0.0 3016 780 pts/1 R+ 23:29 0:00 grep apache2
root@ad-desktop:~/www# apache2ctl graceful
apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1 for ServerName
root@ad-desktop:~/www# ps aux | grep apache2
root 6609 0.0 0.0 9556 1960 ? Ss 23:29 0:00 /usr/sbin/apache2 -k start
www-data 6691 0.0 0.0 230892 1864 ? Sl 23:30 0:00 /usr/sbin/apache2 -k start
www-data 6694 0.0 0.0 230892 1864 ? Sl 23:30 0:00 /usr/sbin/apache2 -k start
root 6748 0.0 0.0 3016 784 pts/1 S+ 23:30 0:00 grep apache2
root@ad-desktop:~/www# apache2ctl graceful
apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1 for ServerName
root@ad-desktop:~/www# ps aux | grep apache2
root 6609 0.0 0.0 9556 1968 ? Ss 23:29 0:00 /usr/sbin/apache2 -k start
www-data 6754 0.0 0.0 230892 1868 ? Sl 23:30 0:00 /usr/sbin/apache2 -k start
www-data 6757 0.0 0.0 230892 1872 ? Sl 23:30 0:00 /usr/sbin/apache2 -k start
root 6811 0.0 0.0 3016 780 pts/1 R+ 23:30 0:00 grep apache2
----------------------
Apache with only mod-wsgi 3.0-trunk loaded (872kb leak)
----------------------
root@ad-desktop:/etc/apache2# ps aux | grep apache2
root 6856 0.0 0.1 12340 3912 ? Ss 23:32 0:00 /usr/sbin/apache2 -k start
www-data 6857 0.0 0.1 233676 3272 ? Sl 23:32 0:00 /usr/sbin/apache2 -k start
www-data 6860 0.0 0.1 233676 3276 ? Sl 23:32 0:00 /usr/sbin/apache2 -k start
root 6917 0.0 0.0 3016 780 pts/1 S+ 23:32 0:00 grep apache2
root@ad-desktop:/etc/apache2# apache2ctl graceful
apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1 for ServerName
root@ad-desktop:/etc/apache2# ps aux | grep apache2
root 6856 0.1 0.2 13212 5096 ? Ss 23:32 0:00 /usr/sbin/apache2 -k start
www-data 6924 0.0 0.1 234548 4000 ? Sl 23:32 0:00 /usr/sbin/apache2 -k start
www-data 6953 0.0 0.1 234548 4004 ? Sl 23:32 0:00 /usr/sbin/apache2 -k start
root 6982 0.0 0.0 3016 780 pts/1 S+ 23:32 0:00 grep apache2
root@ad-desktop:/etc/apache2# apache2ctl graceful
apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1 for ServerName
root@ad-desktop:/etc/apache2# ps aux | grep apache2
root 6856 0.2 0.2 14084 5820 ? Ss 23:32 0:00 /usr/sbin/apache2 -k start
www-data 6924 0.1 0.0 0 0 ? Z 23:32 0:00 [apache2] <defunct>
www-data 6988 0.0 0.2 235420 4728 ? Sl 23:32 0:00 /usr/sbin/apache2 -k start
root 7017 0.0 0.0 3016 784 pts/1 S+ 23:32 0:00 grep apache2
--------------------------
The leak size does not depend on which mod-wsgi directives exist in the virtual host config (tried a blank vhost config, but the leak was always 872kb).
Also tried enlarging the virtual host config with dummy WSGIScriptAlias directives, but the leak size did not change.
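For reference, the 872kb figure matches the growth of the VSZ column for the parent process (PID 6856) between the ps samples in the mod-wsgi transcript. A minimal sketch of that arithmetic in Python follows; the helper name parent_vsz_kb is made up for illustration, and the sample lines are the parent process lines from the ps output with the wrapped command rejoined:

```python
# Sketch only: parent_vsz_kb is a hypothetical helper, not part of mod_wsgi.

def parent_vsz_kb(ps_lines, pid):
    """Return the VSZ column (in KB) for the given PID from `ps aux` lines."""
    for line in ps_lines:
        fields = line.split()
        # ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        if len(fields) > 5 and fields[1] == str(pid):
            return int(fields[4])
    raise ValueError("PID %s not found" % pid)


# Parent process (PID 6856) sampled at start and after each graceful restart.
samples = [
    ["root 6856 0.0 0.1 12340 3912 ? Ss 23:32 0:00 /usr/sbin/apache2 -k start"],
    ["root 6856 0.1 0.2 13212 5096 ? Ss 23:32 0:00 /usr/sbin/apache2 -k start"],
    ["root 6856 0.2 0.2 14084 5820 ? Ss 23:32 0:00 /usr/sbin/apache2 -k start"],
]

sizes = [parent_vsz_kb(lines, 6856) for lines in samples]
growth = [later - earlier for earlier, later in zip(sizes, sizes[1:])]
print(growth)  # [872, 872]
```

In the run without mod-wsgi, the same column stays at 9556 for the parent across both graceful restarts.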
Original comment by d.lex...@gmail.com
on 19 Aug 2008 at 7:48
Oh,
Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
The FreeBSD Apache mod-wsgi configuration also leaks.
Original comment by d.lex...@gmail.com
on 19 Aug 2008 at 7:58
OK, it looks like this is not only a graceful restart problem. Now I am doing a plain restart and the leak still occurs.
-------------------------
root@ad-desktop:~/www# ps aux | grep apache2
root 7604 0.0 0.2 14100 5852 ? Ss Aug19 0:00 /usr/sbin/apache2 -k start
www-data 7823 0.0 0.2 235436 4736 ? Sl Aug19 0:00 /usr/sbin/apache2 -k start
www-data 7849 0.0 0.2 235436 4740 ? Sl Aug19 0:00 /usr/sbin/apache2 -k start
root 8227 0.0 0.0 3016 764 pts/1 R+ 00:00 0:00 grep apache2
root@ad-desktop:~/www# apache2ctl restart
apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1 for ServerName
root@ad-desktop:~/www# ps aux | grep apache2
root 7604 0.0 0.3 14968 6572 ? Ss Aug19 0:00 /usr/sbin/apache2 -k start
www-data 8240 0.0 0.2 236304 5456 ? Sl 00:00 0:00 /usr/sbin/apache2 -k start
www-data 8244 0.0 0.2 236304 5460 ? Sl 00:00 0:00 /usr/sbin/apache2 -k start
root 8333 0.0 0.0 3016 772 pts/1 R+ 00:00 0:00 grep apache2
root@ad-desktop:~/www# apache2ctl restart
apache2: Could not reliably determine the server's fully qualified domain name, using 127.0.1.1 for ServerName
root@ad-desktop:~/www# ps aux | grep apache2
root 7604 0.0 0.3 15840 7292 ? Ss Aug19 0:00 /usr/sbin/apache2 -k start
www-data 8345 0.0 0.2 237176 6176 ? Sl 00:00 0:00 /usr/sbin/apache2 -k start
www-data 8350 0.0 0.2 237176 6180 ? Sl 00:00 0:00 /usr/sbin/apache2 -k start
root 8434 0.0 0.0 3016 780 pts/1 R+ 00:00 0:00 grep apache2
root@ad-desktop:~/www#
Original comment by d.lex...@gmail.com
on 19 Aug 2008 at 8:02
FreeBSD-7.0-RELEASE, ULE scheduler, Apache/2.2.8 (FreeBSD port www/apache22, Prefork), mod-wsgi 2.1 and the 2.X revision also leak with these symptoms.
Original comment by sch...@gmail.com
on 19 Aug 2008 at 8:34
Oh, Python 2.5.2 (r252:60911, Aug 1 2008, 19:27:55)
[GCC 4.2.1 20070719 [FreeBSD]] on freebsd7
Original comment by sch...@gmail.com
on 19 Aug 2008 at 8:40
In all likelihood, this is going to come down to memory leaks in Python itself. In particular, when a Python interpreter instance is destroyed, not all memory is cleaned up, so it gets leaked when a new Python interpreter instance is created.
One may be able to verify this by finding this code in mod_wsgi:
Py_Finalize();
PyThreadState_Swap(NULL);
PyEval_ReleaseLock();
wsgi_python_initialized = 0;
and adding after it:
Py_Initialize();
Py_Finalize();
Py_Initialize();
Py_Finalize();
Py_Initialize();
Py_Finalize();
Py_Initialize();
Py_Finalize();
That is, initialise and destroy the interpreter a few more times. If memory leakage occurs at a greater rate, then that is the culprit.
Note, there is a chance this may not be enough to trigger the problem, as this does not initialise threading within the Python interpreter instance after creating it.
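The reasoning behind this test (repeat a suspect cycle more times and see whether retained memory grows in proportion) can be illustrated in pure Python. This is only an analogy under stated assumptions: it does not call Py_Initialize() or Py_Finalize(), the two cycle functions are hypothetical stand-ins, and tracemalloc only sees allocations made through Python's own allocators.

```python
import gc
import tracemalloc

_leaked = []  # stands in for memory that a destroy step fails to release


def leaky_cycle():
    """Hypothetical init/destroy cycle that forgets to free about 10 KB."""
    _leaked.append(bytearray(10_000))


def clean_cycle():
    """Hypothetical init/destroy cycle that releases everything it used."""
    scratch = bytearray(10_000)
    del scratch


def retained_bytes(cycle, repeats):
    """Run `cycle` `repeats` times and return the traced memory still held."""
    gc.collect()
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(repeats):
        cycle()
    gc.collect()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


one = retained_bytes(leaky_cycle, 1)
five = retained_bytes(leaky_cycle, 5)
none = retained_bytes(clean_cycle, 5)
print(one, five, none)
```

If the retained figure scales with the number of repeats, as it does for leaky_cycle here, the repeated step is the likely source; a clean cycle should stay near zero however many times it runs.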
When I get a chance I'll do some tests of my own and see if I can get the MacOS X memory leak tools working with it.
Original comment by Graham.Dumpleton@gmail.com
on 20 Aug 2008 at 12:20
FWIW, on MacOS X Tiger (10.4) with the operating system supplied Python 2.3 and my own build of Apache 2.2.4, I cannot get the parent Apache process to grow in memory size even after doing dozens of 'restart' operations.
Running top, I would see it cycling around within a 4K range.
PID COMMAND %CPU TIME #TH #PRTS #MREGS RPRVT RSHRD RSIZE VSIZE
5268 httpd 0.0% 0:00.19 1 11 36 40K 5.16M 1.97M 31.2M
5268 httpd 0.0% 0:00.48 1 11 36 44K 5.29M 1.82M 36.7M
5268 httpd 0.0% 0:00.42 1 11 36 44K+ 5.29M 1.82M+ 35.6M
5268 httpd 1.2% 0:00.38 1 11 36 40K- 5.30M+ 1.79M- 34.9M+
The variance would be due to the unloading of Apache modules and shared libraries during restart and when top happened to take its sample.
Interestingly, if I did a graceful restart, the amount of memory in use actually dropped, although it still cycled around within a 4K range as I did more and more graceful restarts and top sampled it at different points.
5268 httpd 1.3% 0:01.76 1 11 36 28K- 5.32M+ 1.80M- 59.3M+
5268 httpd 0.0% 0:02.88 1 11 36 32K 5.32M 1.83M 76.4M
5268 httpd 0.0% 0:01.86 1 11 36 32K+ 5.31M 1.82M+ 60.9M
Hmmm, it does still actually jump up to 44K again:
5268 httpd 0.0% 0:03.11 1 11 36 44K+ 5.31M 1.86M+ 80.0M
but top doesn't sample it at that value very often. So it may just be the way that graceful restart occurs and how that affects top sampling.
Anyway, it never goes over about 44K.
I would have to try on MacOS X Leopard (10.5) and use DTrace instead, as I can then get a proper picture of long term memory usage. I would also be using Python 2.5 on Leopard, which may make a difference if this is indeed related to the Python version and a leak in Python.
Original comment by Graham.Dumpleton@gmail.com
on 20 Aug 2008 at 6:05
In code where it does:
/* Initialise threading. */
PyEval_InitThreads();
PyThreadState_Swap(NULL);
PyEval_ReleaseLock();
wsgi_python_initialized = 1;
The PyThreadState_Swap() call may possibly leak a thread state object, as the result of calling the function is ignored. This is not certain, as the reference may be to a thread state object held and managed by the simplified GIL API functions, and therefore it is not our job to destroy it anyway.
This pattern of calls mirrored what mod_python was doing, but it is already known that it isn't strictly correct. In work on mod_wsgi version 3.0, I had already been playing with fixing this code up, so it would be interesting to see if a leak occurs in the new code.
In the current code, MallocDebug on MacOS X suggests that 76 bytes are leaked from somewhere in wsgi_python_init(), but since the operating system version of Python is compiled without debugging, I can't tell what the memory is for. I need to build a version of Python with debug symbols.
This is all the memory MallocDebug shows as leaked though. Even if I add in extra calls to:
PyEval_InitThreads();
Py_Initialize();
Py_Finalize();
no additional memory is shown as leaked.
Note though that the tests were done in Apache single process mode and I haven't actually triggered a restart or graceful restart, as I still can't work out how to get Apache to keep using MallocDebug across its initial fork when not run in single process mode.
Original comment by Graham.Dumpleton@gmail.com
on 20 Aug 2008 at 11:18
Okay, worked out how to have use of MallocDebug stick. Need the -DNO_DETACH option. Thus:
DYLD_INSERT_LIBRARIES=/usr/lib/libMallocDebug.A.dylib /usr/local/apache-2.2.9/bin/httpd -DNO_DETACH
This sees 40936 bytes in 76 nodes leaked on a restart from somewhere inside of Python.
Need to build a version of Python with debug symbols so I can work out where, and whether it is because mod_wsgi isn't destroying some Python objects on a restart, or whether they are just leaks inside of Python itself.
Original comment by Graham.Dumpleton@gmail.com
on 20 Aug 2008 at 11:32
For Python 2.5.1 on Leopard, 98.3 percent of the memory leak (~80kb) in Python relates to memory allocated by:
static void
initsite(void)
{
    PyObject *m, *f;
    m = PyImport_ImportModule("site");
    if (m == NULL) {
        f = PySys_GetObject("stderr");
        if (Py_VerboseFlag) {
            PyFile_WriteString(
                "'import site' failed; traceback:\n", f);
            PyErr_Print();
        }
        else {
            PyFile_WriteString(
                "'import site' failed; use -v for traceback\n", f);
            PyErr_Clear();
        }
    }
    else {
        Py_DECREF(m);
    }
}
In particular, something is incrementing the reference count of the "site" module and then not decrementing it. Thus when the Python interpreter is destroyed, the memory related to this module isn't being released.
Although mod_wsgi accesses the "site" module, it only does so in child worker processes and not in the Apache parent process.
Original comment by Graham.Dumpleton@gmail.com
on 20 Aug 2008 at 12:37
Commented out the site module import in mod_wsgi:
if (wsgi_python_path) {
    module = FALSE;
    // module = PyImport_ImportModule("site");
    if (module) {
but it has not changed the leak size.
Original comment by d.lex...@gmail.com
on 20 Aug 2008 at 8:39
Tried Python 2.3 on another box running Tiger, now that I have worked out the proper way of running MallocDebug, and it doesn't show any memory leaks. Thus the memory leak appears to be specific to Python 2.5 (or at least not present in 2.3).
Original comment by Graham.Dumpleton@gmail.com
on 22 Aug 2008 at 4:59
Now trying Python 2.5.2 (Py_DEBUG build) on the same box as Python 2.3, I find no memory leaks on restarts, but it does leak 98k on the first start of the Python interpreter. In this case it appears to be something read when the 'encodings' module is being imported. Not sure whether using a Py_DEBUG build is causing memory leaks that would otherwise not occur.
Original comment by Graham.Dumpleton@gmail.com
on 22 Aug 2008 at 5:12
The easiest way to avoid this problem would be to defer initialisation of Python until worker/daemon processes are created. The downside of this is additional start up time for the child process, as before this the startup would have occurred only once in the parent process. On a modern Intel Core 2 Duo, that startup time for initialising Python is in the order of 10ms however, so in all probability this would not be noticed in the bigger scheme of things.
Deferring initialisation to the child does though open up the possibility of mod_wsgi being generalised to be language neutral and of extracting Python support out into a loadable module of its own. This module could be loaded in the child, although that itself would incur more noticeable runtime overhead and at that point may not be practical. Doing that in the child would though allow different daemon process groups to load different versions of Python, which might be an attractive feature.
Original comment by Graham.Dumpleton@gmail.com
on 14 Oct 2008 at 10:43
How can we help you with this feature?
Original comment by d.lex...@gmail.com
on 14 Oct 2008 at 10:46
FWIW, larger Python apps (I know it's the case with Trac and Review Board) have a startup cost *on first request* much higher than 10ms (in the order of seconds on my VPS), so the cost of initializing Python at daemon launch time wouldn't be a big deal with them -- and no deal at all if you keep 2+ daemon processes running.
Original comment by vsla...@gmail.com
on 14 Oct 2008 at 11:06
The startup cost of the WSGI application itself is not counted here. That 10ms would be on top of WSGI application startup. So, yes, it would be swallowed up into the overall startup cost for something like Trac, TurboGears or Django. If the WSGI application runs in daemon mode however, that additional startup cost is still borne by Apache worker processes, which may not run any Python code and instead only serve static files and proxy requests to daemon mode processes.
There is already a WSGIRestrictEmbedded directive which disables execution of WSGI applications in Apache worker processes, but that only restricts it and doesn't avoid Python initialisation occurring in Apache worker processes. This is because Python authnz hooks etc can still be used in Apache worker processes even if WSGI applications themselves can't be.
What may be able to be done is, if WSGIRestrictEmbedded is set to On, that mod_wsgi be intelligent about looking for authnz hooks etc, and only initialise Python in an Apache worker process if strictly required. Thus, if one were absolutely paranoid about overhead, and wasn't using embedded mode for WSGI applications, or authnz hooks etc, then one could set WSGIRestrictEmbedded to On. There could even perhaps be a 'configure' option to disable authnz hooks etc, so only daemon mode is available. That, or have the current --disable-embedded option to configure do more than the equivalent of the current WSGIRestrictEmbedded directive.
BTW, even with 2+ daemon processes running, it is advisable that you use WSGIImportScript to preload the WSGI file so that the delay only occurs on process start and not on the first request to hit the application. This preloading will in mod_wsgi 3.0 also be able to be triggered by specifying process-group and application-group as options to WSGIScriptAlias directives.
As to progressing the deferring of initialisation to sub interpreters, no help is required. I just need to research all that is required and try it. Initially I may make it optional through a directive and, if all is okay, make the default to defer initialisation, but allow people to enable the old behaviour if required for some reason. As always, just have to find the time. :-)
Original comment by Graham.Dumpleton@gmail.com
on 14 Oct 2008 at 11:21
If Python initialisation is delayed to worker/daemon processes, it would then run as the Apache user, or daemon user, and not root as can currently occur. This is probably a good thing.
If the chroot option is used with a daemon process however, it will mean that initialisation is performed in the context of the chroot environment, thus it is essential that the Python installation in the chroot environment works properly. Note that the Python shared library is still linked from outside of the chroot environment though.
One interesting aspect of delaying initialisation is that python-home, python-optimize and py3k-warning-flag could be added as options to WSGIDaemonProcess, with different daemon process groups having different settings.
When python-path was introduced for WSGIDaemonProcess, the WSGIPythonPath directive was ignored for daemon process groups. Have a similar issue here to decide on. If these additional options were added to WSGIDaemonProcess, do the WSGIPythonHome, WSGIPythonOptimize and WSGIPy3kWarningFlag directives get ignored? Alternatively, make WSGIPythonPath be inherited instead, with all new ones also inheriting, but add a special directive to say don't inherit them.
The only other option is to totally break configuration compatibility and make WSGIDaemonProcess be a <WSGIDaemonProcess> container instead and have options like WSGIPythonHome etc inside.
<WSGIDaemonProcess example>
    WSGIPythonHome ...
    WSGIPythonPath ...
    etc ...
</WSGIDaemonProcess>
This is getting all messy though with different ways of configuring things. :-(
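The two non-breaking alternatives in this comment (ignore the global directives for daemon process groups, or inherit them unless told otherwise) can be modelled in a few lines of Python. This is purely illustrative: the option names come from the comment, but the inherit switch, the function and the sample values are assumptions, and nothing here reflects how mod_wsgi actually resolves its configuration.

```python
# Hypothetical model of the two alternatives; not mod_wsgi behaviour.

# Made-up global settings standing in for WSGIPythonHome / WSGIPythonOptimize.
GLOBAL_DIRECTIVES = {"python-home": "/usr", "python-optimize": "0"}


def effective_options(daemon_options, global_directives, inherit):
    """Return the settings one daemon process group would end up with."""
    if not inherit:
        # As described for python-path: global directives are ignored.
        return dict(daemon_options)
    merged = dict(global_directives)
    merged.update(daemon_options)  # per-group options win over inherited ones
    return merged


group = {"python-home": "/opt/py25"}  # made-up per-group override
ignored = effective_options(group, GLOBAL_DIRECTIVES, inherit=False)
inherited = effective_options(group, GLOBAL_DIRECTIVES, inherit=True)
print(ignored)
print(inherited)
```

With inherit=False the group sees only its own python-home; with inherit=True it also picks up python-optimize from the global settings while keeping its own python-home.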
Original comment by Graham.Dumpleton@gmail.com
on 15 Oct 2008 at 12:09
In revision 1111, when mod_wsgi is responsible for initialising Python, this will now be deferred until child processes are created. This isn't currently configurable, i.e., can't make it go back to doing it in the parent, but the way the code is written it could be allowed.
Note though that if mod_python is also loaded, as it is responsible for initialising Python, it will still do it in the parent and mod_wsgi will just inherit that.
In making these changes, also noticed that Py_Finalize() wasn't being called in the Apache parent process. This may not have been getting done to avoid problems when mod_python is also loaded, but this not being done on a restart when mod_wsgi is in control may itself be a cause of memory leaks. Deferring initialisation to child processes still may be better, as it is not guaranteed that Python will not still leak memory when Py_Finalize() is called.
Either way, need to see what happens if Py_Finalize() is done in the parent on a restart.
Original comment by Graham.Dumpleton@gmail.com
on 7 Nov 2008 at 7:00
Adding in Py_Finalize() to destroy the interpreter in the parent process reduces the memory leak for Python 2.5 from 40936 bytes in 76 nodes to 228 bytes in 3 nodes. So Python still leaks something, but not as much. This change was committed at revision 1112.
Note that with Python initialisation being done in the child now, this means it is no longer done as root but as the Apache user, or if defined for daemon processes, that user. Also for chroot, it will be done in the context of the chroot directory, except that the shared library/framework used is actually drawn from outside of the chroot directory.
Original comment by Graham.Dumpleton@gmail.com
on 7 Nov 2008 at 10:15
The delaying of Python initialisation to the child process also opens up the possibility of not actually doing the initialisation in an Apache child worker process until the first request arrives that actually requires Python to be running. By doing this, if using mod_wsgi daemon mode only, you avoid altogether initialising Python in the Apache child worker process unless you really require it. This would cut down on the amount of memory in use by Apache child worker processes when not being used to handle Python requests.
This would however cause the first request to do more work and take a bit longer, plus it would need to hold up any parallel requests requiring Python that come at the same time while Python is being initialised. In the general case this is possibly acceptable. For a particular site, if they were concerned about startup time they should be doing preloading, which would force the Python interpreter to be initialised on process start as well as the application code being loaded.
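The "initialise on first request and hold up parallel requests meanwhile" behaviour is a standard lock-guarded lazy initialisation. A minimal sketch in Python, assuming a hypothetical LazyInterpreter class and a dummy initialise callable; this shows the general pattern only and is not taken from the mod_wsgi source:

```python
import threading


class LazyInterpreter:
    """Hypothetical stand-in for a worker process that starts Python lazily."""

    def __init__(self, initialise):
        self._initialise = initialise  # expensive one-off setup
        self._lock = threading.Lock()
        self._ready = False
        self.init_count = 0

    def ensure_ready(self):
        if self._ready:  # fast path once initialisation has completed
            return
        with self._lock:  # parallel first requests queue up here
            if not self._ready:
                self._initialise()
                self.init_count += 1
                self._ready = True

    def handle(self, request):
        self.ensure_ready()
        return "handled %s" % request


interp = LazyInterpreter(initialise=lambda: None)
results = []
threads = [
    threading.Thread(target=lambda n=n: results.append(interp.handle(n)))
    for n in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(interp.init_count, len(results))  # 1 8
```

Only the first caller pays the initialisation cost; callers arriving at the same time block on the lock until it completes, which is the delay this comment describes.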
When initialisation is done could be handled by a directive called something like WSGIPythonInitialization. This could be defined as 'Parent', 'Child' or 'Request'. The default would be 'Child', although if WSGIRestrictEmbedded is On, it would be better off defaulting to 'Request', as that option would mean no content handlers and Python would only be required if doing authnz stuff or dispatch functions. The default would also be 'Request' if use of embedded mode for content handlers is disabled at compile time.
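The proposed defaulting rule can be written down as a small function. This is a sketch of the proposal as worded in this comment only; the function and parameter names are hypothetical and no such directive is confirmed to exist.

```python
# Hypothetical encoding of the proposed defaults; names are illustrative.

def default_init_mode(restrict_embedded, embedded_compiled_in=True):
    """Return the proposed default: 'Child' normally, else 'Request'."""
    if restrict_embedded or not embedded_compiled_in:
        # No content handlers in worker processes, so Python is only
        # needed for authnz hooks or dispatch functions.
        return "Request"
    return "Child"


print(default_init_mode(restrict_embedded=False))
print(default_init_mode(restrict_embedded=True))
print(default_init_mode(restrict_embedded=False, embedded_compiled_in=False))
```

'Parent' never appears as a default here; under the proposal it would only apply when set explicitly.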
Original comment by Graham.Dumpleton@gmail.com
on 7 Nov 2008 at 10:34
Note though that delaying Python initialisation to request time for a mod_wsgi daemon process would be pointless, given that the only reason the process exists is to handle Python requests.
It may be better to just have 'Parent' and 'Child', and when WSGIRestrictEmbedded is set or embedded mode is disabled at compile time, then for Apache child worker processes do it at request time, on the presumption that Python is less likely to be required if embedded mode is disabled for content handlers.
Original comment by Graham.Dumpleton@gmail.com
on 7 Nov 2008 at 10:40
Backported the fix which ensures that the Python interpreter is destroyed properly in the Apache parent process to the mod_wsgi 2.X (2.4) branch. This is in revision 1140.
Original comment by Graham.Dumpleton@gmail.com
on 27 Dec 2008 at 10:42
We are now using 3.X, and have no problems with the leak. I'll try rev 1140 to confirm the changes.
Original comment by d.lex...@gmail.com
on 27 Dec 2008 at 10:53
Original comment by Graham.Dumpleton@gmail.com
on 16 Mar 2009 at 10:25
Version 2.4 of mod_wsgi now released.
Original comment by Graham.Dumpleton@gmail.com
on 11 Apr 2009 at 10:25
Original issue reported on code.google.com by
d.lex...@gmail.com
on 18 Aug 2008 at 10:13