Closed GoogleCodeExporter closed 9 years ago
See the following:
http://izumi.plan99.net/blog/?p=19
http://izumi.plan99.net/blog/index.php/2007/10/15/making-ruby%e2%80%99s-garbage-collector-copy-on-write-friendly-part-6-final/
http://izumi.plan99.net/blog/?p=34
http://www.bitwiese.de/2007/09/on-processes-and-threads.html
The posts linked above refer to Ruby, but the same applies to Python, except
for some details about GC optimizations. They illustrate the savings that I
expect to see by forking after the initialization callable has been executed
and all shared libraries and modules have been (pre)loaded. mod_wsgi's current
use of fork() already provides some of these savings, but I think there is room
for a lot more.
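
As a rough illustration of the preload-then-fork idea described above (a minimal sketch, not code from mod_wsgi or from the linked posts), the parent imports everything once and only then forks, so pages populated during the imports can stay shared copy-on-write. The names `preload_then_fork`, `wait_all`, `module_names` and `worker` are hypothetical and exist only for this example.

```python
import importlib
import os


def preload_then_fork(module_names, worker, processes):
    """Import modules once in the parent, then fork `processes` workers.

    Returns the list of child PIDs. Memory populated by the imports is
    shared between parent and children until one of them writes to it.
    """
    for name in module_names:
        importlib.import_module(name)  # loaded once, before any fork

    pids = []
    for _ in range(processes):
        pid = os.fork()
        if pid == 0:
            # Child: run the worker, then exit without parent cleanup code.
            status = 1
            try:
                worker()
                status = 0
            finally:
                os._exit(status)
        pids.append(pid)
    return pids


def wait_all(pids):
    """Reap every child and return the raw wait statuses."""
    return [os.waitpid(pid, 0)[1] for pid in pids]
```

Note that in CPython, reference count updates write to object headers, so some preloaded pages become private to each child over time anyway; how much stays shared has to be measured rather than assumed.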
Original comment by brianlsm...@gmail.com
on 16 Jan 2008 at 1:04
Preloading into a parent process means that you must have a monitor/management
process for every distinct application, running as the user that the final
application will run as. You can't just have one process, running as root,
that is used for all applications regardless of which user each application
runs as.
As a consequence, you end up with a lot more processes for a start. This sort
of scheme, although it may work for people running a system dedicated to a
specific set of applications, is no good in a shared web hosting environment.
Original comment by Graham.Dumpleton@gmail.com
on 16 Jan 2008 at 5:26
I agree. My idea is for the preloading script to be per-process-group, not
per-application. Further, the fork would be optimized away for the case where
processes=1.
Original comment by brian@briansmith.org
on 16 Jan 2008 at 1:56
Please disregard my suggestions for preloading before the forking. That is a
totally separate issue. (FWIW, I am doing a pure-Python prototype of the
preload-then-fork mechanism as WSGI middleware to test how much private RSS is
actually reduced. I am also looking at sending patches for Python itself, to
switch it from read()ing module files to mmap()ing them.)
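
One possible way to measure the private RSS mentioned above is sketched below. It is not the prototype referred to in the comment. It assumes a Linux system, where `/proc/<pid>/smaps` reports `Private_Clean` and `Private_Dirty` per mapping in kB; the function names are hypothetical.

```python
def private_rss_kb(smaps_text):
    """Sum Private_Clean and Private_Dirty (in kB) from smaps-formatted text."""
    total = 0
    for line in smaps_text.splitlines():
        # Lines look like "Private_Dirty:         8 kB".
        if line.startswith(("Private_Clean:", "Private_Dirty:")):
            total += int(line.split()[1])
    return total


def read_private_rss_kb(pid="self"):
    """Linux only: read /proc/<pid>/smaps and return private RSS in kB."""
    with open(f"/proc/{pid}/smaps") as f:
        return private_rss_kb(f.read())
```

Comparing `read_private_rss_kb(pid)` for workers started with and without preloading would give a figure for how much memory each worker holds on its own rather than sharing with the parent.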
Original comment by brianlsm...@gmail.com
on 17 Jan 2008 at 3:08
Keeping this here as a prompt to look at all these sorts of issues when doing
future restructuring of the code and reviewing what functionality is provided.
Need to track down the discussions on the mailing list and link them here.
Original comment by Graham.Dumpleton@gmail.com
on 18 Feb 2008 at 9:31
Going to close this issue for now, as I am not going to pursue the overall
intent of what was being suggested. The ideas will not be forgotten, though.
FWIW, in mod_wsgi 3.0 there is an experimental directive, WSGILazyInitialization,
which allows one to defer when Python is initialised. The Python libraries are
still linked into the Apache mod_wsgi.so module, but if WSGIRestrictEmbedded is
enabled, and so Python isn't required in the Apache worker processes, Python
will not be initialised in the Apache parent. The only time it might be is if
the aaa access hooks in mod_wsgi are used, in which case everything goes back
to the way it was.
Because Python is not initialised in the Apache parent, the worker processes
are smaller, but the Python interpreter must be initialised in every daemon
mode process. If there is only one such process, that is okay, but if there is
more than one, overall memory usage is greater, as the processes don't benefit
from the sharing obtained by initialising Python in the Apache parent.
When transient daemon processes are supported, a separate monitor process will
be needed to handle them. At that point, if that monitor is used for all daemon
processes, Python can be initialised in it instead, giving the sharing benefits
back to the daemon processes without needing to initialise Python in the Apache
parent.
Original comment by Graham.Dumpleton@gmail.com
on 6 Mar 2009 at 4:42
Original issue reported on code.google.com by
brianlsm...@gmail.com
on 14 Jan 2008 at 1:25