sankarNarayanan / modwsgi

Automatically exported from code.google.com/p/modwsgi

Implement automated file change reloader in core of mod_wsgi. #140

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
The page:

  http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode

explains how a separate thread can be created to monitor changes to the code
files of a Python web application and automatically trigger a restart of the
daemon mode process.

This recipe has a number of issues. These are:

1. It has to be implemented in each sub interpreter of a process if you want
all applications covered by it.

2. It restarts the daemon process straight away, rather than on the next
request that targets the daemon process, as would normally occur where only
the WSGI script file had been changed.

The second issue above can be problematic in a couple of cases.

The first is that if preloading is being used, and the application is
immediately reloaded when the daemon process restarts, then further changes to
the code files, even before a request arrives, could cause the daemon process
to restart again and again.

The second is where the application has a background task running which
performs work on a periodic basis. If no preloading is done, so that the
application would only be loaded on the next request, that background task
will effectively be shut down as soon as the first change is made to a file.
This may not be desirable, and it may be better for the process to be allowed
to run until it really needs to be restarted.

The latter problem could be avoided by changing the code to not send a signal,
but instead search sys.modules for all WSGI script files and reset __mtime__
to 0. This would not, however, also cause a restart if a new WSGI script file
is loaded.
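
A minimal sketch of that alternative, in Python. It assumes that only modules
loaded from WSGI script files carry a __mtime__ attribute; the function name
mark_wsgi_scripts_stale is hypothetical and is not part of mod_wsgi.

```python
import sys


def mark_wsgi_scripts_stale(modules=None):
    """Reset __mtime__ to 0 on every loaded module that carries one.

    Assumption: only modules loaded from WSGI script files have a
    __mtime__ attribute, so its presence identifies them.
    Returns the names of the modules that were touched.
    """
    if modules is None:
        modules = sys.modules
    touched = []
    # Copy the items so imports in other threads cannot break iteration.
    for name, module in list(modules.items()):
        if "__mtime__" in getattr(module, "__dict__", {}):
            # A zero mtime never matches the file on disk, so the normal
            # per-request check would treat the script as changed.
            module.__mtime__ = 0
            touched.append(name)
    return touched
```

The monitor thread from the recipe would call this in place of sending the
signal, leaving the restart to happen on the next request.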

That the reloader isn't a part of mod_wsgi also means that users actually have
to do something to use it. It may be better to implement such a feature in the
core of mod_wsgi instead. Being in the core, the issues above could in part be
addressed.

When implemented in the core, instead of immediately signalling the process to
restart, it would merely set a flag for the whole process. When the next
request arrives, as well as a check being made as to whether the WSGI script
file had been changed, a check of this process-wide flag would also be made.
If either condition is satisfied, the process would be restarted.
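
The deferred restart could be sketched as follows. This is an illustration in
Python rather than the C code the core would need, and the RestartState class
and its method names are hypothetical.

```python
import threading


class RestartState:
    """Process-wide 'restart pending' flag, set by the monitor thread."""

    def __init__(self):
        self._pending = threading.Event()

    def mark_changed(self):
        # Called by the monitor thread when a code file has changed.
        # It only records the fact; no signal is sent to the process.
        self._pending.set()

    def restart_needed(self, script_changed):
        # Called at the start of each request. script_changed is the
        # result of the existing check on the WSGI script file.
        return script_changed or self._pending.is_set()
```

Because the flag is only consulted when a request arrives, a process with a
background task keeps running until a request actually needs fresh code.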

The complexity in making it part of the core is that, for each iteration, the
code would need to acquire each interpreter in the process, iterate over all
modules in sys.modules, and check for changes to the modification time, or the
existence, of the corresponding file. Since interpreters would use a lot of the
same files, it would need to keep a table of which files have already been
checked on each pass to avoid doing unnecessary stat() calls.
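
One pass of such a scan might look like the sketch below. It assumes the list
of file paths per interpreter has already been collected from each
interpreter's sys.modules; the names scan_pass and _has_changed are
hypothetical.

```python
import os


def _has_changed(path, known_mtimes):
    """Return True if path was modified or removed since first seen."""
    try:
        mtime = os.stat(path).st_mtime
    except OSError:
        # A file that was seen before and has now vanished counts as changed.
        return path in known_mtimes
    if path not in known_mtimes:
        known_mtimes[path] = mtime  # first sighting: record, no change
        return False
    return mtime != known_mtimes[path]


def scan_pass(interpreter_files, known_mtimes):
    """Run one pass over the files of every interpreter.

    interpreter_files: dict mapping interpreter name -> iterable of paths.
    known_mtimes: dict mapping path -> mtime recorded on an earlier pass.
    Returns the set of interpreter names that use a changed file.
    """
    checked = {}  # path -> result, so each file is stat()ed once per pass
    changed_in = set()
    for name, paths in interpreter_files.items():
        for path in paths:
            if path not in checked:
                checked[path] = _has_changed(path, known_mtimes)
            if checked[path]:
                changed_in.add(name)
    return changed_in
```

The checked table lives only for the duration of one pass, while known_mtimes
persists across passes.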

The overhead of the stat() calls could also be an issue. It has been suggested
in relation to the Django runserver that its reloader, which checks once a
second, can noticeably affect the battery life of a laptop. For Linux at least,
the alternative is the inotify functions.

  http://en.wikipedia.org/wiki/Inotify

This allows one to register all the file paths with the kernel, and the kernel
will monitor for changes more intelligently and more efficiently. It would
still be necessary to do something once a second, but that is limited to
checking for any newly loaded modules and registering the paths for them.

Mac OS X has the FSEvents framework:

  http://en.wikipedia.org/wiki/FSEvents

but it only allows one to monitor for changes under a specific directory
hierarchy, and notifications only indicate that a change was made, not which
file was changed. Thus it would appear not to be useful.

The strategy could then be to use inotify on Linux if available and otherwise
fall back to doing stat() calls oneself.

Because of the overhead, the feature should not be on by default. It might
instead be enabled through a new option to WSGIDaemonProcess, perhaps called
something like reloader-interval. By default it would have the value 0,
indicating that a reload is only done when the WSGI script file is touched. If
not 0, it is the number of seconds between checks of the files. In the case of
inotify, it would be the time between refreshes of the file subscriptions. A
change made between refreshes would still be detected, because inotify relies
on poll() on a special file descriptor rather than on stat() calls made while
doing the loop.
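
The intended semantics of the interval can be illustrated with a small Python
sketch. The reloader-interval option is only a proposal in this issue, and
start_monitor is a hypothetical helper, not an existing mod_wsgi function.

```python
import threading


def start_monitor(interval, check, stop_event):
    """Start a periodic checker, or do nothing when interval is 0.

    interval: seconds between checks; 0 means the reloader is disabled.
    check: callable run once per interval in a background thread.
    stop_event: threading.Event used to shut the thread down.
    """
    if interval <= 0:
        return None

    def run():
        # Event.wait() returns False on timeout and True once the event
        # is set, so the loop ends promptly when a stop is requested.
        while not stop_event.wait(interval):
            check()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread
```

With the stat() fallback, check would run the scan over the files. With
inotify, it would only need to register the paths of newly loaded modules.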

Original issue reported on code.google.com by Graham.Dumpleton@gmail.com on 27 Mar 2009 at 5:15

GoogleCodeExporter commented 8 years ago
Links related to inotify and its use in Django autoreloader.

http://code.djangoproject.com/ticket/9722
http://pyinotify.sourceforge.net/

The pyinotify module could be used now in a variant of the reloader described
in the mod_wsgi documentation.

Original comment by Graham.Dumpleton@gmail.com on 27 Mar 2009 at 5:44

GoogleCodeExporter commented 8 years ago
The stat() based checking could be optimised. First off, once a change has
been found and the flag set to indicate that a restart is required, the checks
can stop for the remaining period until a request arrives, as it is already
known that a restart is required.

The restart flag, though, should perhaps be per interpreter, as there is no
need for a request against a different interpreter to trigger a full restart
if nothing in that interpreter has changed. This does mean you still have to
do the periodic checks, but you can skip interpreters which are already
flagged as being in a restartable condition.
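
A per-interpreter variant of the flag could be sketched as below. The
InterpreterFlags class and its method names are hypothetical.

```python
class InterpreterFlags:
    """Track which interpreters already have a restart pending."""

    def __init__(self):
        self._flagged = set()

    def flag(self, name):
        # Record that a file used by this interpreter has changed.
        self._flagged.add(name)

    def to_check(self, names):
        # Interpreters already flagged need no further periodic checks.
        return [n for n in names if n not in self._flagged]

    def restart_needed(self, name):
        # Only a request against a flagged interpreter triggers a restart.
        return name in self._flagged
```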

In the case of a process which isn't receiving any requests at all, there is
no point doing the checks. Thus, if some multiple of the check interval has
passed without any requests having been received by the process, it can stop
performing any checks. The next time a request arrives, though, a check should
be forced. This check would run in the context of that request, so the request
would be delayed momentarily while the check is done and it is determined
whether a restart is required. While doing this manual check, it would be
necessary to hold up any other requests which come in at the same time. This
may sound bad, but remember that this is a process which hadn't received any
requests for a while and so was otherwise idle. If it is determined that no
restart is required, proceed as normal and resume the periodic checks until a
change is detected or the process becomes idle again, at which time the checks
stop once more.
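
The idle handling could be sketched as follows. The IdleAwareChecker class,
the idle_multiple parameter, and its default of 5 are assumptions made for
illustration. The lock is what holds up concurrent requests while a forced
check runs.

```python
import threading
import time


class IdleAwareChecker:
    """Suspend periodic checks while the process is idle."""

    def __init__(self, interval, idle_multiple=5, clock=time.monotonic):
        self.idle_after = interval * idle_multiple
        self.clock = clock
        self.last_request = clock()
        self.suspended = False
        self._lock = threading.Lock()

    def on_tick(self, check):
        """Called by the monitor thread once per check interval."""
        with self._lock:
            if self.suspended:
                return None
            if self.clock() - self.last_request >= self.idle_after:
                self.suspended = True  # idle: stop checking
                return None
            return check()

    def on_request(self, check):
        """Called at the start of each request."""
        with self._lock:
            was_suspended = self.suspended
            self.suspended = False
            self.last_request = self.clock()
            # Force a check only when periodic checks had stopped. Other
            # requests block on the lock until it completes.
            return check() if was_suspended else None
```

Here check is any callable that returns whether a restart is required, such as
one pass of the file scan.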

Original comment by Graham.Dumpleton@gmail.com on 27 Mar 2009 at 9:32

GoogleCodeExporter commented 8 years ago
This is being done by mod_wsgi express in 4.X.

Original comment by Graham.Dumpleton@gmail.com on 17 Sep 2014 at 3:50