wimleers / fileconveyor

File Conveyor is a daemon written in Python to detect, process and sync files. In particular, it's designed to sync files to CDNs. Amazon S3 and Rackspace Cloud Files, as well as any Origin Pull or (S)FTP Push CDN, are supported. Originally written for my bachelor thesis at Hasselt University in Belgium.
https://wimleers.com/fileconveyor
The Unlicense

Fileconveyor crashes on every file change with vim #141

Open halburgiss opened 11 years ago

halburgiss commented 11 years ago

This is a brand-new installation on Ubuntu 12.04 with Python 2.7 (a Rackspace cloud server with an ext3 filesystem; Linux dbs4 2.6.35.4-rscloud #8 SMP x86_64), following some instructions posted by Rackspace for their CDN. Installation seemed to go fine and the program starts fine (there is no initial sync of files, but I can't tell whether that is supposed to happen or not). Then, if I manually change any file, the following error occurs; the program keeps running but seems to be dead afterwards. The '4913' is consistent no matter what the real filename is, which should be an alphanumeric filename like functions.php or settings.php. The path is valid.

http://pastie.org/7467590

halburgiss commented 11 years ago

Update: I wasn't sure about the initial file syncing, but that is now working when the app is started; I just didn't know enough to set up the rules correctly and had followed somebody else's example. The same crash still occurs, though, any time any file is changed (with vim), with the same '4913' appearing where the filename should be.

halburgiss commented 11 years ago

I have been poking around in the code and playing with this, and I know what is causing the error. I was using vim to make file changes in my testing. It seems vim creates some kind of short-lived tmp file named '4913' (why, I don't know). A short proof of concept: http://pastie.org/7530640. It creates the tmp file and then does an IN_MOVED_FROM, so it's vim-specific.
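
For illustration, a hedged reconstruction of that behaviour: vim briefly creates a file literally named '4913' to check that it can create files in the directory, then removes it again. The directory below is purely illustrative.

```python
import os

# Simulate vim's probe: create a file named '4913', then delete it.
# By the time a watcher reacts to the event, the file is usually gone.
probe = os.path.join('/tmp/watched', '4913')  # illustrative directory

fd = os.open(probe, os.O_WRONLY | os.O_CREAT, 0644)
os.close(fd)
os.unlink(probe)
```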

An example inotifywait one-liner to avoid this strangeness (a directory to watch would still need to be appended):

inotifywait -q -q -r -e 'close_write' --exclude '^\..*\.sw[px]*$|4913|~$' 

Some enlightenment: http://stackoverflow.com/questions/10300835/too-many-inotify-events-while-editing-in-vim

I don't know enough python to get into trouble, or I would have included a patch.
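
For reference, a rough Python counterpart to that inotifywait one-liner, using pyinotify (which File Conveyor already uses for monitoring on Linux); the watched path and the exact exclusion regex here are illustrative, not from the project:

```python
import re
import pyinotify

# Filenames to ignore: vim swap files (.foo.swp / .foo.swx), vim's
# '4913' probe file, and backup files ending in '~'.
IGNORE = re.compile(r'^\..*\.sw[px]*$|^4913$|~$')

class Handler(pyinotify.ProcessEvent):
    def process_IN_CLOSE_WRITE(self, event):
        if IGNORE.search(event.name):
            return  # skip vim's temporary files
        print "changed: %s" % event.pathname

wm = pyinotify.WatchManager()
notifier = pyinotify.Notifier(wm, Handler())
wm.add_watch('/var/www', pyinotify.IN_CLOSE_WRITE, rec=True)  # illustrative path
notifier.loop()
```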

chrisivens commented 11 years ago

That's something I've come across too but haven't tried to fix; I've just avoided using vim to edit/test.

Chris


halburgiss commented 11 years ago

A similar issue, I think, is caused by the WordPress plugin WP Super Cache, which seems to create short-lived tmp files and then destroy them. I don't think this is hurting anything I am doing, since we are not syncing these files, but depending on the cache configuration it could generate a lot of noise.

Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/usr/local/lib/python2.7/dist-packages/pyinotify.py", line 1511, in run
    self.loop()
  File "/usr/local/lib/python2.7/dist-packages/pyinotify.py", line 1497, in loop
    self.process_events()
  File "/usr/local/lib/python2.7/dist-packages/pyinotify.py", line 1293, in process_events
    watch_.proc_fun(revent)  # user processings
  File "/usr/local/lib/python2.7/dist-packages/pyinotify.py", line 937, in __call__
    return _ProcessEvent.__call__(self, event)
  File "/usr/local/lib/python2.7/dist-packages/pyinotify.py", line 662, in __call__
    return meth(event)
  File "/usr/local/bin/fileconveyor/fsmonitor_inotify.py", line 266, in process_IN_MODIFY
    self.__update_pathscanner_db(event.pathname, FSMonitor.MODIFIED)
  File "/usr/local/bin/fileconveyor/fsmonitor_inotify.py", line 211, in __update_pathscanner_db
    st = os.stat(pathname)
OSError: [Errno 2] No such file or directory: '/home/clients/example.com/htdocs/wp-content/cache/supercache/www.example.com/blog/2010/08/25/implications-of-facebook-places/631485558516f1d233ebde9.20846877.tmp'
wimleers commented 11 years ago

I agree, #143 is probably a duplicate.

As per that StackOverflow link, I'm wondering if changing this in fsmonitor_inotify.py would fix things:

 FSMonitor.CREATED             : pyinotify.IN_CREATE,

to

 FSMonitor.CREATED             : pyinotify.IN_CLOSE_WRITE,
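
For clarity, a sketch of what that remapping could look like; the FSMonitor constants below are placeholders standing in for the real ones in fsmonitor.py, and only the CREATED entry is shown changed:

```python
import pyinotify

class FSMonitor(object):
    # Placeholder constants; the real ones live in fsmonitor.py.
    CREATED, MODIFIED, DELETED = range(3)

EVENTMAPPING = {
    FSMonitor.CREATED  : pyinotify.IN_CLOSE_WRITE,  # was: pyinotify.IN_CREATE
    FSMonitor.MODIFIED : pyinotify.IN_MODIFY,
    FSMonitor.DELETED  : pyinotify.IN_DELETE,
}
```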

Also note that there already is code to handle this edge case, in arbitrator.py/__process_filter_queue():

            # The file may have already been deleted, e.g. when the file was
            # moved from the pipeline list into the pipeline queue after the
            # application was interrupted. When that's the case, drop the
            # file from the pipeline.
            touched = event == FSMonitor.CREATED or event == FSMonitor.MODIFIED
            if touched and not os.path.exists(input_file):
                self.lock.acquire()
                self.files_in_pipeline.remove((input_file, event))
                self.lock.release()
                self.logger.info("Filtering: dropped '%s' because it no longer exists." % (input_file))
                continue

But that covers just "short-lived files", not "extremely short-lived files": the crashes outlined here occur in fsmonitor_inotify.py itself!

So I think another potential solution could be to introduce logic similar to that in arbitrator.py/__process_filter_queue() into fsmonitor_inotify.py.
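
A minimal sketch of that approach, as a standalone helper (safe_stat is a hypothetical name, not an existing File Conveyor function):

```python
import os

def safe_stat(pathname, logger=None):
    """Return os.stat(pathname), or None if the file has already vanished
    (e.g. vim's '4913' probe file or a short-lived cache .tmp file)."""
    try:
        return os.stat(pathname)
    except OSError:
        if logger is not None:
            logger.info("Ignoring '%s': it no longer exists." % pathname)
        return None
```

__update_pathscanner_db() could then call this instead of os.stat() directly and return early on None, so an extremely short-lived file would merely have its event dropped instead of killing the pyinotify thread.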