wimleers / fileconveyor

File Conveyor is a daemon written in Python to detect, process and sync files. In particular, it's designed to sync files to CDNs. Amazon S3 and Rackspace Cloud Files, as well as any Origin Pull or (S)FTP Push CDN, are supported. Originally written for my bachelor thesis at Hasselt University in Belgium.
https://wimleers.com/fileconveyor
The Unlicense

Newly created files that are immediately deleted may cause File Conveyor to crash #40

Closed sgifford closed 13 years ago

sgifford commented 13 years ago

We have svn on a cron job doing updates every minute on a directory. That directory is successfully synced to Amazon by fileconveyor, but we keep seeing the following in the logs every few minutes. It borks fileconveyor completely, so we need a cron job that constantly restarts it.

It seems that, for whatever reason, fsmonitor isn't adhering to ignoredDirs. However, those svn files are not being pushed to Amazon.

<?xml version="1.0" encoding="UTF-8"?>
<config>
  <!-- Sources -->
  <sources ignoredDirs="CVS:.svn">
    <source name="interface" scanPath="/pathtointerface"/>

Traceback (most recent call last):
  File "/pythonlocation/ActivePython-2.5.5.7-linux-x86_64/INSTALLDIR/lib/python2.5/threading.py", line 488, in __bootstrap_inner
    self.run()
  File "build/bdist.linux-x86_64/egg/pyinotify.py", line 1415, in run
    self.loop()
  File "build/bdist.linux-x86_64/egg/pyinotify.py", line 1401, in loop
    self.process_events()
  File "build/bdist.linux-x86_64/egg/pyinotify.py", line 1185, in process_events
    watch_.proc_fun(revent)  # user processings
  File "build/bdist.linux-x86_64/egg/pyinotify.py", line 831, in __call__
    return _ProcessEvent.__call__(self, event)
  File "build/bdist.linux-x86_64/egg/pyinotify.py", line 562, in __call__
    return meth(event)
  File "/opt/fileconveyor/code/fsmonitor_inotify.py", line 164, in process_IN_MODIFY
    FSMonitor.trigger_event(self.fsmonitor_ref, event.path, event.pathname, FSMonitor.MODIFIED)
  File "/opt/fileconveyor/code/fsmonitor.py", line 142, in trigger_event
    self.callback(monitored_path, event_path, event)
  File "/opt/fileconveyor/code/arbitrator.py", line 880, in fsmonitor_callback
    if stat.S_ISDIR(os.stat(event_path)[stat.ST_MODE]):
OSError: [Errno 2] No such file or directory: '/somdir/.svn/tmp/entries'

wimleers commented 13 years ago

File Conveyor does adhere to ignoredDirs. It's just that pyinotify is incapable of ignoring certain directories, so File Conveyor does some very basic checks on every event that pyinotify reports to it.
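To illustrate the kind of basic check meant here, a rough sketch follows. It is not File Conveyor's actual code: the helper name `in_ignored_dir` is hypothetical, and it assumes `ignoredDirs` is a colon-separated list of directory names, as the `CVS:.svn` value in the config above suggests.

```python
import os


def in_ignored_dir(event_path, ignored_dirs):
    """Return True if any component of event_path is an ignored name.

    Hypothetical helper. Note that it matches every path component,
    so a regular file that happens to be named "CVS" is skipped too.
    """
    components = os.path.normpath(event_path).split(os.sep)
    return any(component in ignored_dirs for component in components)


# Assumption: the ignoredDirs attribute is split on ":".
ignored = "CVS:.svn".split(":")
```

With a filter like this applied before any other processing, an event for a path under `.svn` would be dropped before the callback ever calls `os.stat()` on it.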

The strange thing is that this seems to be happening despite the fact that the code is actually checking if the file still exists:

    # The file may have already been deleted!
    deleted = event == FSMonitor.DELETED
    touched = event == FSMonitor.CREATED or event == FSMonitor.MODIFIED
    if deleted or (touched and os.path.exists(event_path)):
        # Ignore directories (we cannot test deleted files to see if they
        # are directories, because they obviously don't exist anymore).
        if touched:
            if stat.S_ISDIR(os.stat(event_path)[stat.ST_MODE]):
                return

I guess the file still existed when os.path.exists() was called (hence passing the first if test), but was deleted before os.stat() was called (hence failing at the third if test). There is a small window between those two calls in which the file can disappear.

I'll definitely have to make this more robust.

sgifford commented 13 years ago

That makes sense. SVN creates the file for only an instant while it checks what needs to be checked out. This may have to be solved by catching the exception.
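A minimal sketch of what catching the exception could look like, not the fix that was actually committed. The helper name `is_directory` is hypothetical. It uses `except ... as` syntax, which needs Python 2.6 or later; on the Python 2.5 shown in the traceback it would be written `except OSError, e:`.

```python
import errno
import os
import stat


def is_directory(path):
    """Return True or False, or None if the path no longer exists.

    Hypothetical helper: a single os.stat() call replaces the separate
    os.path.exists() check, so there is no gap in which the file can
    vanish between the check and the stat.
    """
    try:
        mode = os.stat(path)[stat.ST_MODE]
    except OSError as e:
        if e.errno == errno.ENOENT:
            # The file was deleted before we could stat it.
            return None
        # Any other error (e.g. permissions) is still a real problem.
        raise
    return stat.S_ISDIR(mode)
```

In the callback, a `None` result for a created or modified event could be treated the same as a directory and ignored. That assumes a vanished file needs no further processing, which may not hold for every setup.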