Don't wait for generate_missed_events() to finish before starting FSMonitor

wimleers commented 13 years ago

Title says it all. This will prevent a long initial wait until the system is running.

What needs to change to support this? Simply starting FSMonitor first (either FSMonitorInotify or FSMonitorFSEvents) and then calling FSMonitor.generate_missed_events() is insufficient. Consider a file that was created while File Conveyor was not running, but was modified after it started (so the change is detected by either FSMonitorInotify or FSMonitorFSEvents). That file would first be processed by File Conveyor through inotify/FSEvents (as if it were a new file, because it's not yet in the DB of synced files), and would then be overwritten again due to the new event generated by FSMonitor.generate_missed_events().
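To make the double-processing problem concrete, here is a minimal simulation. It is only a sketch: the event constants, both function names, and the "skip paths already seen live" guard are hypothetical illustrations, not File Conveyor's actual code or the fix that was eventually committed.

```python
# Hypothetical stand-ins for the FSMonitor event constants.
CREATED, MODIFIED = "CREATED", "MODIFIED"


def process_naively(live_events, missed_events):
    """Handle live events, then scan results, with no de-duplication."""
    processed = []
    for path, event in live_events:
        processed.append((path, event, "live"))
    for path, event in missed_events:
        processed.append((path, event, "scan"))
    return processed


def process_with_guard(live_events, missed_events):
    """Skip scan events for paths the live monitor has already handled."""
    processed = []
    seen_live = set()
    for path, event in live_events:
        processed.append((path, event, "live"))
        seen_live.add(path)
    for path, event in missed_events:
        if path in seen_live:
            continue  # already handled via the live event; avoid a second sync
        processed.append((path, event, "scan"))
    return processed


# /foo/bar was created while the daemon was down and modified after startup.
live = [("/foo/bar", MODIFIED)]
# The startup scan finds /foo/bar as well as another new file.
missed = [("/foo/bar", CREATED), ("/foo/baz", CREATED)]

naive = process_naively(live, missed)
guarded = process_with_guard(live, missed)
```

In the naive ordering /foo/bar is handled twice (once per source), while the guarded version handles it once and still picks up /foo/baz from the scan.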

When implementing this issue, also take #68 into account!

wimleers commented 13 years ago

Related issue: #12.

wimleers commented 13 years ago

Detected & fixed #72 & #73 while working on this.

wimleers commented 13 years ago

Suppose this has been implemented.

Then, imagine the following situation:

While FC was not running, a new file was created: /foo/bar.
The FSMonitor.CREATED event for this file would then need to be generated by FSMonitor.generate_missed_events().
The file is modified before the aforementioned event is generated.
inotify picks this up and generates an FSMonitor.MODIFIED event.

This would then need to be mapped to an FSMonitor.CREATED event, and things could then proceed as in the current implementation. However, it is then possible that FSMonitor.generate_missed_events() eventually generates the FSMonitor.CREATED event as well:

it doesn't generate this event if the file modification event detected by inotify has already propagated to PathScanner's fsmonitor.db
it does generate this event if the file modification event detected by inotify has not yet propagated to PathScanner's fsmonitor.db
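A rough sketch of the mapping and of the two cases just listed. The names normalize_event, scan_should_emit_created and known_paths are hypothetical; known_paths stands in for a lookup against PathScanner's fsmonitor.db.

```python
# Hypothetical stand-ins for the FSMonitor event constants.
CREATED, MODIFIED, DELETED = "CREATED", "MODIFIED", "DELETED"


def normalize_event(path, event, known_paths):
    """Treat a MODIFIED event for a path with no DB row as a CREATED event."""
    if event == MODIFIED and path not in known_paths:
        return CREATED
    return event


def scan_should_emit_created(path, known_paths):
    """The startup scan only reports CREATED for paths not yet in the DB."""
    return path not in known_paths


known_paths = set()  # /foo/bar was created while the daemon was not running

# Live MODIFIED event arrives before the scan reaches the file.
mapped = normalize_event("/foo/bar", MODIFIED, known_paths)

# Case 1: the live event has not yet propagated to the DB -> scan still emits.
emits_before = scan_should_emit_created("/foo/bar", known_paths)

# Case 2: the live event has propagated to the DB -> scan stays silent.
known_paths.add("/foo/bar")
emits_after = scan_should_emit_created("/foo/bar", known_paths)

# Once the path is known, a later MODIFIED event is passed through unchanged.
later = normalize_event("/foo/bar", MODIFIED, known_paths)
```

The window between the mapping and the DB write is exactly where the duplicate CREATED event can appear, so whatever consumes these events has to tolerate seeing the same path twice.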

When the event has propagated to PathScanner's fsmonitor.db, it should not result in a call to PathScanner.update_files(), because that would result in a SQL error. In its current form, PathScanner.update_files() does an UPDATE … query, and if no row exists in the DB yet for this file, that query fails. Thus we'd either need to call PathScanner.add_files() or change PathScanner.update_files().

Hence, special care is necessary.
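One way the "change PathScanner.update_files()" option could look is to fall back to an insert when the update matches no row. This is a sketch using Python's sqlite3 module; the files table, its columns, and add_or_update_file are hypothetical and do not reflect PathScanner's real schema or API.

```python
import sqlite3


def add_or_update_file(conn, path, mtime):
    """Update the row for path, inserting it first if it does not exist yet."""
    cur = conn.execute(
        "UPDATE files SET mtime = ? WHERE path = ?", (mtime, path)
    )
    if cur.rowcount == 0:
        # No row matched: the file is not in the DB yet, so add it instead.
        conn.execute(
            "INSERT INTO files (path, mtime) VALUES (?, ?)", (path, mtime)
        )
    conn.commit()


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, mtime INTEGER)")

add_or_update_file(conn, "/foo/bar", 100)  # no row yet: takes the insert path
add_or_update_file(conn, "/foo/bar", 200)  # row exists: takes the update path

rows = conn.execute("SELECT path, mtime FROM files").fetchall()
```

With this shape, the caller no longer has to know whether the CREATED or the MODIFIED event reached the DB first.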

wimleers commented 13 years ago

Oops, I accidentally closed #69 through the commit that fixes this issue: https://github.com/wimleers/fileconveyor/issues/68#commits-ref-771adab.

Commit 771adab fixes this issue!

wimleers / fileconveyor

Don't wait for generate_missed_events() to finish before starting FSMonitor #69