septag / dmon

Single header C99 portable library for monitoring filesystem changes. (Windows/Linux/MacOS)
BSD 2-Clause "Simplified" License

changes inside newly created directory not reported on linux #12

Closed francesco-st closed 3 years ago

francesco-st commented 3 years ago

When a new directory is created inside a folder we watch recursively, the directory creation is correctly reported, but on Linux any change inside the new directory is not reported. The same thing works correctly on Windows.

Example:

  1. watch recursively directory /tmp
  2. mkdir /tmp/foo => directory creation correctly reported
  3. touch /tmp/foo/test.txt => file creation not reported

I think that on linux we just need to add a watch on each newly created directory if the recursive flag is enabled.

francesco-st commented 3 years ago

Maybe @jgmdev could have a look at that.

jgmdev commented 3 years ago

I think that on linux we just need to add a watch on each newly created directory if the recursive flag is enabled.

I already implemented that on the PR and it is working properly:

./test mydir
mkdir mydir/subdir
touch mydir/subdir/test.txt

Output:

waiting for changes ..
CREATE: [mydir/]mydir/subdir
CREATE: [mydir/]mydir/subdir/test.txt

https://user-images.githubusercontent.com/1702572/126503994-178e6b65-616e-4b1b-b857-f5c73dd51f46.mp4

Maybe you tested an old checkout? Also, creating new subdirectories in lite-xl and then files inside them is reported properly with the latest dmon:

https://user-images.githubusercontent.com/1702572/126502691-622134b1-8937-476a-8da7-c312ae4d614c.mp4

francesco-st commented 3 years ago

Thank you @jgmdev, I will check again on linux.

Maybe the problem is when you create a directory inside the newly created directory but I will check again on my side.

jgmdev commented 3 years ago

Maybe the problem is when you create a directory inside the newly created directory

Tested and works:

waiting for changes ..
CREATE: [mydir/]subdir
CREATE: [mydir/]subdir/subsub
CREATE: [mydir/]subdir/subsub/file.txt
CREATE: [mydir/]subdir/subsub/moresub
CREATE: [mydir/]subdir/subsub/moresub/test.txt

franko commented 3 years ago

I confirm the issue on Linux; it is just that I didn't report the correct sequence to reproduce it.

Actually it is very easy to reproduce the problem:

I think it may be a race condition between the OS directory creation and the watch you add on the newly created dir. By the moment you add a watch on "new_dir" you have already missed the events for the creation of "new_dir/foo".

I think this is an intrinsic problem of inotify on Linux. I don't know if a solution is possible. If inotify doesn't help, maybe the only solution would be to internally rescan newly created directories and send events if needed, but this is very tricky to implement.

Alternatively we may just leave this as it is and document this possibility so that people implement a solution on the application side. Still, a solution inside dmon would be much better if possible.

septag commented 3 years ago

I've done a little workaround for the problem you mentioned. Basically, dmon manually scans for child directories when a new directory is created and adds them to the events/watch list. Can you please check if this works as expected? In my experience, inotify behaves a little differently on each distro/version, sometimes even on each run, and is very unpredictable and annoying to work with. Also, this might solve the problem for your feature request #13.

There is currently one other thing to consider, which is copying/creating sub-directories with files. I can also ease the filtering and add files manually too, but if the sub-directories have many files, it might generate a lot of data and blow up, so I'm still not sure about including those. What is your opinion?

franko commented 3 years ago

I've done a little workaround for the problem you mentioned.

Thank you for taking the time to look at that, I appreciate it.

Basically, dmon manually scans for child directories when a new directory is created and adds them to the events/watch list. Can you please check if this works as expected? In my experience, inotify behaves a little differently on each distro/version, sometimes even on each run, and is very unpredictable and annoying to work with.

OK, I tested the new version and for very simple cases it now works. On the other hand, for complex cases, when directories with a lot of new files are created at once, it still fails.

For this latter case I could provide a complete test case, but even creating a test case is complex. What I do is:

It turns out we are still missing files. How does the application know we are missing files? Using the dmon events, the application incrementally updates an in-memory list of all the files. After a dmon event it schedules a rescan one second later. If the file list after the rescan differs from the list updated using dmon events, we know we missed some file events.

In other words, looking at the timing, what happens is:

In reality it could be the logic in my application that is wrong, but performing the same test on Windows I get a perfect match: on Windows dmon doesn't miss any file creation event.

Possible explanations

I guess your modification goes in the right direction, but getting something robust and accurate using inotify is really hard. Let's examine your timing:

  1. got an inotify event for a new directory
  2. dmon scans the newly created directory for files
  3. dmon adds a watch to the new directory

When you are in (2) but not yet in (3), files or directories can be created while you are scanning the directory, and you will receive no notifications because you haven't set the new watch yet. On the other hand, if you do (3) before (2), we will end up with duplicate create events because some of the files we scan may also be reported by inotify.

Also this might solve the problem for your feature request #13

No, it doesn't.

To be clear, for me fixing this problem is not important right now. You should just put a warning in the documentation saying that on inotify-based systems you may miss some events. Something along the lines of what is found in the man page here:

https://man7.org/linux/man-pages/man7/inotify.7.html

    With careful programming, an application can use inotify to
    efficiently monitor and cache the state of a set of filesystem
    objects.  However, robust applications should allow for the fact
    that bugs in the monitoring logic or races of the kind described
    below may leave the cache inconsistent with the filesystem state.
    It is probably wise to do some consistency checking, and rebuild
    the cache when inconsistencies are detected.

So every application should be prepared for the fact that incrementally updating a list of files based on inotify events is not reliable, and that an additional mechanism to rescan the file list is needed.

This is what I did in my application, so I am no longer sensitive to the problem I reported here.
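As a rough illustration of such a consistency check (not the code from my application), the sketch below compares the list maintained from events with the list obtained from a rescan. The function name `count_mismatches` is hypothetical, and it assumes neither list contains duplicate paths.

```c
#include <stdlib.h>
#include <string.h>

/* qsort comparator for arrays of C strings. */
static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Hypothetical check: sorts both lists in place, then returns the number of
 * paths that appear in exactly one of them. 0 means the event-tracked list
 * matches the rescan; anything else means events were missed (or spurious)
 * and the tracked list should be rebuilt from the rescan. */
static size_t count_mismatches(const char **tracked, size_t nt,
                               const char **scanned, size_t ns)
{
    size_t i = 0, j = 0, diff = 0;

    qsort(tracked, nt, sizeof *tracked, cmp_str);
    qsort(scanned, ns, sizeof *scanned, cmp_str);

    /* Walk both sorted lists in step, counting entries without a partner. */
    while (i < nt && j < ns) {
        int c = strcmp(tracked[i], scanned[j]);
        if (c == 0) {
            i++;
            j++;
        } else if (c < 0) {
            diff++;
            i++;
        } else {
            diff++;
            j++;
        }
    }
    return diff + (nt - i) + (ns - j);
}
```

When the result is nonzero, the simplest recovery is to replace the tracked list with the scanned one.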

There is currently one other thing to consider, which is copying/creating sub-directories with files. I can also ease the filtering and add files manually too, but if the sub-directories have many files, it might generate a lot of data and blow up, so I'm still not sure about including those. What is your opinion?

Well, because of what I said above we now have two ways:

  1. minimalist: you undo the change you made and we accept that we can miss file creation events when a new directory is created. The watch is added but no attempt is made to read the directory to list new files.
  2. fully accurate algorithm: needs a lot of additional work and logic to accurately report all events. We would need to keep a list of files and perform scans of the directory to spot files we may have missed.

From my point of view it is better to stay with (1) because (2) is too complicated. We just need to state in the documentation that some events may be missed.

franko commented 3 years ago

Closing this issue because it was addressed by your commit, @septag, and the points I raised are sort of philosophical.

I still think that on Linux there is a problem when you have a "burst" of nested directory and file creations/deletions all at once, but we may leave this for later if anyone ever wants to work on that.