gmethvin / directory-watcher

A cross-platform Java recursive directory watcher, with a JNA macOS watcher and Scala better-files integration
Apache License 2.0
264 stars 34 forks source link

Unable to detect file deletion events #55

Closed andregasser closed 1 year ago

andregasser commented 3 years ago

Hello Greg and community,

This is maybe more of an informal request rather than a real issue. But maybe you are able to provide some info on this nonetheless.

I am currently trying to integrate your directory watcher 0.10.0 into one of our software components. Integration was quite easy but while running our test suite, I've noticed that file deletion events somehow do not get propagated to our application, while at the same time, file creation events and file modification events work without issues. We cannot really explain this behaviour to ourselves.

We run your directory watcher in a separate thread and start it using the .watch() method.

I've built a minimal, single-threaded POC watching a directory. And there, everything seemed to be working normal (all events received, even the deletion events).

I am not sure what could cause those deletion events to not being generated. Do you have any idea? We're running on Java 13 and macOS Catalina 10.15.6 and macOS 11 Big Sur beta.

Any feedback highly appreciated.

Thanks! André

gmethvin commented 3 years ago

Have you tested on other platforms? If you tested on Linux, for example, then that would suggest the problem is with the macOS watching code, either in this library or perhaps some quirk of the OS itself.

Ideally we can come up with a small test case that exhibits the error. What differs between your app and the code in your POC?

andregasser commented 3 years ago

Hello Greg,

Thanks for your reply (and sorry for my late answer). I need to do some more testing on this. I will try to reproduce the issue in a simple test case. We use Windows and macOS.

Will get back to you (hopefully soon).

Cheers, André

andregasser commented 3 years ago

Hello Greg,

I am unfortunately unable to perform testing on this at the moment - sorry for that. There's just too much other work to be done by myself... :-( Should I find another occasion to do it I will get back to you. But for the mean time I think it is best to close this issue (up to you).

Have a good time. André

PhilGal commented 3 years ago

Hi @gmethvin I investigated the similar issue with the missing "deleted" events on OSX and here's the conclusion.

If the fileHashing is enabled, and the DELETED event is the first one triggered, we follow else brach in this path:

            if (fileHasher == null) {
              boolean isDirectory = directories.remove(childPath);
              // hashing is disabled, so just notify on the path we got the event for
              onEvent(EventType.DELETE, isDirectory, childPath, count, rootPath);
            } else {
              // hashing is enabled, so delete the hashes
              Set<Path> subtreePaths = PathUtils.subMap(pathHashes, childPath).keySet();
              for (Path path : subtreePaths) {
                boolean isDirectory = directories.remove(path);
                onEvent(EventType.DELETE, isDirectory, path, count, rootPath);
              }
              // this will remove from the original map
              subtreePaths.clear();
            }

and then Set<Path> subtreePaths = PathUtils.subMap(pathHashes, childPath).keySet(); returns an empty Set and onEvent is never called. As far as I can see, the pathHashes are only populated in CREATE and MODIFY branches.

I set .fileHashing(false) and everything became fine.

J-N-K commented 2 years ago

Maybe it makes sense to hash the content of directories on registering them? This would ensure that always events are emitted for the content of deleted directories.

Otherwise the issue could be mitigated by checking if subtreePaths is empty and then emit the event like in the fileHasher == null case.

gmethvin commented 2 years ago

@J-N-K I don't recall if there was a reason not to hash the contents of directories on registering them. I think it was a performance optimization, as the hashing originally was meant to catch duplicate events, and for that purpose it wasn't necessary to hash on startup. But now it seems like we might need those hashes to handle deletion events properly.

If you're interested I'm happy to accept a PR.

gmethvin commented 2 years ago

Actually @J-N-K I forgot that we are hashing the contents of directories when we register them, in initWatcherState: https://github.com/gmethvin/directory-watcher/blob/4dbe16272da494ac1c7043307f3c22a185b49040/core/src/main/java/io/methvin/watcher/PathUtils.java#L66. Do you see something missing in the logic?

kwin commented 1 year ago

I am experiencing the same thing (i.e. file deletions not detected) and tried to add a test but for me test execution of existing tests with sbt already fails:

sbt test

[info] Run completed in 3 minutes, 35 seconds.
[info] Total number of tests run: 2
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 0, failed 2, canceled 0, ignored 0, pending 0
[info] *** 2 TESTS FAILED ***
[error] Failed: Total 32, Failed 22, Errors 0, Passed 4, Skipped 6
[error] Failed tests:
[error] Failed tests:
[error]     io.methvin.better.files.RecursiveFileMonitorSpec
[error]     io.methvin.watcher.changeset.ChangeSetTest
[error]     io.methvin.watcher.DirectoryWatcherFileTreeVisitorTest
[error]     io.methvin.watchservice.DirectoryWatcherTest
[error]     io.methvin.watcher.DirectoryWatcherOnDiskTest
[error] (directory-watcher / Test / test) sbt.TestsFailedException: Tests unsuccessful
[error] (directory-watcher-better-files / Test / test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 216 s (03:36), completed Oct 12, 2022 3:21:08 PM

This happens with sbt 1.7.1 (Azul Systems, Inc. Java 1.8.0_345) on Mac OS 12.6 on M1 Pro (Apple Silicon) Either those are false positives or indeed indicate that things don't work as expected on this particular OS version. Here are the results: TestResults.zip

@gmethvin With which Mac OS system are you usually testing?

gmethvin commented 1 year ago

@kwin is there any other output from the tests? do you see any error messages?

I tested with the same macOS (12.6) and JDK (Zulu 1.8.0_345) and everything passes for me.

For comparison I also tested locally on JDK 17 and also had no issue.

On the CI in GitHub actions we are using macos-latest. Not sure exactly what version that is.

simongray commented 1 year ago

I am using https://github.com/nextjournal/beholder which is a downstream consumer of this library. I do not receive delete events either unless the file was created after starting the DirectoryChangeListener. I am using Java 18, macOS 12.6, and Apple Silicon (just like @kwin).

Perhaps this is an Apple Silicon-related issue rather than a macOS issue?

gmethvin commented 1 year ago

@simongray thanks, it was helpful that you mentioned you do not receive delete events for files created before starting the watcher. It seems like our existing test cases for file deletion only considered files created after the watcher was started. I created a pull request with some additional tests: https://github.com/gmethvin/directory-watcher/pull/84.

In that PR I'm seeing the deleteAlreadyExistingFiles* tests failing across all platforms, which suggests the issue is not specific to macOS or Apple Silicon.

I haven't had a chance to investigate the cause yet, but now we have a failing test it should be easier to understand what's going on.

simongray commented 1 year ago

That's great to hear, @gmethvin!

gmethvin commented 1 year ago

I found an issue that happens if you initialize a DirectoryWatcher then call watch/watchAsync later after new files have been created. The watcher state is initialized in the constructor, but not updated until you actually start watching, so it can miss files that are created after the watcher is created but before it starts watching. Then if the deletion event comes in later it will see the file does not exist in its internal data structure and assume it is a duplicate.

I don't know if this is the exact issue everyone on this ticket is seeing though. This might happen in other situations where DirectoryWatcher's internal data structure gets into an inconsistent state for some other reason.

gmethvin commented 1 year ago

@simongray I took a look at beholder; it appears to be creating the watcher right before it starts watching, which means it should know about any files that exist at that time. So it doesn't seem like the issue you're seeing is the same as the one I just found.

simongray commented 1 year ago

If you release a new version, I can try to work out what's going wrong in beholder using that version instead.

gmethvin commented 1 year ago

I just released 0.17.0 with the fix in the above PR.

If that doesn't fix it, can you provide some sample code for a case where it doesn't work?

simongray commented 1 year ago

I just tested it using 0.17.0 and the same issue is present.

@gmethvin can you provide some sample code for a case where it doesn't work?

... in the beholder source code using Clojure Java interop:

(-> (DirectoryWatcher/builder)
    (.paths (map to-path paths))
    (.listener (fn->listener cb))
    (.build))

A DirectoryWatcher instance is built with a list of paths (just one item in my case) and a DirectoryChangeListener provided to the listener() method. Starting the process with watchAsync() and watch() exhibits the same behaviour, i.e. file deletions are not registered unless the file was created after starting the watch process.

gmethvin commented 1 year ago

Can you write a test that exhibits that behavior? I think what it's doing is basically the same as what I'm doing in the DirectoryWatcher tests, and those work fine.

simongray commented 1 year ago

I guess you prefer it as Java? It's been a while...

gmethvin commented 1 year ago

Really any language will work for now. If you can write a program that reproduces the issue and produces a clear pass/fail output, then we an easily run it in different environments to better understand the problem. Once we do that we can figure out how to translate that program into a test on this project and hopefully a fix.

simongray commented 1 year ago

@gmethvin I hope this helps: https://github.com/simongray/beholder-deletion-issue

gmethvin commented 1 year ago

Thanks @simongray. That was helpful, though after looking at this some more I technically shouldn't have needed your script to reproduce the issue. I noticed that the beholder library uses the default settings, which enables file hashing, and the test script I had been using does not enable hashing.

It appears that when hashing is enabled, DirectoryWatcher sees the DELETE event but ignores it, probably because the hash data structure is not in the state it expects. When hashing is disabled, I do see the event come through as expected.

For my own testing I wrote this ammonite script:

import $ivy.`ch.qos.logback:logback-classic:1.3.4`
import $ivy.`io.methvin:directory-watcher:0.17.0`

import io.methvin.watcher._
import java.nio.file._

@main
def main(dir: String, hashing: Boolean = false) = {
  def zzz = Thread.sleep(1000)
  def msg(str: String) = println(s"\n$str\n")

  val watchdir = Paths.get(dir)
  if (!Files.exists(watchdir)) Files.createDirectory(watchdir)

  def create(name: String) = {
    msg(s"CREATING $name")
    Files.write(watchdir.resolve(name), name.getBytes)
  }
  def delete(name: String) = {
    msg(s"DELETING $name")
    Files.delete(watchdir.resolve(name))
  }

  create("before-watch")
  zzz
  println(s"Starting watcher on $watchdir with hashing $hashing")
  DirectoryWatcher.builder()
    .path(watchdir)
    .fileHashing(hashing)
    .listener(ev => msg(s"LISTENER: ${ev.eventType} ${ev.path}"))
    .build()
    .watchAsync()
  zzz
  create("after-watch")
  zzz
  delete("before-watch")
  zzz
  delete("after-watch")
  zzz
  println("done")
}

when I run without hashing:


CREATING before-watch

Starting watcher on testdir with hashing false
02:59:39.837 [main] DEBUG io.methvin.watcher.DirectoryWatcher - Registering [testdir].

CREATING after-watch

02:59:41.024 [ForkJoinPool.commonPool-worker-3] DEBUG io.methvin.watcher.DirectoryWatcher - ENTRY_CREATE [/Users/greg/scripts/testdir/after-watch]
02:59:41.026 [ForkJoinPool.commonPool-worker-3] DEBUG io.methvin.watcher.DirectoryWatcher - -> CREATE [/Users/greg/scripts/testdir/after-watch] (isDirectory: false)

LISTENER: CREATE /Users/greg/scripts/testdir/after-watch

02:59:41.026 [ForkJoinPool.commonPool-worker-3] DEBUG io.methvin.watcher.DirectoryWatcher - ENTRY_MODIFY [/Users/greg/scripts/testdir/after-watch]
02:59:41.026 [ForkJoinPool.commonPool-worker-3] DEBUG io.methvin.watcher.DirectoryWatcher - -> MODIFY [/Users/greg/scripts/testdir/after-watch] (isDirectory: false)

LISTENER: MODIFY /Users/greg/scripts/testdir/after-watch

DELETING before-watch

02:59:42.022 [ForkJoinPool.commonPool-worker-3] DEBUG io.methvin.watcher.DirectoryWatcher - ENTRY_DELETE [/Users/greg/scripts/testdir/before-watch]
02:59:42.023 [ForkJoinPool.commonPool-worker-3] DEBUG io.methvin.watcher.DirectoryWatcher - -> DELETE [/Users/greg/scripts/testdir/before-watch] (isDirectory: false)

LISTENER: DELETE /Users/greg/scripts/testdir/before-watch

DELETING after-watch

02:59:43.027 [ForkJoinPool.commonPool-worker-3] DEBUG io.methvin.watcher.DirectoryWatcher - ENTRY_DELETE [/Users/greg/scripts/testdir/after-watch]
02:59:43.027 [ForkJoinPool.commonPool-worker-3] DEBUG io.methvin.watcher.DirectoryWatcher - -> DELETE [/Users/greg/scripts/testdir/after-watch] (isDirectory: false)

LISTENER: DELETE /Users/greg/scripts/testdir/after-watch

done

and with hashing:


CREATING before-watch

Starting watcher on testdir with hashing true
02:59:51.104 [main] DEBUG io.methvin.watcher.DirectoryWatcher - Registering [testdir].

CREATING after-watch

02:59:52.211 [ForkJoinPool.commonPool-worker-3] DEBUG io.methvin.watcher.DirectoryWatcher - ENTRY_CREATE [/Users/greg/scripts/testdir/after-watch]
02:59:52.213 [ForkJoinPool.commonPool-worker-3] DEBUG io.methvin.watcher.DirectoryWatcher - -> CREATE [/Users/greg/scripts/testdir/after-watch] (isDirectory: false)

LISTENER: CREATE /Users/greg/scripts/testdir/after-watch

02:59:52.213 [ForkJoinPool.commonPool-worker-3] DEBUG io.methvin.watcher.DirectoryWatcher - ENTRY_MODIFY [/Users/greg/scripts/testdir/after-watch]

DELETING before-watch

02:59:53.212 [ForkJoinPool.commonPool-worker-3] DEBUG io.methvin.watcher.DirectoryWatcher - ENTRY_DELETE [/Users/greg/scripts/testdir/before-watch]

DELETING after-watch

02:59:54.219 [ForkJoinPool.commonPool-worker-3] DEBUG io.methvin.watcher.DirectoryWatcher - ENTRY_DELETE [/Users/greg/scripts/testdir/after-watch]
02:59:54.219 [ForkJoinPool.commonPool-worker-3] DEBUG io.methvin.watcher.DirectoryWatcher - -> DELETE [/Users/greg/scripts/testdir/after-watch] (isDirectory: false)

LISTENER: DELETE /Users/greg/scripts/testdir/after-watch

So, at least in this particular test, I'm able to reproduce the issue with hashing enabled. This could be because of an issue calculating the initial hashes, but I'll need to investigate more to understand why.

I would suggest setting .fileHashing(false) to see if that changes the behavior of your test script. That would at least confirm the behavior is consistent with what I'm seeing.

gmethvin commented 1 year ago

I think I discovered the issue.

This happens when two conditions are true:

  1. You pass one or more relative paths to the builder.
  2. Hashing is enabled.

Basically the issue is that the watcher initialization code uses the paths as hash keys as given—even if they are relative—but it is comparing against the absolute paths returned by the underlying watch service. The solution is to convert the paths to absolute before initializing the hash map.

gmethvin commented 1 year ago

@simongray I released 0.17.1 with the relative paths fix so you can try that and see if that fixes the issue for you.

simongray commented 1 year ago

Thanks, will check as soon as I have a moment :)

simongray commented 1 year ago

@gmethvin I can confirm that this has fixed the issue! Thank you so much.

gmethvin commented 1 year ago

Thanks @simongray for confirming!