gmethvin / directory-watcher

A cross-platform Java recursive directory watcher, with a JNA macOS watcher and Scala better-files integration
Apache License 2.0

JVM Crash on shutdown #15

Closed caseydawsonjordan closed 4 years ago

caseydawsonjordan commented 6 years ago

Hi, we are experimenting with this system because we desperately need something that works across multiple OS's, so let me first say thank you for making this project public.

Everything works great, except that on shutdown we get a JVM crash. Would you be willing to help us work through this issue? Thanks!

Disconnected from the target VM, address: '127.0.0.1:62174', transport: 'socket'
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fff97e53fb8, pid=51039, tid=0x000000000000db03
#
# JRE version: Java(TM) SE Runtime Environment (8.0_171-b11) (build 1.8.0_171-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.171-b11 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# C  [CoreFoundation+0xeafb8]  CFMachPortInvalidate+0x58
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# hs_err_pid51039.log
[thread 48131 also had an error]
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)
gmethvin commented 6 years ago

Do you have a sample project that exhibits the issue?

gmethvin commented 5 years ago

@caseydawsonjordan Just wondering if you're still seeing this problem, and if you could provide any sample code and steps to reproduce the problem. I'd like to investigate what's causing this.

caseydawsonjordan commented 5 years ago

Hi @gmethvin, I am still experiencing this issue. Unfortunately the code is part of a much larger codebase, so I would have to try to extract it to show you. I can, however, share a portion of the code to see if anything jumps out at you. Would you be willing to screenshare sometime to take a look together?

gmethvin commented 5 years ago

I'd prefer if you had a code sample I could look at first.

Also, how are you shutting down the watcher? Are you calling close() when you're done with it? If not, does adding that call help?

If you want to discuss more privately you can contact me at the email on my profile.
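A minimal sketch of that shutdown pattern, using a placeholder AutoCloseable class standing in for the watcher (the names here are illustrative, not the library's API): try-with-resources guarantees close() runs even if an exception is thrown while watching.

```java
// Placeholder resource standing in for a watcher; names are illustrative only.
class FakeWatcher implements AutoCloseable {
    boolean closed = false;

    void watchAsync() {
        // would start watching in a background thread
    }

    @Override
    public void close() {
        closed = true; // release native resources exactly once
    }
}

public class ShutdownExample {
    public static void main(String[] args) {
        FakeWatcher w = new FakeWatcher();
        // try-with-resources guarantees close() runs, even on exceptions
        try (w) {
            w.watchAsync();
        }
        System.out.println("closed=" + w.closed);
    }
}
```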

stuarthendren commented 4 years ago

Hi @gmethvin, I hit the same problem when watchers are created and closed rapidly on the same folder. This simple JUnit 5 test reproduces the issue on OpenJDK 13.0.1 and macOS Mojave 10.14.6.

  import java.io.IOException;
  import java.nio.file.Path;

  import org.junit.jupiter.api.Test;
  import org.junit.jupiter.api.io.TempDir;

  import io.methvin.watcher.DirectoryWatcher;

  @Test
  public void testCrash(@TempDir Path root) throws IOException {
    DirectoryWatcher directoryWatcher1 = DirectoryWatcher.builder().path(root).build();
    directoryWatcher1.watchAsync();
    directoryWatcher1.close();

    DirectoryWatcher directoryWatcher2 = DirectoryWatcher.builder().path(root).build();
    directoryWatcher2.watchAsync();
    directoryWatcher2.close();
  }

I was able to work around it, but I saw the issue here and thought I'd supply the test case in case you want to look at it.

I'm finding the library very useful; thank you for open-sourcing it.

gmethvin commented 4 years ago

@stuarthendren What workaround did you end up finding?

stuarthendren commented 4 years ago

Nothing clever or interesting, I'm afraid. The rapid watch/close/repeat sequence was caused by multiple settings-change notifications firing in succession. I simply stopped that happening by sending a single notification for all the settings changes at once. The underlying crash could certainly still happen, though.
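A generic way to coalesce such a burst of notifications into a single event is a small debouncer built on ScheduledExecutorService. This is a hypothetical helper sketching the idea, not code from the project or from my workaround:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Coalesces a rapid burst of notifications into one callback after a quiet period.
class Debouncer {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();
    private ScheduledFuture<?> pending;

    synchronized void submit(Runnable task, long quietMillis) {
        if (pending != null) {
            pending.cancel(false); // drop the earlier, superseded notification
        }
        pending = scheduler.schedule(task, quietMillis, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        scheduler.shutdown();
    }
}

public class DebounceExample {
    public static void main(String[] args) throws InterruptedException {
        AtomicInteger fired = new AtomicInteger();
        Debouncer debouncer = new Debouncer();

        // Five settings changes in quick succession...
        for (int i = 0; i < 5; i++) {
            debouncer.submit(fired::incrementAndGet, 100);
        }
        Thread.sleep(300); // ...result in a single restart of the watcher
        System.out.println("fired=" + fired.get());
        debouncer.shutdown();
    }
}
```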

gmethvin commented 4 years ago

I think I have an idea of what's going on. When you call close() quickly after calling watchAsync(), the watch service probably hasn't finished allocating resources in the other thread. It would probably help to synchronize those operations: ideally, if the run loop hasn't started yet we should be able to cancel execution, and if close() is called while startup is in progress we should wait for it to finish before shutting down.
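The start/close handshake described above can be sketched generically with a CountDownLatch, so close() never tears down resources mid-startup. This is a placeholder class illustrating the idea, not the library's actual internals:

```java
import java.util.concurrent.CountDownLatch;

// Placeholder illustrating the start/close handshake, not the library's code.
class SafeWatcher implements AutoCloseable {
    private final CountDownLatch started = new CountDownLatch(1);
    private volatile boolean running = false;
    private Thread loop;

    void watchAsync() {
        loop = new Thread(() -> {
            // ... allocate native resources, register the run loop ...
            running = true;
            started.countDown(); // signal that startup is complete
            while (running) {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    return; // shutdown requested
                }
            }
        });
        loop.start();
    }

    @Override
    public void close() throws InterruptedException {
        started.await(); // never tear down resources mid-startup
        running = false;
        loop.interrupt();
        loop.join();
    }
}

public class HandshakeExample {
    public static void main(String[] args) throws InterruptedException {
        SafeWatcher w = new SafeWatcher();
        w.watchAsync();
        w.close(); // an immediate close is now safe
        System.out.println("shut down cleanly");
    }
}
```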

gmethvin commented 4 years ago

@stuarthendren I think I fixed at least one issue that would occur when rapidly creating and closing watchers. I'm not sure if it will fix every instance of this problem though. I released 0.9.8 if you want to try it out.

stuarthendren commented 4 years ago

That seems to fix it for me. The test case passes, and I've not seen any other occurrences of the crash, which had been happening intermittently in my tests.

It wasn't my issue originally but I'd be happy for it to be closed.

Thanks for fixing it.