LibraryOfCongress / bagit-java

Java library to support the BagIt specification.
Other
73 stars 49 forks source link

ExecutorService causes application to hang on termination #94

Closed rvanheest closed 7 years ago

rvanheest commented 7 years ago

In one of our (commandline) applications that is using BagIt as a dependency, we noticed the application hanging on termination. After hanging for exactly 60 seconds the process is killed automatically by the JVM. This behavior occurred after upgrading to BagIt 5.x.

I have a suspicion that the finalize method in PayloadVerifier (and BagVerifier) are causing the hanging behavior. When running the application with a debugger attached, the finalize methods of either class are never called... Could it be the case that this ExecutorService therefore leaves a number of threads open, which block the termination process?

Full disclosure: the project I'm talking about is demonstrating this behavior since https://github.com/DANS-KNAW/easy-bag-store/pull/29 (or this commit), which upgrades the code to BagIt 5.x. The specific call that causes this behavior can be found here.

If you need more information or you want me to test something, please let me know! I'll be happy to help out.

johnscancella commented 7 years ago

Thanks for submitting this, and your interest in the bagit-java library!

It looks like you are recreating the BagVerifier for each iteration in your loop(https://github.com/rvanheest-DANS-KNAW/easy-bag-store/blob/2252c14bc456fde4cb8c4429cd9ff9dc32ff7794/src/main/scala/nl.knaw.dans.easy.bagstore/BagFacadeComponent.scala#L95)

However, the BagVerifier and the other classes like it are meant to be singletons. Thus it is really inefficient to be recreating them in every iteration in your for loop.

rvanheest commented 7 years ago

Hi! Don't be misguided by the for in there. It's a for-comprehension as it is called in Scala. Basically it's syntactic sugar for monadic composition (operations like flatMap, map and filter) and here we're operating in a Try context (Scala's way of doing exception handling). So there is only one pass through and the BagVerifier is only created once.

While debugging I also created the instance at the start of this method, outside the for-comprehension, but that didn't matter.

Besides that, the enclosing method is only called once in the whole application run.

johnscancella commented 7 years ago

You are still creating multiple instances of BagVerifier, see line 95 and line 101.

You can also try using a different ExecutorService since by default it uses [CachedThreadPool](https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html#newCachedThreadPool()), which you can see from the documentation waits for 60 seconds of idle time before killing a thread.

rvanheest commented 7 years ago

Ah, I see what you mean! Here is a change to fix that in my code: I removed both instances of the BagVerifier and made it a lazy val, which is only instantiated when required for the first time. Now both places share the same instance of the BagVerifier. However, when testing this, the behavior of hanging for 60 seconds still remains. This isn't a solution! I guess it is all due to the fact that the ExecutorService/CachedThreadPool is never closed by calling shutdown. Most likely the finalize is never called...

johnscancella commented 7 years ago

There is nothing to stop you from calling shutdown() or shutdownNow() yourself since you can access the threadpools (by design).

Something like BagVerifier.getExecutor().shutdown() and BagVerifier.getManifestVerifier()getExecutor().shutdown()

However, I would find it highly inconsistent that Scala would forget to call the finalize method. More likely you are bumping into the fact that you can't force a thread to exit instantly on the JVM. https://stackoverflow.com/a/5242382/971169

rvanheest commented 7 years ago

You're right, when calling both shutdowns manually, the hanging behavior doesn't show up. However, I don't think it is good API design when you require your users to do so. It requires knowledge of the internal workings of the API, that preferably should be hidden for the user. It would be better to implement Closable or AutoClosable on these classes, such that the user can close the resources using a 'try-with-resource' construct.

For example:

in PayloadVerifier

@Override
public void close() throws Exception {
  executor.shutdown();
}

and for BagVerifier

@Override
public void close() throws Exception {
  executor.shutdown();
  manifestVerifier.close();
}

In defense of Scala, this language translate to Java Bytecode and is ran by the same JVM as Java code would be. So it can't be that. I still think it is the use of finalize. It is generally good practice to not override this method or rely on it being called! See for example Effective Java Item 7 and http://javarevisited.blogspot.nl/2012/03/finalize-method-in-java-tutorial.html. These say there is no guarantee that (or when) the finalizer is being called!

I got a temporary solution for now (https://github.com/DANS-KNAW/easy-bag-store/pull/30/commits/402108fc01db562aa17cf1ec508fd6f872d7691e), but I hope there will be a cleaner solution in a future version of this library.

johnscancella commented 7 years ago

Good point, in the next release it will use the AutoClosable interface which should fix your problem. Attached is the bagit-java library with the change. bagit-5.0.0-Jul-28-2017_13-48-32-SNAPSHOT.jar.txt

rvanheest commented 7 years ago

Hey, this looks great. Tested it with the provided jar and it works perfectly. The application terminates properly without hanging!

Thanks for the quick responses again. Always a pleasure to work with you!

johnscancella commented 7 years ago

great to hear. I will make a new release for this then (5.0.3)