saalfeldlab / n5-ij

ImageJ convenience layer for N5
BSD 2-Clause "Simplified" License
14 stars 8 forks source link

Threadpool not being "joined" in n5 save()? #64

Closed JohnGrime closed 1 year ago

JohnGrime commented 1 year ago

Hi!

I've noticed some odd behaviour in a Python script that uses pyimagej to generate and run a BeanScript that invokes N5IJUtils.save(); if I pass a threadpool object to N5IJUtils.save(), the enclosing Python process never seems to actually exit - despite control reaching the end of the Python script.

For example, this invocation of N5IJUtils.save() works fine, with the output correctly generated and the enclosing Python script exiting as expected:

try {
  N5IJUtils.save(
      image_object,
      new N5FSWriter(filepath),
      dataset,
      new int[] {128, 128, 64},
      new GzipCompression()
  );
} catch (Exception e) {
  print(e);
}

However, this invocation results in the process never ending, despite the output being written and control reaching the end of the enclosing Python script:

try {
  N5IJUtils.save(
      image_object,
      new N5FSWriter(filepath),
      dataset,
      new int[] {128, 128, 64},
      new GzipCompression(),
      Executors.newFixedThreadPool(nthreads)
  );
} catch (Exception e) {
  print(e);
}

It seems like the threadpool spawns some sort of rogue background threads or process that's never being join'd?

Thanks!

bogovicj commented 1 year ago

Hi @JohnGrime ,

Quick answer, can you please try to do this (manually shutdown() your ExecutorService):

exec = Executors.newFixedThreadPool(nthreads);
try {
  N5IJUtils.save(
      image_object,
      new N5FSWriter(filepath),
      dataset,
      new int[] {128, 128, 64},
      new GzipCompression(),
      exec
  );
} catch (Exception e) {
  print(e);
}
exec.shutdown();

and let me know if that behaves the way you expect.,

Longer explanation: when an ExecutorService is passed to N5IJUtils, it is the outside caller's job to shut it down.

N5IJUtils.save can not know whether more jobs will be submitted to the ExecutorService it is using or whether calling code inspects it (to do progress monitoring, for example). Therefore, it passes it along to N5Utils which just submits jobs and gets Futures so that it the method returns when every block is saved, but the ExecutorService remains

JohnGrime commented 1 year ago

Works! Thanks, @bogovicj!

JohnGrime commented 1 year ago

Might it be worth adding a note in the docs under the "For Developers" heading?

That section uses a "dangling" threadpool object that's generated inline:

final ImagePlus imp = IJ.openImage( "/path/to/some.tif" );
N5IJUtils.save( imp, 
    new N5Factory().openWriter( "s3://myBucket/myContainer.n5" ), 
    "/myDataset", 
    new int[]{64, 64, 64},
    new GzipCompression(), 
    Executors.newFixedThreadPool( 4 ));
bogovicj commented 1 year ago

Works!

Excellent! :D

Might it be worth adding a note in the docs under the "For Developers" heading?

Agreed - I'll make a note in the example. (Will leave this issue open until that gets done). Thanks for the feedback!

cmhulbert commented 1 year ago

PR #66 address some similar issues (though as far as I can tell, not directly related to this)

bogovicj commented 1 year ago

README updated: https://github.com/saalfeldlab/n5-ij/commit/bbd935395f1476ecbe1ebbcd24bcf877e05c77a5

closing this issue.