PreibischLab / BigStitcher

ImgLib2/BDV implementation of Stitching for large datasets
GNU General Public License v2.0

Speed of "Precomputing fused input images" #115

Open SternTomerGit opened 2 years ago

SternTomerGit commented 2 years ago

Hi!

I'm a regular user of this fantastic toolbox.

I'm using a very powerful GPU node with four GPUs, each with 40 GB of memory (160 GB in total). Each deconvolved image has either four or six views of ~2k × 1k × 200 pixels (uint16), which are fused into a single image of ~4 GB.

When I run "Precompute fused input images", this step is either very fast (~20 seconds) or very slow (~30 minutes). The fast case occurs only for the first image in the time-lapse; the remaining time points (2..end) are always slow.

Since the deconvolutions are computed independently, my sense is that there may be a GPU memory flush problem. Is there a way to check that, or can you propose a simple fix?
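One way to check for a memory-flush problem, independent of BigStitcher itself, is to log per-GPU memory with `nvidia-smi --query-gpu=timestamp,index,memory.used --format=csv -l 1 > gpu_mem.log` while the pipeline runs, then look at whether usage ever drops back to its baseline between time points. The helper below is a hypothetical diagnostic sketch (not part of BigStitcher) that summarizes such a log; the function name and the sample log are made up for illustration.

```python
import csv
import io

def peak_and_floor(csv_text):
    """Summarize nvidia-smi CSV samples (timestamp, index, memory.used [MiB]).

    Returns (peaks, floors): the maximum and minimum memory.used seen per GPU.
    If the floor stays close to the peak across the whole run, memory is
    likely never being released between time points."""
    peaks, floors = {}, {}
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip the header line and any malformed rows.
        if len(row) < 3 or not row[2].strip().split()[0].isdigit():
            continue
        gpu = row[1].strip()
        used = int(row[2].strip().split()[0])  # "39000 MiB" -> 39000
        peaks[gpu] = max(peaks.get(gpu, 0), used)
        floors[gpu] = min(floors.get(gpu, used), used)
    return peaks, floors

# Tiny synthetic log standing in for gpu_mem.log:
log = """timestamp, index, memory.used [MiB]
2023-01-01 00:00:01, 0, 512 MiB
2023-01-01 00:00:02, 0, 39000 MiB
2023-01-01 00:00:03, 0, 38950 MiB
"""
peaks, floors = peak_and_floor(log)
print(peaks, floors)  # -> {'0': 39000} {'0': 512}
```

Here the floor (512 MiB) returns to baseline, so memory is being released; if the reported floor stayed near 39 GB across time points 2..end, that would support the memory-flush hypothesis.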

Many thanks! Tomer