rosalindfranklininstitute / RedLionfish

Apache License 2.0

large 3D arrays take long for deconvolution #12

Closed pr4deepr closed 5 months ago

pr4deepr commented 2 years ago

Hi, thanks for making this plugin available. When I use RedLionfish on large 3D arrays, the deconvolution takes extremely long. I've included an example notebook here for reference: https://github.com/BioimageAnalysisCoreWEHI/napari_lattice/blob/master/notebooks/deconvolution_RedFishLion.ipynb

From what I can tell, it looks like it chunks the data into very small 3D arrays. I had the same issue when trying it on a GPU with 10 GB memory.

Cheers, Pradeep

perdigao1 commented 2 years ago

Just checking now

perdigao1 commented 2 years ago

What is the size of the PSF (488.tif)? It says it is simulated. If you could share that file, that would also be great.

One thing you can also try is to add the following lines to your Jupyter notebook before running the code:

import logging
logging.basicConfig(level=logging.INFO)

It will output more information about how the calculation is progressing.

pr4deepr commented 2 years ago

Sorry here is the PSF: https://github.com/BioimageAnalysisCoreWEHI/napari_lattice/blob/master/sample_data/psfs/zeiss_simulated/488.tif

It was taking more than 2 hours for the large 3D array and still hadn't finished. I will run it with the logging and report back.

perdigao1 commented 2 years ago

Ok, I know now what is happening. This is trying to run on a laptop with an NVIDIA 1650 (4 GB).

RedLionfish tries to do the RL using the GPU, but it encounters a problem due to the size of your data:

data has shape (834, 300, 2048); psf has shape (93, 205, 205)

RedLionfish tries to do the RL iterations using the FFT routines, with the data and PSF as they are. It encounters MEM_OBJECT_ALLOCATION_FAILURE, so it automatically switches to block deconvolution mode.

It tries blocksize=512, setting blockshape to [512, 300, 512], and it fails, but I am not sure why. I will investigate.

It then tries blocksize=256 and sets blockshape to [256, 256, 256]. This is ok, but along the x-axis the PSF has width 205, and 205*1.2 = 246, which means the stride per block is only 256 - 246 = 10 pixels. As such it will perform (2048-256) / 10 calculations for a single z, which is too much.
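For illustration, the block-counting arithmetic described above can be sketched like this (the function and variable names are my own, not RedLionfish internals):

```python
import math

def blocks_along_axis(data_len, block_len, psf_len, margin=1.2):
    """Number of block positions needed to cover one axis.

    The usable stride per block is the block size minus a margin of
    1.2x the PSF size, so a large PSF leaves only a tiny stride.
    """
    stride = block_len - int(round(margin * psf_len))
    if stride <= 0:
        raise ValueError("block too small for this PSF")
    # ceil((data_len - block_len) / stride) + 1 positions cover the axis
    return math.ceil((data_len - block_len) / stride) + 1

# With the numbers above: x-axis 2048, block 256, PSF 205
# stride = 256 - 246 = 10 pixels
print(blocks_along_axis(2048, 256, 205))  # → 181
```

With a PSF cropped to 102 pixels instead, the stride would jump to 256 - 122 = 134, cutting the number of blocks per row by more than a factor of ten.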

pr4deepr commented 2 years ago

Thanks for this. Do you recommend padding the data and PSF shapes a certain way? i.e. should the data shape be divisible by the PSF shape?

perdigao1 commented 2 years ago

Trying on a laptop, I did encounter a few more issues.

To summarise, the major reason for failing to calculate the RL is the size of the PSF. Internally, to do convolution by FFT, the PSF has to be resized to the same size as the data being convolved. Because of the size of your original data, this runs out of memory almost immediately. This leads to the block deconvolution, which divides the data into blocks and runs the convolutions at a smaller size (with a smaller resized PSF). If that fails, it reduces the block size again, and again, until it is comfortable. On my laptop it only reached a comfortable block size at [256, 256, 256]; more powerful PCs should handle larger blocks. At 256, the block is almost the size of the PSF, which leads to the problem of the small step sizes.
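For reference, the FFT-based RL iteration being described looks roughly like this generic sketch (using SciPy's `fftconvolve`, not RedLionfish's actual OpenCL implementation):

```python
import numpy as np
from scipy.signal import fftconvolve

def rl_deconv(data, psf, iterations=10):
    """Generic Richardson-Lucy deconvolution via FFT convolution.

    Each iteration convolves with the PSF (and its mirror), which is
    why the PSF is effectively handled at the size of the data, and
    why memory blows up for large volumes.
    """
    psf = psf / psf.sum()               # RL expects a PSF normalised to sum 1
    psf_mirror = psf[::-1, ::-1, ::-1]  # adjoint of the blur operator
    estimate = np.full_like(data, data.mean(), dtype=np.float64)
    for _ in range(iterations):
        blurred = fftconvolve(estimate, psf, mode="same")
        ratio = data / np.maximum(blurred, 1e-12)  # avoid division by zero
        estimate *= fftconvolve(ratio, psf_mirror, mode="same")
    return estimate
```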

For this particular software, it works best if your PSF is as small as possible, so that, if it needs to calculate using block convolution, the step size will be larger: stepsize = blocksize - 1.2*psfsize.

I changed the code so that if the stepsize is too small, it reports failure and reverts to the CPU calculation.

If access to powerful computers is limited, you may want to try Google Colab or similar with GPU enabled; they may have more RAM and GPU memory available for testing.

pr4deepr commented 2 years ago

Thanks. What about image size? Does having the dimensions as an even number, or even as a multiple of the block sizes (256, 512, etc.), help?

perdigao1 commented 2 years ago

In principle yes, because it uses FFT internally, and the power-of-two algorithm (Cooley-Tukey) is faster than the non-power-of-two one (Bluestein's, which is what the Reikna package uses, roughly halving performance): http://reikna.publicfields.net/en/latest/api/computations.html
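If you want to experiment with this, here is a minimal sketch (my own helpers, not part of RedLionfish) that zero-pads each axis up to the next power of two; the padding can be cropped away after deconvolution:

```python
import numpy as np

def next_pow2(n):
    """Smallest power of two >= n."""
    return 1 << (n - 1).bit_length()

def pad_to_pow2(volume):
    """Zero-pad a 3D array so every axis length is a power of two,
    letting FFT libraries take the faster Cooley-Tukey path."""
    target = [next_pow2(s) for s in volume.shape]
    pad = [(0, t - s) for s, t in zip(volume.shape, target)]
    return np.pad(volume, pad)

vol = np.zeros((834, 300, 2048), dtype=np.float32)
print(pad_to_pow2(vol).shape)  # → (1024, 512, 2048)
```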

pr4deepr commented 2 years ago

Thanks for this @perdigao1. Another question: I tried padding the images to multiples of 256 and 16, and also tried other even numbers. The deconvolved image has the same shape, but each slice contains only a small portion of the original image. I've updated the notebook to show this with napari screenshots at the end; it's in the section where I use the larger datasets. https://github.com/BioimageAnalysisCoreWEHI/napari_lattice/blob/master/notebooks/deconvolution_RedFishLion.ipynb

This seems to change with the padding I apply to the image. I'm not sure what's happening here.

perdigao1 commented 2 years ago

Hi, I think you used version 0.6, which had the bug that I spotted after trying your data. I resolved it in version 0.7, so maybe try the new version if you haven't. There will be a few more updates soon.

I will try it and see if I can figure out the problem.

pr4deepr commented 2 years ago

Thanks @perdigao1. It's working now, and I've been experimenting with different tiling strategies. Tiling the PSF to a multiple of 256 seems to help. However, regardless of the tiling I use, I am getting tiling artefacts. I've updated the notebook and tested this on a better GPU as well. https://github.com/BioimageAnalysisCoreWEHI/napari_lattice/blob/master/notebooks/deconvolution_RedLionFish.ipynb Example screenshot of deconvolved data: [screenshot]

This is the original raw data: [screenshot]

There are artefacts at the top of the image and between tiles in the overlapping areas.

perdigao1 commented 2 years ago

Hi, I looked into your solution and tried a few other things. I realised that the way RedLionfish was doing the RL deconvolution was not right. It does indeed use block deconvolution, because the data is too large for OpenCL to handle. I made many revisions.

For the data that you show here, it was not doing the padding correctly; in fact it was not using any padding at all.

I created a new version (coming soon) v0.8.

Testing your data on my laptop fails on both GPU and CPU, so I tried a bigger machine and managed to get it working with the upcoming 0.8 version. 10 iterations don't show significant border artifacts.

Some further advice.

There is no need to use padding, but I recommend reducing your PSF size as much as possible: there are too many zeros, and make sure the PSF is centered. The block algorithm works best with small PSFs. The PSF you provided shows some kind of diagonal across the center; I don't know what kind of optics does that.

If you want to crop your PSF while keeping the data centered, you can use the function below, which will pad or crop as necessary.

import RedLionfishDeconv.helperfunctions as rlh
rlh.change3DSizeTo(data, newshape)

deconvolution_RedFishLion_1 LMAP.ipynb.zip

pr4deepr commented 2 years ago

Thanks @perdigao1 for the detailed response and fixes.

In regard to the optics and psf, this is from a lattice lightsheet microscope where the images are acquired at an angle of 30 degrees. As the raw data is skewed, the PSF will also be skewed by the same angle. We perform deconvolution on the raw data before data reconstruction.

Thanks for the helper code for cropping data.
I am not sure how much I can crop this PSF by. As it's a float32 image, there are really small values. For example, if I perform a maximum projection of my PSF and then plot it:

PSF maximum projection: [screenshot]

I was testing thresholds to figure out how much I can crop it by. The plan is to get the maximum bounds of the mask image along each axis and use those for cropping the PSF. However:

PSF binarised (threshold > 0): [screenshot]

If I use 1e-5: PSF binarised (threshold > 0.00001): [screenshot]

Is there a minimum value or threshold for pixel values that I can safely ignore in the PSF, which won't make a noticeable difference when deconvolving the data? Hope this makes sense.

perdigao1 commented 2 years ago

Ah, you are doing acquisition at an angle. I suspected that.

As for the threshold: I checked the psf_1 data again. max = 0.4180338, min = 0.0, sum = 126.2625.

FYI, in RL deconvolution the PSF must always be normalised to its sum. This is done automatically in RedLionfish anyway.

My suggestion is that you threshold your image to psf/psf.max() > 1e-4.

To answer your question of what threshold is safe for deciding how much to crop: I don't know, as I have never done an in-depth analysis. The results would certainly differ to some precision, but it is difficult to decide what is acceptable.

In previous attempts I simply cropped the PSF to half the dimension in each direction and it still worked ok. Another option is to preserve the shortest axis and crop to half along the other two axes. Any size reduction helps. Many of my colleagues halve the resolution of the data before further processing and resize back at the end. If you wish to do this, apply exactly the same resizes to the data and the PSF.
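A possible sketch of the threshold-and-crop approach discussed here (an illustrative helper, not part of RedLionfish; the 1e-4 threshold relative to the maximum follows the suggestion above):

```python
import numpy as np

def crop_psf(psf, threshold=1e-4):
    """Crop a PSF to the bounding box of voxels above
    threshold * psf.max(), then renormalise to sum 1
    (RL expects a PSF normalised to its sum)."""
    mask = psf > threshold * psf.max()
    slices = []
    for axis in range(psf.ndim):
        # collapse all other axes to find this axis's extent
        other = tuple(a for a in range(psf.ndim) if a != axis)
        hits = np.where(mask.any(axis=other))[0]
        slices.append(slice(hits[0], hits[-1] + 1))
    cropped = psf[tuple(slices)]
    return cropped / cropped.sum()
```

Note that this crops asymmetrically to the bounding box, so you may still want to re-center the result (e.g. with `change3DSizeTo`) afterwards.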

I can see that your PSF is smaller along the z-axis: psf.shape = (93, 205, 205). Maybe you could crop it to (93, 102, 102). Strangely, your data is shorter in the y-axis: data.shape = (834, 300, 2048). Are you sure the y is not z? This PSF is simulated, not experimental from beads data, correct? I hope this helps.

pr4deepr commented 2 years ago

Thanks a lot.

Are you sure the y is not z?

The z extent of the raw data depends on the object being imaged, but the y remains fairly consistent. Here, the data is acquired at an angle of 30 degrees and the raw data is reconstructed post-acquisition, which involves a shearing operation, scaling, and a rotation, so the z and y dimension numbers get "swapped" (loosely using this term). Exact function used: https://github.com/clEsperanto/pyclesperanto_prototype/blob/master/pyclesperanto_prototype/_tier8/_deskew_y.py
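For what it's worth, the deskew can be thought of as a composition of shear, scale, and rotation affine matrices acting on (z, y, x) coordinates. The matrices below are a conceptual sketch only; the exact conventions and factors differ from the pyclesperanto implementation linked above:

```python
import numpy as np

angle = np.deg2rad(30)  # acquisition angle (illustrative value)

# Shear: slides each z-plane along y, undoing the skew from the scan
shear = np.array([[1.0, 0.0, 0.0],
                  [np.cos(angle), 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

# Scale: accounts for the anisotropic z step relative to the xy pixel size
scale = np.diag([np.sin(angle), 1.0, 1.0])

# Rotation in the z-y plane: makes the coverslip the new xy plane
rot = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                [np.sin(angle),  np.cos(angle), 0.0],
                [0.0, 0.0, 1.0]])

# Composed transform; applying it mixes the z and y extents, which is
# why the dimension numbers appear "swapped" after reconstruction
deskew = rot @ scale @ shear
```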

This PSF is simulated, not experimental from beads data, correct?

Yes, the PSF is experimental.

I'm linking a notebook from @VolkerH, who made a very similar package to ours. He created a function for this: https://github.com/VolkerH/Lattice_Lightsheet_Deskew_Deconv/blob/master/examples/find_PSF_support.ipynb

I think it's worth incorporating this into the change3DSizeTo helper function. What do you think?

I may try a few different thresholds, but from what you've said, cropping the PSF is definitely the way to go. I'll post what I've used and include the updated notebook here.