Open Eddymorphling opened 1 month ago
i don't have one directly using pycudadecon, but in general, you'll want to use dask map_blocks or map_overlap. As @volkerh mentioned in https://github.com/tlambert03/pycudadecon/issues/13#issuecomment-653710403, he wrote a short tutorial on breaking up large deconvolution problems into tiles. (there he used flowdec, but the split/combine principle is the same)
Great, you are the best! Thanks.
Hi again @tlambert03 , The CLI help command does not seems to work. Is it cudaDeconv --help
?
no v, just cudaDecon
... see the C++ library for the cli: https://github.com/scopetools/cudaDecon
oh! my docs are wrong aren't they! Sorry about that... will update
Yeah, I was following the docs. No worries. Thank you.
Sorry to bother you again @tlambert03. Testing a set of images using the CLI command line and it seems to be stuck at the step without any progress. Output included below. Any suggestions why? Pytorch sees my GPU.
# of tiles: 3. # of overlaps: 2. Tile size: 1024 Y pix. Overlap: 213 Y pix.
Tile: 1 out of 3.
raw_image size : 2478 x 1024 x 1
Here is my CLI
radialft /home/psf_4x.tif /home/psf_4x_OTF.tif --fixorigin 10 --wavelength 525 --na 0.35 --nimm 1.56 --xyres 3 --zres 3 --nocleanup
cudaDecon /data/z-slices fused --otf-file /home/psf_4x_OTF.tif --wavelength 525 --tile 1024
not immediately sure, but let's first determine whether it has to do with your data specifically. Try the test data available here: https://github.com/tlambert03/pycudadecon/tree/main/tests/test_data
use something like
radialft.exe .\test_data\psf.tif .\test_data\otf.tif --nocleanup --fixorigin 10
cudaDecon.exe -z 0.3 -i 10 -D 31.5 .\test_data\im_raw .\test_data\otf.tif
Yes that worked with your test dataset. Not sure what's not compatible with my input datasets. Here is some background.
I kept it simple this time and here is what I ran. But gets stuck in the step I mentioned above.
radialft /home/scratch/pycudadecon/psf_4x.tif /home/scratch/pycudadecon/psf_4x_OTF.tif --fixorigin 10 --wavelength 525 --na 0.35 --nimm 1.56 --xyres 3 --zres 3 --nocleanup
cudaDecon -i 10 /home/scratch/pycudadecon/z-slices fused --otf-file /home/scratch/pycudadecon/psf_4x_OTF.tif
Ah I figured it out. When the documentation said the input folder can be tiff files, what I assumed is they can be 2D tiff files generated from a 3D stack, not multiple 3D stacks. My bad. Running it on a 3D stack works now. Thanks Talley.
great!
Hi again @tlambert03 I have been trying to setup a workflow that uses dask
and pycudadecon
for just 3D gaussian light sheet datasets but have been facing trouble. I adapted this example which I believe was written by you? Not sure.. I removed the deskew function and just told it to crop and deconvolve as they are 3D (xyz) gaussian LS datasets.
I adapted what I found and tried to run dask+pycudadecon+GPU on our cluster. Here is what I came up with
import napari
import pycudadecon
from functools import partial
from skimage import io
from dask_image.imread import imread
import dask.array as da
import tifffile
from dask.distributed import Client
def main():
client = Client() # This will use all available cores by default
print(client)
# Load stacks with dask_image, and PSF with skimage
# Each file in the folder is a single Z slice from the 3D volume
stack = imread("/data/z-slices/*.tif")
psf = io.imread("/home/psf_4x.tif")
# Print the chunk sizes of the stack
print("Chunk sizes:", stack.chunks)
# Prepare some functions that accept a numpy array and return a processed array
def crop(array):
# Simple cropping function
return array[:, 2:, 10:-20, :500]
# https://docs.python.org/3.8/library/functools.html#functools.partial
deconv = partial(pycudadecon.decon, psf=psf, background=10)
# Map and chain those functions across all dask blocks
deconvolved = stack.map_blocks(deconv, dtype="float16")
cropped = deconvolved.map_blocks(crop, dtype="float16")
output_path = "/home/scratch/decon.tif"
tifffile.imwrite(output_path, deconvolved.compute())
# Put the resulting dask array into napari.
# (Don't forget the contrast limits and multiscale==False !)
v = napari.view_image(
cropped,
contrast_limits=[90, 1500],
multiscale=False,
ndisplay=3,
scale=(3, 1, 1),
)
napari.run()
if __name__ == '__main__':
main()
My input folder is tiff files series where each file is a single slice, while my PSF is a 3D volume. When I run this, it does not save any deconvolved image nor open the image in napari. Below is what I see in the terminal, Would you happen to know what's going on? Thank you.
<Client: 'tcp://127.0.0.1:45707' processes=8 threads=64, memory=503.62 GiB>
Chunk sizes: ((1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), (5942,), (4339,))
New chunk sizes: ((5, 5, 3), (1000, 1000, 1000, 1000, 1000, 942), (1000, 1000, 1000, 1000, 339))
callednx=74, ny=74, nz=74
Center of mass is (37.500, 37.500, 38.100)
Background is 0.000
In radialft()
looking, quick question: where is this getting printed?
New chunk sizes: ((5, 5, 3), (1000, 1000, 1000, 1000, 1000, 942), (1000, 1000, 1000, 1000, 339))
as for the failure, it looks like something is going wrong at the moment of OTF generation (in radialft)...
I would simplify this for now, leaving out napari, and simply calling cropped.compute()
, also, try putting prints and/or breakpoints in the various functions to see exactly what dimensions/data are getting passed.
Try also leaving out dask and all that complexity too. If you're trying to build up to something that does block-wise deconvolution and stitches them all back together, then first just try to manually deconvolve one of those blocks (i.e. just literally crop your data down to a smaller subvolume that would match the chunk size you're trying to create). Then just deconvolve that single chunk and see how it goes.
If you're failing on the radialft part, then it suggests there might be a problem with your psf file... so I would suggest manually going through what the decon
function does, and see where it fails. for example maybe first try to make a TemporaryOTF
from your psf like this line in the decon function does. In general, break it down to much smaller steps, before building back up to the full map-blocks-chunked decon in napari
Thanks! This was an option that I included to rechunk the dataset. It get printed before the decon functions starts. Tried it without it with this fails.
have you confirmed that make_otf
works on your psf_4x.tif
file? https://www.talleylambert.com/pycudadecon/otf.html#pycudadecon.make_otf
Thanks! This was an option that I included to rechunk the dataset. It get printed before the decon functions starts. Tried it without it with this fails.
yeah, just didn't see the actual rechunking... which I do expect to be an important step. I would probably try to leave the full z in there (i.e. don't chunk down to (5, 5, 3)
on the z axis, but rather (13,)
... that's a small z as it is, and the deconvolution probably won't work well with something like only 3-5 planes. try (13, 512, 512)
... and then start increasing xy later.
but first do it manually, and also, i wouldn't expect that bit to have anything to do with the failed otf generation, so test the the make_otf step independently first
have you confirmed that
make_otf
works on yourpsf_4x.tif
file?
Yes, this works when running independently. But not when I use my code.
yeah, just didn't see the actual rechunking... which I do expect to be an important step. I would probably try to leave the full z in there (i.e. don't chunk down to
(5, 5, 3)
on the z axis, but rather(13,)
... that's a small z as it is, and the deconvolution probably won't work well with something like only 3-5 planes. try(13, 512, 512)
... and then start increasing xy later.but first do it manually, and also, i wouldn't expect that bit to have anything to do with the failed otf generation, so test the the make_otf step independently first
I initially used 13 slices just to test. But now I also tried around 150 z-slices, still fails. My PSF is theoretical for testing purposes, its 64x64x64 that I generated with PSF generator.
apologies that this isn't easier to debug. I definitely do see this is something that pycudadecon could make easier for you. There are indeed some sneaky "gotchas" in here, with failures caused by subtle things that I don't always entirely understand. It can unfortunately sometimes take a little trial and error to get it all working on a new setup, and it's annoying when it fails without any useful info.
I know it's a very large file, but would you be willing to send me a dropbox link to the data and your generated PSF?
Sure, thing. Thanks for helping me out. The 3D data can be big depending on the number of z-slices. How many would you like?
hmm, how many are there total? what is the full size of the data?
Its around 1500 slices, which comes to around 200gb. Shall I send you 50 slices?
maybe 100? if you'd prefer not to link here try talley.lambert@gmail.com
Got it. Its on the way, thanks again!
thank you! may take me a few days, but curious to have a look
Hi Talley, Wondering if you got the chance to troubleshoot?
i think your issues with hanging at In radialft()
are related to an idiosyncracy of dask map blocks. Your function needs to be able to handle 0-dimensional arrays (see https://docs.dask.org/en/stable/generated/dask.array.map_blocks.html for details).
You can solve this either by using the meta
parameter that they describe there, or by making a function that can properly handle zero dimensional arrays like this:
from pycudadecon import decon
import tifffile as tf
from dask_image.imread import imread
from dask.distributed import Client
import numpy as np
def main():
# important to limit to a single thread & worker, otherwise the GPU will be shared
_ = Client(threads_per_worker=1, n_workers=1)
root = "C:/Users/talley/Downloads/Tiffs-20240823T145014Z-001/Tiffs/"
psf = root + "psf_4x.tif"
stack = imread(root + "tiffs/*.tif")
stack = stack.rechunk((101, 512, 512))
def decon_chunk(chunk, block_id):
# handle empty array
if chunk.size == 0:
return np.array([]).astype("float32")
print("decon_chunk", chunk.shape, "block_id", block_id)
return decon(chunk, psf=psf, background=100)
deconvolved = stack.map_blocks(decon_chunk, dtype="float32")
# small portion of the image to demonstrate
result = deconvolved[:, 2048:3072, 2048:3072].compute()
tf.imwrite(root + "decon.tif", result)
if __name__ == "__main__":
main()
the other issue you will run into is that you'll need to restrict dask to a single thread & worker, otherwise your GPU will be overwhelmed.
Hi @tlambert03 , this works! Thank you. What would I change here if I want to perform dask decon with chunks and overlap instead of running it on a cropped region? The goal is to do this for the large 3D dataset. This is what I have so far. Most of it is adapted from Robert Haase's Clij2 decon pipeline using dask arrays.
from pycudadecon import decon
import tifffile as tf
from dask_image.imread import imread
from dask.distributed import Client
import dask.array as da
import numpy as np
from skimage import img_as_uint
import os
def main():
# Important to limit to a single thread & worker, otherwise the GPU will be shared
_ = Client(threads_per_worker=1, n_workers=1)
root = "/data/flowdec/z-slices/"
psf_path = "/data/flowdec/psf_4x.tif"
stack = imread(root + "*.tif")
num_x_chunks = 5
num_y_chunks = 5
num_z_chunks = 20
z_chunk_size = int(stack.shape[0] / num_z_chunks)
y_chunk_size = int(stack.shape[1] / num_y_chunks)
x_chunk_size = int(stack.shape[2] / num_x_chunks)
print('chunks', z_chunk_size, y_chunk_size, x_chunk_size)
# Create Dask image with specified chunk sizes
stack = stack.rechunk((z_chunk_size, y_chunk_size, x_chunk_size))
# Define the overlap size
overlap = 50
def decon_chunk(chunk, block_info=None):
# Handle empty array
if chunk.size == 0:
return np.array([]).astype("float32")
print("decon_chunk", chunk.shape, "block_info", block_info)
return decon(chunk, psf=psf_path, background=100, n_iters=50)
# Apply deconvolution with overlap
deconvolved = stack.map_overlap(decon_chunk, depth={0: 0, 1: overlap, 2: overlap}, boundary='reflect', dtype="float32")
# Compute the entire deconvolved stack (this is when data is read into memory)
result = deconvolved.compute()
# Normalize the result to the range [0, 1]
result_normalized = (result - np.min(result)) / (np.max(result) - np.min(result))
result_16bit = img_as_uint(result_normalized)
tf.imwrite(root + "decon.tif", result_16bit, compression='zlib')
# Convert the deconvolved stack to individual TIFF files
# Extract the basename (file name) and first 10 characters of it
basename = os.path.basename(root + "decon.tif")
basename_first_11_chars = basename[:11]
tiffs_folder = os.path.join(root, 'tiffs_decon')
if not os.path.exists(tiffs_folder):
os.makedirs(tiffs_folder)
img_stack = tf.imread(root + "decon.tif")
# Loop through each slice, create and save new tiffs
for i in range(img_stack.shape[0]):
new_image_path = os.path.join(tiffs_folder, f"{basename_first_11_chars}_{str(i).zfill(4)}.tif")
tf.imwrite(new_image_path, img_stack[i], compression='LZW')
print(f"Deconvolution and TIFF conversion completed. Individual TIFF files saved in {tiffs_folder}")
if __name__ == "__main__":
main()
Yeah, using map_overlap is the key. That's what you'll need to do this properly without having bad edge artifacts everywhere. Do you have a specific question about it?
I do see a few artefacts in the final dataset. What would you suggest to reduce such artefacts?
The main parameters you will want to play with will be the chunk size and the overlap size (the "depth" parameter to map_overlap). The larger the depth, the less edge artifact you should see, but the more "wasted" work you're doing. Also read up on the other parameters and just try things, it will take some experimenting.
https://docs.dask.org/en/stable/generated/dask.array.overlap.map_overlap.html
Hi Talley, I am trying to use
pycudadecon
on large 3D datasets generated with our gaussian ligthsheet. As you may be aware, the datasets are large and loading the entire volume during decon leads to memory issues.I have acquired PSFs but wanted to reach out about your suggestions for doing this using dask-arrays for efficient GPU/CPU memory usage when running decon jobs using GPU on my cluster. Any notebooks/scripts that you may have that uses pycudadecon+dask to help me get started? Much appreciated. Thank you.