QI2lab / merfish3d-analysis

3D MERFISH data processing
BSD 3-Clause "New" or "Revised" License

Error with "setup_and_run_localization_automated" #1

Closed · mabbasi6 closed this issue 7 months ago

mabbasi6 commented 7 months ago

Hi, I tried using the "setup_and_run_localization_automated.py" file for "opm3/20240202_ECL_IMG_GEL2/processed_v2" data, and I have been getting this error:

PSF generated
Changing image scale from [1. 1. 1.] to [0.31  0.087 0.087]
model instantiated
starting deconvolution
Deconvolve:   0%|                                         | 0/1 [00:00<?, ?it/s]Runtime error: ret returned -4 at /home/bnorthan/code/i2k/clij/clij2-fft/native/clij2fft/clij2fft.cpp:878
platform 0 NVIDIA CUDA
device name 0 NVIDIA A100-SXM4-80GB MIG 1g.10gb
finished deconvolution                                                          
starting DoG filter
DoG filter:  39%|████████████                   | 12/31 [00:01<00:01, 10.70it/s]
Traceback (most recent call last):
  File "/data/bioprotean/repos/momo/napari-spot-detection/src/napari_spot_detection/_widget.py", line 951, in _compute_dog
    self._spots3d.run_DoG_filter()
  File "/data/bioprotean/repos/momo/spots3d/spots3d/SPOTS3D.py", line 523, in run_DoG_filter
    self._dog_data = _imageprocessing.DoG_filter(self._decon_data,
  File "/data/bioprotean/repos/momo/spots3d/spots3d/_imageprocessing.py", line 384, in DoG_filter
    filtered_image = dask_dog_filter.compute(scheduler='single-threaded')
  File "/home/mabbasi6/.conda/envs/spots3d/lib/python3.10/site-packages/dask/base.py", line 342, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/home/mabbasi6/.conda/envs/spots3d/lib/python3.10/site-packages/dask/base.py", line 628, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/data/bioprotean/repos/momo/spots3d/spots3d/_imageprocessing.py", line 251, in perform_DoG_cartesian
    image_hp = localize.filter_convolve(image_cp, kernel_small.astype(cp.float32))
  File "/home/mabbasi6/.conda/envs/spots3d/lib/python3.10/site-packages/localize_psf/localize.py", line 252, in filter_convolve
    imgs_filtered = convolve(imgs, kernel, mode="same")
  File "/home/mabbasi6/.conda/envs/spots3d/lib/python3.10/site-packages/cupyx/scipy/signal/_signaltools.py", line 178, in fftconvolve
    out = _st_core._freq_domain_conv(in1, in2, axes, shape, calc_fast_len=True)
  File "/home/mabbasi6/.conda/envs/spots3d/lib/python3.10/site-packages/cupyx/scipy/signal/_signaltools_core.py", line 165, in _freq_domain_conv
    sp1 = fftn(in1, fshape, axes=axes)
  File "/home/mabbasi6/.conda/envs/spots3d/lib/python3.10/site-packages/cupyx/scipy/fft/_fft.py", line 459, in rfftn
    return func(x, s, axes, norm, cufft.CUFFT_FORWARD, 'R2C',
  File "/home/mabbasi6/.conda/envs/spots3d/lib/python3.10/site-packages/cupy/fft/_fft.py", line 602, in _fftn
    a = _cook_shape(a, s, axes, value_type, order=order)
  File "/home/mabbasi6/.conda/envs/spots3d/lib/python3.10/site-packages/cupy/fft/_fft.py", line 58, in _cook_shape
    z = cupy.zeros(shape, a.dtype.char, order=order)
  File "/home/mabbasi6/.conda/envs/spots3d/lib/python3.10/site-packages/cupy/_creation/basic.py", line 211, in zeros
    a = cupy.ndarray(shape, dtype, order=order)
  File "cupy/_core/core.pyx", line 132, in cupy._core.core.ndarray.__new__
  File "cupy/_core/core.pyx", line 220, in cupy._core.core._ndarray_base._init
  File "cupy/cuda/memory.pyx", line 740, in cupy.cuda.memory.alloc
  File "cupy/cuda/memory.pyx", line 1426, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1447, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1118, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1139, in cupy.cuda.memory.SingleDeviceMemoryPool._malloc
  File "cupy/cuda/memory.pyx", line 1384, in cupy.cuda.memory.SingleDeviceMemoryPool._try_malloc
  File "cupy/cuda/memory.pyx", line 1387, in cupy.cuda.memory.SingleDeviceMemoryPool._try_malloc
cupy.cuda.memory.OutOfMemoryError: Out of memory allocating 998,092,800 bytes (allocated so far: 918,245,888 bytes).

The runtime error appears while the deconvolution is running; this is the deconvolution output:

[Screenshot: deconvolution output, 2024-02-08 3:25 PM]

The DoG filter also crashes no matter how much RAM I request (I went from 16 GB to 128 GB); it always reports Out of memory allocating 998,092,800 bytes.

dpshepherd commented 7 months ago

The DoG filter is crashing because the deconvolution did not work. For some reason, the GPU you have does not have enough memory allocated. You can tell the deconvolution crashed because the output is not correct and you got an error: Runtime error: ret returned -4 at /home/bnorthan/code/i2k/clij/clij2-fft/native/clij2fft/clij2fft.cpp:878
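As a quick sanity check (this snippet is illustrative, not part of spots3d), you can ask CuPy how much memory the device actually exposes. On a MIG slice like the 1g.10gb instance in your log, the total is the size of the slice, not the full A100:

```python
# Illustrative check of GPU memory visible to CuPy (not part of spots3d).
import cupy as cp

free_bytes, total_bytes = cp.cuda.Device(0).mem_info  # (free, total) in bytes
pool = cp.get_default_memory_pool()

print(f"GPU total      : {total_bytes / 1e9:.1f} GB")
print(f"GPU free       : {free_bytes / 1e9:.1f} GB")
print(f"CuPy pool used : {pool.used_bytes() / 1e9:.1f} GB")
```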

I pushed a much more aggressive memory management strategy. It may take longer to run but should work. You'll need to pull the new changes with git pull and then reinstall using pip install -e .

The memory management currently assumes that 12 GB of GPU memory is available for use. If that is more than you can assign on the cluster, I can make the expected available GPU memory a parameter of spots3d so you can experiment until you find a setting that works.
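For reference, here is a rough sketch of the kind of budget-driven chunking I mean; gpu_memory_gb and process_in_z_chunks are hypothetical names for illustration, not the actual spots3d API:

```python
# Rough sketch of budget-driven z-chunking (illustrative only; gpu_memory_gb
# and process_in_z_chunks are hypothetical names, not the spots3d API).
import numpy as np
import cupy as cp

def process_in_z_chunks(image: np.ndarray,
                        gpu_memory_gb: float = 12.0,
                        overhead_factor: float = 8.0) -> np.ndarray:
    """Split a (z, y, x) volume into z-chunks small enough that each chunk,
    plus FFT scratch space (approximated by overhead_factor), fits the budget."""
    budget_bytes = gpu_memory_gb * 1e9
    bytes_per_plane = image.shape[1] * image.shape[2] * 4  # float32 planes
    planes_per_chunk = max(1, int(budget_bytes // (bytes_per_plane * overhead_factor)))

    out = np.empty(image.shape, dtype=np.float32)
    for z0 in range(0, image.shape[0], planes_per_chunk):
        z1 = min(z0 + planes_per_chunk, image.shape[0])
        chunk_gpu = cp.asarray(image[z0:z1], dtype=cp.float32)
        # ... run the GPU filter/deconvolution on chunk_gpu here ...
        # (a real implementation also needs z-overlap so convolution edges match)
        out[z0:z1] = cp.asnumpy(chunk_gpu)
        del chunk_gpu
        cp.get_default_memory_pool().free_all_blocks()  # release GPU memory between chunks
    return out
```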

dpshepherd commented 7 months ago

You'll also need to update both spots3d and napari-spot-detection so the code uses the more aggressive memory management. If you have local copies, you can follow the same strategy as above.

If you pip installed from the internet, you'll need to run pip install -U spots3d@git+https://git@github.com/qi2lab/spots3d@main#egg=spots3d and the same for napari-spot-detection.

mabbasi6 commented 7 months ago

This issue was resolved as well once I re-downloaded the repos. Thank you!