tlambert03 / pycudadecon

Python wrapper for cudaDecon - GPU accelerated 3D deconvolution for microscopy
http://www.talleylambert.com/pycudadecon/
MIT License

Illegal memory access / crash when dup_rev_z=True #16

Open iguptasn opened 3 years ago

iguptasn commented 3 years ago

Hi. When we use the axial ringing correction feature (dup_rev_z=True), our Python interpreter dies after a CUDA illegal memory access error. Do you know what might be going on here? We can also supply example data if helpful. Thanks!

Minimal example:

import pycudadecon
import numpy as np
import tifffile

test_data = tifffile.imread("./test_tiffstack.tif")

# Build the OTF from the measured PSF
otf_path = pycudadecon.make_otf(
    psf="./test_psf.tif",
    dzpsf=0.1,
    dxpsf=0.129,
    wavelength=510,
    na=1.0,
    nimm=1.33,
)

# The context must be sized from the stack being deconvolved (test_data)
with pycudadecon.RLContext(test_data.shape, otf_path) as ctx:
    result = pycudadecon.rl_decon(
        test_data,
        output_shape=ctx.out_shape,
        dxdata=0.129,
        dzdata=1,
        background=0,
        n_iters=25,
        nz_blend=1,
        dup_rev_z=True,  # the crash only occurs with this enabled
    )
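
For readers unfamiliar with the option: dup_rev_z is documented as duplicating the reversed stack before deconvolution to reduce axial ringing. The NumPy sketch below only illustrates that idea. It assumes the reversed copy is appended along Z, which would double the number of planes; the helper name `duplicate_reversed_z` is hypothetical, and this is not the library's actual GPU code path.

```python
import numpy as np


def duplicate_reversed_z(stack: np.ndarray) -> np.ndarray:
    """Append a Z-flipped copy of a (nz, ny, nx) stack along the Z axis.

    Hypothetical helper for illustration only; the ordering of the two
    halves is an assumption, not taken from the cudaDecon source.
    """
    if stack.ndim != 3:
        raise ValueError("expected a 3D (nz, ny, nx) array")
    return np.concatenate([stack, stack[::-1]], axis=0)


# Tiny synthetic stack with 2 planes of 3 x 4 pixels
stack = np.arange(2 * 3 * 4, dtype=np.float32).reshape(2, 3, 4)
padded = duplicate_reversed_z(stack)
print(padded.shape)
```

If the option does work this way, every buffer sized from the input shape would need to account for the doubled Z extent, which may be a useful thing to check when chasing an out-of-bounds access.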

The error message is:

output nz=50
X_k allocated.               200MB    9899MB free Pinning Host RAM.  Copy raw.data to X_k HostToDevice.  Done.  
Done
rawGPUbuf allocated.         400MB    9299MB free
X_kminus1 allocated.         400MB    8899MB free
Y_k allocated.               400MB    8499MB free
CC allocated.                400MB    8099MB free
G_kminus1 allocated.         400MB    7699MB free
G_kminus2 allocated.         400MB    7299MB free
fftGPUbuf allocated.         400MB    6897MB free
Iteration 0. 
Iteration 1. 
Iteration 2. CPUBuffer cudaMemcpy failed. Error code: 77
an illegal memory access was encountered
terminate called after throwing an instance of 'std::runtime_error'
  what():  cudaMemcpy failed.
Aborted
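
Because the log ends in an uncaught C++ exception followed by an abort, the failure cannot be caught with a Python try/except and takes the whole interpreter down. One possible way to keep a notebook or parent session alive while reproducing it is to run the failing call in a child process. This is a minimal sketch using only the standard library: `run_isolated` is a hypothetical helper, and `os.abort()` stands in for the crashing deconvolution call.

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float = 600.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a child interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


# Stand-in for the crashing call: os.abort() kills the child abruptly,
# much like an uncaught C++ exception does, without touching the parent.
result = run_isolated("import os; os.abort()")
crashed = result.returncode != 0
print("child crashed:", crashed)
```

In practice the string passed to `run_isolated` would be the minimal example above, and `result.stdout` / `result.stderr` would hold whatever the child printed before it died.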

tlambert03 commented 3 years ago

Sorry for the delayed response. Unfortunately, it has become a lot harder for me to test this library due to hardware changes on my dev setup. @linshaova may be able to provide some guidance, but it will take me a little time to get to this.