mehta-lab / waveorder

Wave optical models and inverse algorithms for label-agnostic imaging of density & orientation.
BSD 3-Clause "New" or "Revised" License

Reconstruction with GPU seems to be limited by GPU RAM #70

Closed: mattersoflight closed this issue 1 year ago

mattersoflight commented 2 years ago

I tried to deconvolve a single position and the 40 GB of GPU RAM wasn't sufficient. Perhaps there is a bug in the version of waveOrder used by recOrder.

Path to the config file: comp_micro/projects/HEK/2022_01_20_orgs_BF_GFP_63x_04NA/phase_fluor_decon_sequential.yml

To reproduce, do the following on our HPC nodes:

module load anaconda
module load comp_micro
conda activate recorder
recOrder.reconstruct --config phase_fluor_decon_sequential.yml

I see the following error log:

(recorder) [shalin.mehta@gpu-a-002 2022_01_20_orgs_BF_GFP_63x_04NA]$ recOrder.reconstruct --config phase_fluor_decon_sequential.yml 
Reading Data...
Finished Reading Data (0.0 min)
Creating new zarr store at /hpc/projects/comp_micro/projects/HEK/2022_01_20_orgs_BF_GFP_63x_04NA/phase_fluor_decon/phase_fluor_optimization_pos47.zarr
Initializing Reconstructor...
Finished Initializing Reconstructor (5.62 min)
Initializing Reconstructor...
Finished Initializing Reconstructor (3.18 min)
Beginning Reconstruction...
Traceback (most recent call last):
  File "/home/shalin.mehta/.conda/envs/recorder/bin/recOrder.reconstruct", line 33, in <module>
    sys.exit(load_entry_point('recOrder', 'console_scripts', 'recOrder.reconstruct')())
  File "/home/shalin.mehta/code/recOrder/recOrder/scripts/run_pipeline.py", line 48, in main
    manager.run()
  File "/home/shalin.mehta/code/recOrder/recOrder/pipelines/pipeline_manager.py", line 337, in run
    deconvolve2D, deconvolve3D = self.pipeline.deconvolve_volume(stokes)
  File "/home/shalin.mehta/code/recOrder/recOrder/pipelines/phase_from_bf_pipeline.py", line 136, in deconvolve_volume
    lambda_re=self.config.TV_reg_ph_3D, itr=self.config.itr_3D)
  File "/home/shalin.mehta/code/recOrder/recOrder/compute/qlipp_compute.py", line 308, in reconstruct_phase3D
    itr=itr, verbose=False)
  File "/home/shalin.mehta/.conda/envs/recorder/lib/python3.7/site-packages/waveorder/waveorder_reconstructor.py", line 2370, in Phase_recon_3D
    f_real = Single_variable_Tikhonov_deconv_3D(S0_stack, H_eff, reg_re, use_gpu=self.use_gpu, gpu_id=self.gpu_id, autotune=autotune_re, verbose=verbose)
  File "/home/shalin.mehta/.conda/envs/recorder/lib/python3.7/site-packages/waveorder/util.py", line 1069, in Single_variable_Tikhonov_deconv_3D
    f_real_f = compute_f_real_f(xp.log10(reg_re))
  File "/home/shalin.mehta/.conda/envs/recorder/lib/python3.7/site-packages/waveorder/util.py", line 961, in compute_f_real_f
    f_real_f = S0_stack_f * H_eff_conj / (H_eff_abs_square + reg_coeff)
  File "cupy/core/core.pyx", line 1045, in cupy.core.core.ndarray.__truediv__
  File "cupy/core/_kernel.pyx", line 1063, in cupy.core._kernel.ufunc.__call__
  File "cupy/core/_kernel.pyx", line 565, in cupy.core._kernel._get_out_args
  File "cupy/core/core.pyx", line 2380, in cupy.core.core._ndarray_init
  File "cupy/core/core.pyx", line 151, in cupy.core.core.ndarray._init_fast
  File "cupy/cuda/memory.pyx", line 578, in cupy.cuda.memory.alloc
  File "cupy/cuda/memory.pyx", line 1250, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1271, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 939, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 959, in cupy.cuda.memory.SingleDeviceMemoryPool._malloc
  File "cupy/cuda/memory.pyx", line 1210, in cupy.cuda.memory.SingleDeviceMemoryPool._try_malloc
cupy.cuda.memory.OutOfMemoryError: Out of memory allocating 7,314,866,176 bytes (allocated so far: 36,674,994,688 bytes).
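
For reference, the failure happens while allocating the output of the division in `S0_stack_f * H_eff_conj / (H_eff_abs_square + reg_coeff)`. By that point the product and the regularized denominator already exist as full-size temporaries. Below is a minimal sketch of two things: the byte accounting from the error message, and one possible way to evaluate the same expression with fewer full-size temporaries by using in-place ufuncs. It is not the waveorder implementation. The function name `tikhonov_filter_low_memory` and the `xp` argument are hypothetical, the inputs are assumed to have the same shape, and NumPy is used so the sketch runs without a GPU (CuPy ufuncs accept the same `out=` argument).

```python
import numpy as np

# Numbers copied from the OutOfMemoryError message above.
REQUESTED_BYTES = 7_314_866_176
HELD_BYTES = 36_674_994_688


def gib(n_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


# One array of the requested size is 6.8125 GiB; together with what the
# pool already holds, the total comes to roughly 40.97 GiB.
print(f"requested: {gib(REQUESTED_BYTES):.4f} GiB")
print(f"held:      {gib(HELD_BYTES):.4f} GiB")
print(f"total:     {gib(REQUESTED_BYTES + HELD_BYTES):.4f} GiB")


def tikhonov_filter_low_memory(S0_stack_f, H_eff, reg_coeff, xp=np):
    """Hypothetical low-memory form of S0_f * conj(H) / (|H|**2 + reg).

    Allocates one complex output and one real scratch array; every other
    step reuses those buffers. Assumes S0_stack_f and H_eff share a shape.
    """
    dtype = np.result_type(S0_stack_f.dtype, H_eff.dtype, np.complex64)
    out = xp.empty(S0_stack_f.shape, dtype=dtype)

    xp.conjugate(H_eff, out=out)   # out = conj(H)
    out *= S0_stack_f              # out = S0_f * conj(H), in place

    denom = xp.abs(H_eff)          # real scratch array: |H|
    xp.square(denom, out=denom)    # |H|**2, in place
    denom += reg_coeff             # |H|**2 + reg, in place

    out /= denom                   # final quotient, in place
    return out
```

If the pipeline holds references to arrays from earlier stages, freeing them before this step (and, with CuPy, calling `cupy.get_default_memory_pool().free_all_blocks()`) may also lower the peak. Whether that is enough for this dataset would need to be measured.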