VolkerH / Lattice_Lightsheet_Deskew_Deconv

Open-source, GPU accelerated code for deskewing and deconvolving lattice light sheet data

openCL error #14

Closed: VolkerH closed this issue 5 years ago

VolkerH commented 5 years ago
---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-7-6b02bb31308f> in <module>
----> 1 ep.process_stack_subfolder('Stack_10_drp1_dendra2skl_mScarletdrp1')

~/Lattice_Lightsheet_Deskew_Deconv/Python/process_llsm_experiment.py in process_stack_subfolder(self, stack_name)
    265             if self.do_deconv:
    266                 # determine wavelength for file and pick corresponding PSF
--> 267                 self.process_file(pathlib.Path(row.file), deskew_func, rotate_func, deconv_functions[wavelength])
    268             else:
    269                 self.process_file(pathlib.Path(row.file), deskew_func, rotate_func)

~/Lattice_Lightsheet_Deskew_Deconv/Python/process_llsm_experiment.py in process_file(self, infile, deskew_func, rotate_func, deconv_func)
    176                     self.create_MIP(deconv_deskewed.astype(self.output_dtype), outfiles["deconv/deskew/MIP"])
    177             if self.do_deconv_rotate:
--> 178                 deconv_rotated = rotate_func(deconv_raw)
    179                 write_tiff_createfolder(outfiles["deconv/rotate"], deconv_rotated.astype(self.output_dtype))
    180                 if self.do_MIP:

~/Lattice_Lightsheet_Deskew_Deconv/Python/gputools_wrapper.py in affine_transform_gputools(input, matrix, offset, output_shape, output, order, mode, cval, prefilter)
     58     with warnings.catch_warnings():
     59         warnings.simplefilter("ignore")
---> 60         result = gputools.affine(data=input, mat=matrix, mode=mode, interpolation=interpolation)
     61 
     62     if needs_crop:

~/.conda/pkgs/cache/su62_scratch/volker_conda/newllsm/lib/python3.6/site-packages/gputools/transforms/transformations.py in affine(data, mat, mode, interpolation)
     81     prog.run_kernel("affine3",
     82                     data.shape[::-1], None,
---> 83                     d_im, res_g.data, mat_inv_g.data)
     84 
     85     return res_g.get()

~/.conda/pkgs/cache/su62_scratch/volker_conda/newllsm/lib/python3.6/site-packages/gputools/core/oclprogram.py in run_kernel(self, name, global_size, local_size, *args, **kwargs)
     44             self._kernel_dict[name] = getattr(self,name)
     45 
---> 46         self._kernel_dict[name](self._dev.queue,global_size, local_size,*args,**kwargs)
     47 
     48 

~/.conda/pkgs/cache/su62_scratch/volker_conda/newllsm/lib/python3.6/site-packages/pyopencl/__init__.py in kernel_call(self, queue, global_size, local_size, *args, **kwargs)
    813         # __call__ can't be overridden directly, so we need this
    814         # trampoline hack.
--> 815         return self._enqueue(self, queue, global_size, local_size, *args, **kwargs)
    816 
    817     def kernel_capture_call(self, filename, queue, global_size, local_size,

<generated code> in enqueue_knl_affine3(self, queue, global_size, local_size, arg0, arg1, arg2, global_offset, g_times_l, wait_for)

MemoryError: clEnqueueNDRangeKernel failed: MEM_OBJECT_ALLOCATION_FAILURE
VolkerH commented 5 years ago

This appeared after I changed a dtype to np.int. With the dtype back to np.uint16 everything works. I assume that the platform's default integer type is not supported on my GPU.
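
A minimal sketch of the workaround, with illustrative array and matrix names rather than code from this repo: cast the volume to a dtype the OpenCL kernels accept, e.g. uint16 or float32, before handing it to gputools.affine.

import numpy as np
import gputools

# Hypothetical stand-ins for the real volume and affine matrix.
volume = np.zeros((64, 256, 256), dtype=np.int)   # np.int is the platform integer; this dtype triggered the error
matrix = np.eye(4)                                # identity transform as a placeholder

# Casting back to uint16 (or float32) avoids the allocation failure here.
result = gputools.affine(data=volume.astype(np.uint16), mat=matrix,
                         mode="constant", interpolation="linear")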

VolkerH commented 5 years ago

I ran into this error again. I can consistently reproduce it for Stack_7_drp1_dendra2skl_mScarlet_drp1_test_6_fast. Other datasets process fine, so it may be some corruption of the input data.

/home/vhil0002/Github/Lattice_Lightsheet_Deskew_Deconv/Python/process_llsm_experiment.py:207: UserWarning: Fix write_func stuff to include compression and units
  warnings.warn("Fix write_func stuff to include compression and units")
/home/vhil0002/Github/Lattice_Lightsheet_Deskew_Deconv/Python/process_llsm_experiment.py:262: UserWarning: more than one PSF found. Taking first one
  warnings.warn(f"more than one PSF found. Taking first one")
  0%|                                                                                        | 0/40 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "batch_run.py", line 26, in <module>
    ep.process_stack_subfolder(subfolder)
  File "/home/vhil0002/Github/Lattice_Lightsheet_Deskew_Deconv/Python/process_llsm_experiment.py", line 289, in process_stack_subfolder
    self.process_file(pathlib.Path(row.file), deskew_func, rotate_func, deconv_functions[wavelength])
  File "/home/vhil0002/Github/Lattice_Lightsheet_Deskew_Deconv/Python/process_llsm_experiment.py", line 198, in process_file
    deconv_rotated = rotate_func(deconv_raw)
  File "/home/vhil0002/Github/Lattice_Lightsheet_Deskew_Deconv/Python/gputools_wrapper.py", line 76, in affine_transform_gputools
    result = gputools.affine(data=input_data, mat=matrix, mode=mode, interpolation=interpolation)
  File "/home/vhil0002/anaconda3/envs/newllsm/lib/python3.6/site-packages/gputools/transforms/transformations.py", line 83, in affine
    d_im, res_g.data, mat_inv_g.data)
  File "/home/vhil0002/anaconda3/envs/newllsm/lib/python3.6/site-packages/gputools/core/oclprogram.py", line 46, in run_kernel
    self._kernel_dict[name](self._dev.queue,global_size, local_size,*args,**kwargs)
  File "/home/vhil0002/anaconda3/envs/newllsm/lib/python3.6/site-packages/pyopencl/__init__.py", line 815, in kernel_call
    return self._enqueue(self, queue, global_size, local_size, *args, **kwargs)
  File "<generated code>", line 69, in enqueue_knl_affine3
pyopencl._cl.MemoryError: clEnqueueNDRangeKernel failed: MEM_OBJECT_ALLOCATION_FAILURE
VolkerH commented 5 years ago

Fairly sure this is happening because tensorflow grabs almost all of the GPU memory and leaves very little for other processes. Depending on the size of the individual stacks there may be just enough GPU memory left to perform the affine transforms using gputools, but not always. I will have to try limiting how much GPU memory tensorflow can grab: https://stackoverflow.com/questions/34199233/how-to-prevent-tensorflow-from-allocating-the-totality-of-a-gpu-memory

Using allow_growth might be useful in order not to have to set a fixed limit manually: https://www.tensorflow.org/guide/using_gpu. This may then also enable several worker processes to use the GPU.
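
A minimal sketch of that configuration with the TF 1.x API; how the session config gets wired into the deconvolution setup here is not shown, so treat this as illustrative only.

import tensorflow as tf

# Let tensorflow grow its GPU allocation on demand instead of preallocating
# nearly all GPU memory, leaving room for the gputools/OpenCL kernels.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
# Alternative: cap the fraction of GPU memory this process may claim.
# config.gpu_options.per_process_gpu_memory_fraction = 0.5
sess = tf.Session(config=config)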

eric-czech commented 5 years ago

FWIW: the flowdec issue "GPU memory exhaustion errors" has a snippet for setting the amount of memory used, and the allow_growth option is a good way to avoid tensorflow's very greedy default GPU memory preallocation behavior.

VolkerH commented 5 years ago

Thanks. I had already implemented the allow_growth fix while on the train home a couple of hours ago and was just about to test it when I saw your comment. I wasn't aware of the issue you referenced, basically the identical problem ... I will put a watch on the flowdec repo.

VolkerH commented 5 years ago

That fixed it; now I only need to integrate this nicely.