flatironinstitute / CaImAn

Computational toolbox for large scale Calcium Imaging Analysis, including movie handling, motion correction, source extraction, spike deconvolution and result visualization.
https://caiman.readthedocs.io
GNU General Public License v2.0

Save memory by returning a sparse array from extract_binary_masks_from_structural_channel #1395

Open ethanbb opened 2 months ago

ethanbb commented 2 months ago

Description

The docstring for extract_binary_masks_from_structural_channel says that it returns a sparse column format matrix, but it actually returns a dense ndarray. I was running out of memory when using it to generate a mask to pass to CNMF. This PR changes the function to return a scipy.sparse.csc_array instead.
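To illustrate why the return type matters, here is a minimal sketch comparing the footprint of a dense boolean mask matrix with the same data stored as a scipy.sparse.csc_array. The field-of-view size, ROI count, and pixels per ROI are made-up values for illustration, not numbers from this PR, and the sketch does not call CaImAn itself.

```python
import numpy as np
from scipy import sparse

# Hypothetical sizes (assumptions, not taken from the PR):
# a 512 x 512 field of view, 200 ROIs, about 300 pixels per ROI.
n_pixels, n_rois, pixels_per_roi = 512 * 512, 200, 300
rng = np.random.default_rng(0)

# Dense layout: one column per ROI, one row per pixel.
dense = np.zeros((n_pixels, n_rois), dtype=bool)
for j in range(n_rois):
    idx = rng.choice(n_pixels, size=pixels_per_roi, replace=False)
    dense[idx, j] = True

# Same masks in compressed sparse column format.
sparse_masks = sparse.csc_array(dense)


def sparse_nbytes(m):
    """Bytes held by the three arrays backing a CSC/CSR container."""
    return m.data.nbytes + m.indices.nbytes + m.indptr.nbytes


dense_bytes = dense.nbytes
sparse_bytes = sparse_nbytes(sparse_masks)
print(f"dense: {dense_bytes} bytes, sparse: {sparse_bytes} bytes")
```

The dense array pays for every pixel in every column, while the sparse container only stores the nonzero entries plus their indices, so the gap grows with the field of view and the number of ROIs.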

Type of change

Has your PR been tested?

caimanmanager test and caimanmanager demotest pass. I don't think it's a good idea to add a test for memory usage, but here is what I see for my use case (tracemalloc reports sizes in bytes):

Before:

>>> import tracemalloc
>>> tracemalloc.start()
>>> Ain, _ = rois.extract_binary_masks_from_structural_channel(mean_img, gSig=9)
>>> curr_mem, peak_mem = tracemalloc.get_traced_memory()
>>> curr_mem
1822996704

After:

>>> tracemalloc.start()
>>> Ain, _ = rois.extract_binary_masks_from_structural_channel(mean_img, gSig=9)
>>> curr_mem, peak_mem = tracemalloc.get_traced_memory()
>>> curr_mem
971327

pgunn commented 2 months ago

@kushalkolar This looks good to me; is there any chance this might break anything with Mesmerize?

ethanbb commented 2 months ago

Hold off, it seems to break something (also getting this error in demo_seeded_CNMF.ipynb):

Traceback (most recent call last):
  File "/u/ethan/mesmerize-core/mesmerize_core/algorithms/cnmf.py", line 96, in run_algo
    cnm = cnm.fit(images)
          ^^^^^^^^^^^^^^^
  File "/u/ethan/CaImAn/caiman/source_extraction/cnmf/cnmf.py", line 523, in fit
    self.update_spatial(Yr, use_init=True)
  File "/u/ethan/CaImAn/caiman/source_extraction/cnmf/cnmf.py", line 915, in update_spatial
    update_spatial_components(Y, C=self.estimates.C, f=self.estimates.f, A_in=self.estimates.A,
  File "/u/ethan/CaImAn/caiman/source_extraction/cnmf/spatial.py", line 173, in update_spatial_components
    ind2_, nr, C, f, b_, A_in = computing_indicator(
                                ^^^^^^^^^^^^^^^^^^^^
  File "/u/ethan/CaImAn/caiman/source_extraction/cnmf/spatial.py", line 1066, in computing_indicator
    ind2_ = [np.hstack((np.where(iid_)[0], nr + np.arange(f.shape[0])))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/u/ethan/CaImAn/caiman/source_extraction/cnmf/spatial.py", line 1067, in <listcomp>
    if np.size(np.where(iid_)[0]) > 0 else [] for iid_ in dist_indicator]
               ^^^^^^^^^^^^^^
  File "/u/ethan/.conda/envs/mescore/lib/python3.11/site-packages/scipy/sparse/_base.py", line 396, in __bool__
    raise ValueError("The truth value of an array with more than one "
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().

ethanbb commented 2 months ago

OK, this should fix it, but it would probably be a good idea for someone else to check that notebook demo.
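For readers following the traceback above: its last frame is the `__bool__` method of a SciPy sparse container, which refuses to produce a single truth value for anything larger than 1 x 1. The sketch below reproduces that behaviour on a small made-up mask and shows two ways to get nonzero indices without relying on truthiness. It is only an illustration under those assumptions, and not necessarily the fix applied in this PR.

```python
import numpy as np
from scipy import sparse

# Small made-up mask (3 pixels x 2 components), for illustration only.
mask = sparse.csc_array(
    np.array([[True, False],
              [False, True],
              [True, True]])
)

# Asking for the truth value of a multi-element sparse container raises,
# matching the final frame of the traceback.
try:
    bool(mask)
    raised = False
except ValueError:
    raised = True

# Option 1: densify first, then use np.where as on any ndarray.
dense_hits = {(int(r), int(c)) for r, c in zip(*np.where(mask.toarray()))}

# Option 2: stay sparse and use the container's own nonzero() method.
sparse_hits = {(int(r), int(c)) for r, c in zip(*mask.nonzero())}

print(raised, dense_hits == sparse_hits)
```

Densifying is simplest when the array involved is small, while nonzero() avoids materialising the full matrix; which one suits a given call site depends on how large the indicator arrays get.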