flatironinstitute / CaImAn

Computational toolbox for large scale Calcium Imaging Analysis, including movie handling, motion correction, source extraction, spike deconvolution and result visualization.
https://caiman.readthedocs.io
GNU General Public License v2.0

Negative values in data passed to NMF #959

Closed fatihdinc closed 2 years ago

fatihdinc commented 2 years ago

I am using a modified version of the demo_pipeline script to process a 2p movie of my own, which is 800x800x10000. I get this error for some patch-size settings, but not all. This looks like a general issue with one of the functions imported from scikit-learn. Thank you!


ValueError                                Traceback (most recent call last)
Input In [6]
      2 tic = time.time()
      3 cnm = cnmf.CNMF(n_processes, params=opts, dview=dview)
----> 4 cnm.fit_file(motion_correct=False)
      5 toc = time.time()
      6 print(toc - tic)

File ~\miniconda3\envs\caiman\lib\site-packages\caiman\source_extraction\cnmf\cnmf.py:373, in CNMF.fit_file(self, motion_correct, indices, include_eval)
    371 self.mmap_file = fname_new
    372 if not include_eval:
--> 373     return self.fit(images, indices=indices)
    375 fit_cnm = self.fit(images, indices=indices)
    376 Cn = summary_images.local_correlations(images[::max(T//1000, 1)], swap_dim=False)

File ~\miniconda3\envs\caiman\lib\site-packages\caiman\source_extraction\cnmf\cnmf.py:601, in CNMF.fit(self, images, indices)
    596 if not isinstance(images, np.memmap):
    597     raise Exception(
    598         'You need to provide a memory mapped file as input if you use patches!!')
    600 self.estimates.A, self.estimates.C, self.estimates.YrA, self.estimates.b, self.estimates.f, \
--> 601     self.estimates.sn, self.estimates.optional_outputs = run_CNMF_patches(
    602         images.filename, self.dims + (T,), self.params,
    603         dview=self.dview, memory_fact=self.params.get('patch', 'memory_fact'),
    604         gnb=self.params.get('init', 'nb'), border_pix=self.params.get('patch', 'border_pix'),
    605         low_rank_background=self.params.get('patch', 'low_rank_background'),
    606         del_duplicates=self.params.get('patch', 'del_duplicates'),
    607         indices=indices)
    609 self.estimates.bl, self.estimates.c1, self.estimates.g, self.estimates.neurons_sn = None, None, None, None
    610 logging.info("merging")

File ~\miniconda3\envs\caiman\lib\site-packages\caiman\source_extraction\cnmf\map_reduce.py:441, in run_CNMF_patches(file_name, shape, params, gnb, dview, memory_fact, border_pix, low_rank_background, del_duplicates, indices)
    439 nan_components = np.any(np.isnan(F_tot), axis=1)
    440 F_tot = F_tot[~nan_components, :]
--> 441 _ = mdl.fit_transform(F_tot).T
    442 Bm = Bm[:, ~nan_components]
    443 f = mdl.components_.squeeze()

File ~\miniconda3\envs\caiman\lib\site-packages\sklearn\decomposition\_nmf.py:1538, in NMF.fit_transform(self, X, y, W, H)
   1533 X = self._validate_data(
   1534     X, accept_sparse=("csr", "csc"), dtype=[np.float64, np.float32]
   1535 )
   1537 with config_context(assume_finite=True):
-> 1538     W, H, n_iter = self._fit_transform(X, W=W, H=H)
   1540 self.reconstruction_err_ = _beta_divergence(
   1541     X, W, H, self._beta_loss, square_root=True
   1542 )
   1544 self.n_components_ = H.shape[0]

File ~\miniconda3\envs\caiman\lib\site-packages\sklearn\decomposition\_nmf.py:1584, in NMF._fit_transform(self, X, y, W, H, update_H)
   1550 def _fit_transform(self, X, y=None, W=None, H=None, update_H=True):
   1551     """Learn a NMF model for the data X and returns the transformed data.
   1552
   1553     Parameters
   (...)
   1582         Actual number of iterations.
   1583     """
-> 1584     check_non_negative(X, "NMF (input X)")
   1586     # check parameters
   1587     self._check_params(X)

File ~\miniconda3\envs\caiman\lib\site-packages\sklearn\utils\validation.py:1249, in check_non_negative(X, whom)
   1246 X_min = X.min()
   1248 if X_min < 0:
-> 1249     raise ValueError("Negative values in data passed to %s" % whom)

ValueError: Negative values in data passed to NMF (input X)

fatihdinc commented 2 years ago

As an update: this problem persists even if I turn off parallel processing, so it is likely not a memory issue. I also reproduced the same issue on a Mac. The problem seems to be that NMF requires the input to be non-negative, which is not necessarily true for calcium movies after preprocessing.

pgunn commented 2 years ago

Are the values negative in the original data? I would be surprised if the negative values come from motion correction.

If so, maybe doing some value rescaling on the way in would solve the problem?

fatihdinc commented 2 years ago

Yes, I can confirm that there are no negative values in the data passed to the algorithm; also, I am not running any motion correction. At this point, I have no idea what is causing this, but overall I would propose CVXPY as a versatile alternative for NMF, as scikit-learn imposes unnecessarily strict conditions.

pgunn commented 2 years ago

Looking at the code handling this, it looks like it does a simple check for negative values and I don't think there's a way to trigger this exception without that happening. If it helps, you could probably edit that exception to display the value of X_min to verify (it may be negative but very close to zero?).
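Rather than editing the scikit-learn source, the same check can be made on the array just before it reaches NMF. The following is a minimal sketch, assuming NumPy is available; `describe_negatives`, its `tol` parameter, and the `F_demo` array are hypothetical names for illustration and are not part of CaImAn or scikit-learn.

```python
import numpy as np

def describe_negatives(X, tol=1e-9):
    """Summarize negative entries of X.

    Hypothetical helper for illustration; `tol` is an assumed threshold
    for calling a negative minimum "very close to zero".
    """
    X = np.asarray(X)
    x_min = float(X.min())  # same quantity the scikit-learn check compares to 0
    return {
        "min": x_min,
        "n_negative": int(np.count_nonzero(X < 0)),
        "near_zero": bool(-tol <= x_min < 0),
    }

# Made-up stand-in for the array that would be passed to NMF.
F_demo = np.array([[0.5, 1.0, 2.0],
                   [0.0, -1e-12, 3.0]])
report = describe_negatives(F_demo)
print(report)
```

If `near_zero` comes back true, the negative values are probably numerical noise rather than a property of the raw movie.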

fatihdinc commented 2 years ago

As I mentioned earlier, this problem occurs for certain patch sizes, but not all. If the input movie had negative values, the error would occur for all patch sizes (and I did check: it does not have negative values). Let me take a look at the code and see why that might be happening. I will update here if I find the reason.

j-friedrich commented 2 years ago

Fixed by making the data passed to NMF in line 441 of map_reduce.py nonnegative.
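The thread does not say how the data were made nonnegative, so the following is only a sketch of two generic options, not the change applied in CaImAn. `make_nonnegative`, its `mode` argument, and the `F_demo` array are hypothetical names for illustration.

```python
import numpy as np

def make_nonnegative(X, mode="clip"):
    """Return a nonnegative copy of X (hypothetical helper, not CaImAn code).

    mode="clip":  set negative entries to zero, leave the rest unchanged.
    mode="shift": subtract the minimum when it is negative, so the smallest
                  entry becomes zero and all entries move by the same offset.
    """
    X = np.asarray(X, dtype=float)
    if mode == "clip":
        return np.maximum(X, 0.0)
    if mode == "shift":
        return X - min(float(X.min()), 0.0)
    raise ValueError("mode must be 'clip' or 'shift'")

# Made-up stand-in for the array passed to NMF.
F_demo = np.array([[1.0, -1e-7],
                   [-2.0, 3.0]])
clipped = make_nonnegative(F_demo, mode="clip")
shifted = make_nonnegative(F_demo, mode="shift")
print(clipped.min(), shifted.min())
```

Clipping alters only the negative entries, which suits values that are slightly below zero because of numerical noise. Shifting preserves differences between entries but adds a constant offset to the whole array. Either result passes a check of the form `X.min() < 0`, like the one shown in the traceback.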