MouseLand / Kilosort

Fast spike sorting with drift correction for up to a thousand channels
https://kilosort.readthedocs.io/en/latest/
GNU General Public License v3.0

After setting artifact_threshold I keep getting the "0 sample array error": ValueError: Found array with 0 sample(s) (shape=(0, 61)) while a minimum of 1 is required by TruncatedSVD. #732

Closed: DanEgert closed this issue 2 months ago

DanEgert commented 3 months ago

Describe the issue:

I have an artifact in my recordings that keeps getting clustered along with the spikes. One way to prevent that is probably to use artifact_threshold, but when I set it, I get the error above.

Reproduce the bug:

'artifact_threshold': 1000,  # was np.inf

Error message:

File H:\Anaconda\envs\kilosort\lib\site-packages\sklearn\decomposition\_truncated_svd.py:229, in TruncatedSVD.fit_transform(self, X, y)
    212 @_fit_context(prefer_skip_nested_validation=True)
    213 def fit_transform(self, X, y=None):
    214     """Fit model to X and perform dimensionality reduction on X.
    215 
    216     Parameters
   (...)
    227         Reduced version of X. This will always be a dense array.
    228     """
--> 229     X = self._validate_data(X, accept_sparse=["csr", "csc"], ensure_min_features=2)
    230     random_state = check_random_state(self.random_state)
    232     if self.algorithm == "arpack":

File H:\Anaconda\envs\kilosort\lib\site-packages\sklearn\base.py:633, in BaseEstimator._validate_data(self, X, y, reset, validate_separately, cast_to_ndarray, **check_params)
    631         out = X, y
    632 elif not no_val_X and no_val_y:
--> 633     out = check_array(X, input_name="X", **check_params)
    634 elif no_val_X and not no_val_y:
    635     out = _check_y(y, **check_params)

File H:\Anaconda\envs\kilosort\lib\site-packages\sklearn\utils\validation.py:1072, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator, input_name)
   1070     n_samples = _num_samples(array)
   1071     if n_samples < ensure_min_samples:
-> 1072         raise ValueError(
   1073             "Found array with %d sample(s) (shape=%s) while a"
   1074             " minimum of %d is required%s."
   1075             % (n_samples, array.shape, ensure_min_samples, context)
   1076         )
   1078 if ensure_min_features > 0 and array.ndim == 2:
   1079     n_features = array.shape[1]

ValueError: Found array with 0 sample(s) (shape=(0, 61)) while a minimum of 1 is required by TruncatedSVD.

Version information:

v 4.0.13

jacobpennington commented 3 months ago

@DanEgert The current artifact_threshold implementation is sort of ham-handed and will be made smarter in the future. Currently it just sets any batch with at least one value exceeding that threshold to all zeroes. So, most likely what's happening is too many of your batches are being set to 0, and you would need to do your own preprocessing instead to get rid of the artifacts before running Kilosort.

You should be able to see if that's the case in the GUI. After you set artifact_threshold, reload the data, then scan through DataView in time. Anywhere that an artifact (and its batch) got removed, you'll see a big white block.
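
As a complement to the visual check in the GUI, the behavior described above (a whole batch is zeroed if any value in it exceeds the threshold) can be approximated offline to estimate how many batches would be lost. This is only a minimal sketch, not Kilosort's actual code: the function name `flag_artifact_batches` is hypothetical, the comparison against the absolute value is an assumption, and Kilosort may apply the threshold to data after its own preprocessing, so raw amplitudes may not match exactly.

```python
import numpy as np


def flag_artifact_batches(data, batch_size, artifact_threshold):
    """Mark batches containing any sample whose absolute value exceeds
    artifact_threshold (hypothetical helper, not part of Kilosort).

    data: array of shape (n_samples, n_channels).
    Returns a boolean array with one entry per batch.
    """
    n_samples = data.shape[0]
    n_batches = int(np.ceil(n_samples / batch_size))
    flagged = np.zeros(n_batches, dtype=bool)
    for i in range(n_batches):
        batch = data[i * batch_size:(i + 1) * batch_size]
        # Assumption: one sample over the threshold removes the whole batch.
        flagged[i] = np.abs(batch).max() > artifact_threshold
    return flagged


# Synthetic stand-in for a recording: low-amplitude noise plus a few
# large artifacts on channel 0.
rng = np.random.default_rng(0)
data = rng.normal(0.0, 10.0, size=(10_000, 4))
data[[500, 3500, 3600, 9100], 0] = 5000.0

flagged = flag_artifact_batches(data, batch_size=1000, artifact_threshold=1000)
print(f"{flagged.sum()} of {flagged.size} batches would be zeroed "
      f"({100 * flagged.mean():.0f}%)")
```

If a large share of batches is flagged when this is run on your own data with your own batch size, that is consistent with the explanation above, and removing the artifacts in a separate preprocessing step is likely the better route.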