SpikeInterface / spikeinterface

A Python-based module for creating flexible and robust spike sorting pipelines.
https://spikeinterface.readthedocs.io
MIT License

Error when saving/caching with `resample` #951

Closed mfvd closed 1 year ago

mfvd commented 2 years ago

Hi! I'm getting an error when trying to cache a downsampled recording, but not on a recording that was only filtered/re-referenced.

When I check the number of samples of the downsampled recording with `rec_resample.get_num_samples()` I get the correct value. However, when I use the `rec_resample.get_traces()` method and check the dimensions, I get 0 samples x N_channels.

Code I'm using

new_rate = 2500
rec_resample = pre.resample(rec_probe,
                            resample_rate = new_rate) # in Hz

rec_resample = rec_resample.save(format='binary',
                          folder="lfp/preprocessed",
                          n_jobs=32,
                          chunk_size=5000,
                          progress_bar=True)
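For reference, the correct `get_num_samples()` value mentioned above scales with the rate ratio. A minimal sketch of that arithmetic (an assumption about the expected bookkeeping, not SpikeInterface's actual implementation):

```python
# Expected sample count after resampling: scale by new_rate / old_rate.
# This is an assumed sketch, not SpikeInterface's exact code.
def resampled_num_samples(num_samples: int, old_rate: float, new_rate: float) -> int:
    return int(num_samples * new_rate / old_rate)

# e.g. 60 s recorded at 30 kHz, resampled to 2.5 kHz:
print(resampled_num_samples(30_000 * 60, 30_000.0, 2500.0))  # 150000
```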

ERROR

/home/neuroimaging/anaconda2/envs/si_openephys/lib/python3.9/site-packages/spikeinterface/core/job_tools.py:18: DeprecationWarning: invalid escape sequence \I
  _shared_job_kwargs_doc = """**job_kwargs: keyword arguments for parallel processing:

---------------------------------------------------------------------------
_RemoteTraceback                          Traceback (most recent call last)
_RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/neuroimaging/anaconda2/envs/si_openephys/lib/python3.9/concurrent/futures/process.py", line 243, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/home/neuroimaging/anaconda2/envs/si_openephys/lib/python3.9/concurrent/futures/process.py", line 202, in _process_chunk
    return [fn(*args) for args in chunk]
  File "/home/neuroimaging/anaconda2/envs/si_openephys/lib/python3.9/concurrent/futures/process.py", line 202, in <listcomp>
    return [fn(*args) for args in chunk]
  File "/home/neuroimaging/anaconda2/envs/si_openephys/lib/python3.9/site-packages/spikeinterface/core/job_tools.py", line 337, in function_wrapper
    return _func(segment_index, start_frame, end_frame, _worker_ctx)
  File "/home/neuroimaging/anaconda2/envs/si_openephys/lib/python3.9/site-packages/spikeinterface/core/core_tools.py", line 218, in _write_binary_chunk
    rec_memmap[start_frame:end_frame, :] = traces
ValueError: could not broadcast input array from shape (0,62) into shape (1677,62)
"""

The above exception was the direct cause of the following exception:

ValueError                                Traceback (most recent call last)
Input In [4], in <cell line: 13>()
      3 rec_resample = pre.resample(rec_probe,
      4                             resample_rate = resample_rate) # in Hz
      7 ###rec_cmr = st.common_reference(rec_resample,
      8 ###                              reference='global',
      9 ###                              operator='median')
     10 
     11 
     12 #### used to make accessing data faster afterwards.
---> 13 rec_resample = rec_resample.save(format='binary',
     14                           folder="lfp/preprocessed",
     15                           n_jobs=32,
     16                           chunk_size=5000,
     17                           progress_bar=True)

File ~/anaconda2/envs/si_openephys/lib/python3.9/site-packages/spikeinterface/core/base.py:615, in BaseExtractor.save(self, **kwargs)
    613     loaded_extractor = self.save_to_zarr(**kwargs)
    614 else:
--> 615     loaded_extractor = self.save_to_folder(**kwargs)
    616 return loaded_extractor

File ~/anaconda2/envs/si_openephys/lib/python3.9/site-packages/spikeinterface/core/base.py:694, in BaseExtractor.save_to_folder(self, name, folder, verbose, **save_kwargs)
    691 self.save_metadata_to_folder(folder)
    693 # save data (done the subclass)
--> 694 cached = self._save(folder=folder, verbose=verbose, **save_kwargs)
    696 # copy properties/
    697 self.copy_metadata(cached)

File ~/anaconda2/envs/si_openephys/lib/python3.9/site-packages/spikeinterface/core/baserecording.py:224, in BaseRecording._save(self, format, **save_kwargs)
    221     dtype = self.get_dtype()
    223 job_kwargs = {k: save_kwargs[k] for k in job_keys if k in save_kwargs}
--> 224 write_binary_recording(self, file_paths=file_paths, dtype=dtype, **job_kwargs)
    226 from .binaryrecordingextractor import BinaryRecordingExtractor
    227 cached = BinaryRecordingExtractor(file_paths=file_paths, sampling_frequency=self.get_sampling_frequency(),
    228                                   num_chan=self.get_num_channels(), dtype=dtype,
    229                                   t_starts=t_starts, channel_ids=self.get_channel_ids(), time_axis=0,
    230                                   file_offset=0, gain_to_uV=self.get_channel_gains(),
    231                                   offset_to_uV=self.get_channel_offsets())

File ~/anaconda2/envs/si_openephys/lib/python3.9/site-packages/spikeinterface/core/core_tools.py:285, in write_binary_recording(recording, file_paths, dtype, add_file_extension, verbose, byte_offset, auto_cast_uint, **job_kwargs)
    282     init_args = (recording.to_dict(), rec_memmaps_dict, dtype, cast_unsigned)
    283 executor = ChunkRecordingExecutor(recording, func, init_func, init_args, verbose=verbose,
    284                                   job_name='write_binary_recording', **job_kwargs)
--> 285 executor.run()

File ~/anaconda2/envs/si_openephys/lib/python3.9/site-packages/spikeinterface/core/job_tools.py:312, in ChunkRecordingExecutor.run(self)
    310                 returns.append(res)
    311         else:
--> 312             for res in results:
    313                 pass
    315 return returns

File ~/anaconda2/envs/si_openephys/lib/python3.9/site-packages/tqdm/notebook.py:258, in tqdm_notebook.__iter__(self)
    256 try:
    257     it = super(tqdm_notebook, self).__iter__()
--> 258     for obj in it:
    259         # return super(tqdm...) will not catch exception
    260         yield obj
    261 # NB: except ... [ as ...] breaks IPython async KeyboardInterrupt

File ~/anaconda2/envs/si_openephys/lib/python3.9/site-packages/tqdm/std.py:1195, in tqdm.__iter__(self)
   1192 time = self._time
   1194 try:
-> 1195     for obj in iterable:
   1196         yield obj
   1197         # Update and possibly print the progressbar.
   1198         # Note: does not call self.update(1) for speed optimisation.

File ~/anaconda2/envs/si_openephys/lib/python3.9/concurrent/futures/process.py:559, in _chain_from_iterable_of_lists(iterable)
    553 def _chain_from_iterable_of_lists(iterable):
    554     """
    555     Specialized implementation of itertools.chain.from_iterable.
    556     Each item in *iterable* should be a list.  This function is
    557     careful not to keep references to yielded objects.
    558     """
--> 559     for element in iterable:
    560         element.reverse()
    561         while element:

File ~/anaconda2/envs/si_openephys/lib/python3.9/concurrent/futures/_base.py:609, in Executor.map.<locals>.result_iterator()
    606 while fs:
    607     # Careful not to keep a reference to the popped future
    608     if timeout is None:
--> 609         yield fs.pop().result()
    610     else:
    611         yield fs.pop().result(end_time - time.monotonic())

File ~/anaconda2/envs/si_openephys/lib/python3.9/concurrent/futures/_base.py:439, in Future.result(self, timeout)
    437     raise CancelledError()
    438 elif self._state == FINISHED:
--> 439     return self.__get_result()
    441 self._condition.wait(timeout)
    443 if self._state in [CANCELLED, CANCELLED_AND_NOTIFIED]:

File ~/anaconda2/envs/si_openephys/lib/python3.9/concurrent/futures/_base.py:391, in Future.__get_result(self)
    389 if self._exception:
    390     try:
--> 391         raise self._exception
    392     finally:
    393         # Break a reference cycle with the exception in self._exception
    394         self = None

ValueError: could not broadcast input array from shape (0,62) into shape (1677,62)
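The final `ValueError` is plain NumPy broadcasting: one parallel worker produced an empty chunk, and a `(0, 62)` array cannot fill the `(1677, 62)` slice reserved for it in the memmap. A self-contained reproduction of just that failing assignment:

```python
import numpy as np

# A (0, 62) chunk cannot be assigned into a (1677, 62) slice,
# which is exactly the error raised inside _write_binary_chunk.
dest = np.zeros((1677, 62))
empty_chunk = np.empty((0, 62), dtype=dest.dtype)
try:
    dest[0:1677, :] = empty_chunk
    error = None
except ValueError as e:
    error = e

print(error)
```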
samuelgarcia commented 2 years ago

Hi, this resample is quite new and not well tested yet. Sorry for that. Could you try to run the same with `n_jobs=1` and then send the error trace?

mfvd commented 2 years ago

Hi @samuelgarcia No problem! With `n_jobs=1` I don't get any error. However, saving the binary file takes a long time, even with a significantly reduced recording (e.g. 100 s). Saving the whole file with `n_jobs=1` crashes my Jupyter notebook after ~1 h.

samuelgarcia commented 2 years ago

Can you try this

rec_resample = rec_resample.save(format='binary',
                          folder="lfp/preprocessed",
                          n_jobs=1,
                          chunk_duration='1s',
                          progress_bar=False)

and copy us the trace?
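For context, `chunk_duration='1s'` expresses the chunk size in seconds rather than frames; at the 2500 Hz resample rate that is 2500 frames per chunk, versus the fixed `chunk_size=5000` used earlier. A sketch of that conversion (hypothetical helper, not SpikeInterface's API):

```python
# Hypothetical helper showing how a duration-based chunk setting maps
# to a frame count: frames_per_chunk = duration_s * sampling_frequency.
def chunk_size_from_duration(duration_s: float, sampling_frequency: float) -> int:
    return int(duration_s * sampling_frequency)

# '1s' chunks at the 2500 Hz resample rate:
print(chunk_size_from_duration(1.0, 2500.0))  # 2500
```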

alejoe91 commented 1 year ago

@mfvd do you still have this issue?

mfvd commented 1 year ago

@alejoe91 I don't see the issue anymore. Thanks!