SpikeInterface / spikeinterface

A Python-based module for creating flexible and robust spike sorting pipelines.
https://spikeinterface.readthedocs.io
MIT License

Error when saving concatenate_recordings -> file.truncate(file_size_bytes) #2509

Closed chenhongbiao closed 9 months ago

chenhongbiao commented 9 months ago

Hi all, first of all, thank you for building this systematic framework. It's pretty cool! Although the issue has already been reported in #1917 and earlier in #1825, I would like to provide a way to reproduce it here.

What happened?

The output of concatenate_recordings(), a concatenated recording object, cannot be saved to disk because of a file-size calculation issue. For example,

si_record_obj_concat = si_core.concatenate_recordings([si_record_obj, si_record_obj])
print(si_record_obj)
NwbRecordingExtractor: 384 channels - 30.0kHz - 1 segments - 141,969,007 samples 
                       4,732.30s (1.31 hours) - int16 dtype - **101.54 GiB**
print(si_record_obj_concat)
ConcatenateSegmentRecording: 384 channels - 30.0kHz - 1 segments - 283,938,014 samples 
                             9,464.60s (2.63 hours) - int16 dtype - **-978937344.00 B**
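The negative size in the repr looks like a 32-bit integer overflow. The following sketch is my own illustration (not SpikeInterface code, and the library's actual fix may differ): wrapping the true byte count of the concatenated recording to a signed 32-bit integer reproduces exactly the -978937344 B shown above.

```python
def wrap_int32(x):
    """Wrap an arbitrary-precision integer to signed 32-bit, as C int32 overflow would."""
    x &= 0xFFFFFFFF
    return x - 0x1_0000_0000 if x >= 0x8000_0000 else x

num_samples = 283_938_014  # samples in the concatenated recording (from the repr above)
num_channels = 384
itemsize = 2               # int16 -> 2 bytes per sample

true_size = num_samples * num_channels * itemsize
print(true_size)                  # 218064394752 bytes, ~203.08 GiB (2 x 101.54 GiB)

print(wrap_int32(true_size))      # -978937344, matching the "-978937344.00 B" in the repr
```

The match with the reported value suggests the total byte count was at some point held in (or cast to) a 32-bit integer instead of a 64-bit one.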

Steps to Reproduce

Download the example NWB file sub-npI1_ses-20190413_behavior+ecephys.nwb, as mentioned in the SI Official_Tutorial_SI_0.96_Oct22.

fpath_nwb = Path('C:/4_Media_User/GitHub/spikeinterface_lab/spikeinterface_tutorials/spiketutorials-master/Official_Tutorial_SI_0.99_Nov23/nwb_data_local')
fname_nwb = 'sub-npI1_ses-20190413_behavior+ecephys.nwb'
filename_nwb = fpath_nwb / fname_nwb

si_record_obj = si_extractor.read_nwb(file_path=filename_nwb, load_recording=True, load_sorting=False)
si_record_obj_concat = si_core.concatenate_recordings([si_record_obj, si_record_obj])

fpath_processed_temp_folder = fpath_nwb / 'Temp_preprocessed'
si_record_processed_save = si_record_obj_concat.save(folder=fpath_processed_temp_folder, format='binary')

Error Message

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
Cell In[13], line 3
      1 fpath_processed_temp_folder = fpath_nwb / 'Temp_preprocessed'
----> 3 si_record_processed_save = si_record_obj_concat.save(folder=fpath_processed_temp_folder, format='binary')

File c:\1_Programs_Install\Anaconda3\envs\si_env\lib\site-packages\spikeinterface\core\base.py:779, in BaseExtractor.save(self, **kwargs)
    777     loaded_extractor = self.save_to_zarr(**kwargs)
    778 else:
--> 779     loaded_extractor = self.save_to_folder(**kwargs)
    780 return loaded_extractor

File c:\1_Programs_Install\Anaconda3\envs\si_env\lib\site-packages\spikeinterface\core\base.py:864, in BaseExtractor.save_to_folder(self, name, folder, overwrite, verbose, **save_kwargs)
    861 self.save_metadata_to_folder(folder)
    863 # save data (done the subclass)
--> 864 cached = self._save(folder=folder, verbose=verbose, **save_kwargs)
    866 # copy properties/
    867 self.copy_metadata(cached)

File c:\1_Programs_Install\Anaconda3\envs\si_env\lib\site-packages\spikeinterface\core\baserecording.py:462, in BaseRecording._save(self, format, **save_kwargs)
    459 file_paths = [folder / f"traces_cached_seg{i}.raw" for i in range(self.get_num_segments())]
    460 dtype = kwargs.get("dtype", None) or self.get_dtype()
--> 462 write_binary_recording(self, file_paths=file_paths, dtype=dtype, **job_kwargs)
    464 from .binaryrecordingextractor import BinaryRecordingExtractor
    466 # This is created so it can be saved as json because the `BinaryFolderRecording` requires it loading
    467 # See the __init__ of `BinaryFolderRecording`

File c:\1_Programs_Install\Anaconda3\envs\si_env\lib\site-packages\spikeinterface\core\core_tools.py:301, in write_binary_recording(recording, file_paths, dtype, add_file_extension, byte_offset, auto_cast_uint, **job_kwargs)
    298 file_size_bytes = data_size_bytes + byte_offset
    300 file = open(file_path, "wb+")
--> **301 file.truncate(file_size_bytes)**
    302 file.close()
    303 assert Path(file_path).is_file()

OSError: [Errno 22] Invalid argument
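For what it's worth, the OSError itself is easy to reproduce in isolation. My assumption here is that the negative byte count computed upstream is what reaches file.truncate(); the OS-level ftruncate() rejects a negative length with EINVAL, which Python surfaces exactly as in the traceback.

```python
import tempfile

# Standalone demonstration (not SpikeInterface code): truncating a file
# to a negative size raises OSError: [Errno 22] Invalid argument.
with tempfile.TemporaryFile("wb+") as f:
    try:
        f.truncate(-978_937_344)  # the wrapped size from the repr above
    except OSError as e:
        caught = e

print(caught)  # [Errno 22] Invalid argument
```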

Environment

alejoe91 commented 9 months ago

Hi @chenhongbiao

Thank you for the clean reproducible steps! If I remember correctly, we fixed this issue in the latest release, 0.100.0.

Can you try to upgrade and confirm that it's gone?

Cheers Alessio

chenhongbiao commented 9 months ago

Thank you for your fast reply! The issue is solved after updating SpikeInterface to version 0.100.0.