SpikeInterface / spikeinterface

A Python-based module for creating flexible and robust spike sorting pipelines.
https://spikeinterface.readthedocs.io
MIT License
474 stars 183 forks source link

error with running KS4 from spikeinterface #2839

Open paolahydra opened 1 month ago

paolahydra commented 1 month ago

Hi, can you please tell me what is going wrong here? Trying to call KS4 from spikeinterface, I get the error "slice indices must be integers or None or have an index method". Please see code and full error report below.

Using: NPX2.0 (two shanks) spikeinterface 0.100.6 kilosort 4.0.7

Thanks a lot, Paola

recording_raw = se.read_openephys(folder_path=path, stream_id="{}".format(index), load_sync_channel=False) 
fs = recording_raw.get_sampling_frequency()

# preprocessing steps
recording = st.correct_lsb(recording_raw, verbose=1);
recording = st.bandpass_filter(recording, freq_min=300, freq_max=6000)
recording = st.phase_shift(recording) 
[bad_channel_ids, channel_labels] = st.detect_bad_channels(recording=recording)  
recording = recording.remove_channels(remove_channel_ids=bad_channel_ids)  

grouped_recordings = recording.split_by(property='group')
recgrouplist_hpsf = [st.highpass_spatial_filter(recording=grouped_recordings[k]) for k in grouped_recordings.keys()]  
recording_hpsf = si.aggregate_channels(recgrouplist_hpsf)
print(recording_hpsf)

print("Starting KS4")
t_start = time.perf_counter()
sorting_ks4_prop = ss.run_sorter_by_property("kilosort4", recording=recording_hpsf, grouping_property="group", working_folder=ks_folder, verbose=True)
t_stop = time.perf_counter()
elapsed_prop = np.round(t_stop - t_start, 2)
print(f"Elapsed time by property: {elapsed_prop} s")

Error running kilosort4

SpikeSortingError Traceback (most recent call last) Cell In[37], line 117 115 print("Starting KS4") 116 t_start = time.perf_counter() --> 117 sorting_ks4_prop = ss.run_sorter_by_property("kilosort4", recording=recording_hpsf, grouping_property="group", working_folder=ks_folder, verbose=True) 118 t_stop = time.perf_counter() 119 elapsed_prop = np.round(t_stop - t_start, 2)

File ~\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\sorters\launcher.py:297, in run_sorter_by_property(sorter_name, recording, grouping_property, working_folder, mode_if_folder_exists, engine, engine_kwargs, verbose, docker_image, singularity_image, sorter_params) 286 job = dict( 287 sorter_name=sorter_name, 288 recording=rec, (...) 293 sorter_params, 294 ) 295 job_list.append(job) --> 297 sorting_list = run_sorter_jobs(job_list, engine=engine, engine_kwargs=engine_kwargs, return_output=True) 299 unit_groups = [] 300 for sorting, group in zip(sorting_list, recording_dict.keys()):

File ~\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\sorters\launcher.py:106, in run_sorter_jobs(job_list, engine, engine_kwargs, return_output) 103 if engine == "loop": 104 # simple loop in main process 105 for kwargs in job_list: --> 106 sorting = run_sorter(**kwargs) 107 if return_output: 108 out.append(sorting)

File ~\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\sorters\runsorter.py:175, in run_sorter(sorter_name, recording, output_folder, remove_existing_folder, delete_output_folder, verbose, raise_error, docker_image, singularity_image, delete_container_files, with_output, sorter_params) 168 container_image = singularity_image 169 return run_sorter_container( 170 container_image=container_image, 171 mode=mode, 172 common_kwargs, 173 ) --> 175 return run_sorter_local(**common_kwargs)

File ~\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\sorters\runsorter.py:225, in run_sorter_local(sorter_name, recording, output_folder, remove_existing_folder, delete_output_folder, verbose, raise_error, with_output, **sorter_params) 223 SorterClass.set_params_to_folder(recording, output_folder, sorter_params, verbose) 224 SorterClass.setup_recording(recording, output_folder, verbose=verbose) --> 225 SorterClass.run_from_folder(output_folder, raise_error, verbose) 226 if with_output: 227 sorting = SorterClass.get_result_from_folder(output_folder, register_recording=True, sorting_info=True)

File ~\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\sorters\basesorter.py:293, in BaseSorter.run_from_folder(cls, output_folder, raise_error, verbose) 290 print(f"{sorter_name} run time {run_time:0.2f}s") 292 if has_error and raise_error: --> 293 raise SpikeSortingError( 294 f"Spike sorting error trace:\n{log['error_trace']}\n" 295 f"Spike sorting failed. You can inspect the runtime trace in {output_folder}/spikeinterface_log.json." 296 ) 298 return run_time

SpikeSortingError: Spike sorting error trace: Traceback (most recent call last): File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\sorters\basesorter.py", line 258, in run_from_folder SorterClass._run_from_folder(sorter_output_folder, sorter_params, verbose) File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\sorters\external\kilosort4.py", line 201, in _run_from_folder ops = compute_preprocessing(ops, device, tic0=tic0, file_object=file_object) File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\kilosort\run_kilosort.py", line 295, in compute_preprocessing whiten_mat = preprocessing.get_whitening_matrix(bfile, xc, yc, nskip=nskip, File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\kilosort\preprocessing.py", line 103, in get_whitening_matrix X = f.padded_batch_to_torch(j) File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\kilosort\io.py", line 716, in padded_batch_to_torch X = super().padded_batch_to_torch(ibatch) File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\kilosort\io.py", line 518, in padded_batch_to_torch data = self.file[bstart : bend] File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\kilosort\io.py", line 949, in getitem samples = self.recording.get_traces(start_frame=i, end_frame=j, File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\core\baserecording.py", line 288, in get_traces traces = rs.get_traces(start_frame=start_frame, end_frame=end_frame, channel_indices=channel_indices) File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\core\channelslice.py", line 98, in get_traces traces = self._parent_recording_segment.get_traces(start_frame, end_frame, parent_indices) File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\core\channelsaggregationrecording.py", line 144, in get_traces traces_recording = segment.get_traces( File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\preprocessing\highpass_spatial_filter.py", line 189, in get_traces traces, left_margin, right_margin = get_chunk_with_margin( File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\core\recording_tools.py", line 721, in get_chunk_with_margin traces_chunk = rec_segment.get_traces( File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\core\channelslice.py", line 98, in get_traces traces = self._parent_recording_segment.get_traces(start_frame, end_frame, parent_indices) File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\core\channelslice.py", line 98, in get_traces traces = self._parent_recording_segment.get_traces(start_frame, end_frame, parent_indices) File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\preprocessing\phase_shift.py", line 92, in get_traces traces_chunk, left_margin, right_margin = get_chunk_with_margin( File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\core\recording_tools.py", line 751, in get_chunk_with_margin traces_chunk = rec_segment.get_traces(start_frame2, end_frame2, channel_indices) File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\preprocessing\filter.py", line 132, in get_traces traces_chunk, left_margin, right_margin = get_chunk_with_margin( File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\core\recording_tools.py", line 721, in get_chunk_with_margin traces_chunk = rec_segment.get_traces( File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\preprocessing\normalize_scale.py", line 27, in get_traces scaled_traces = self.parent_recording_segment.get_traces(start_frame, end_frame, channel_indices).astype( File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\preprocessing\normalize_scale.py", line 27, in get_traces scaled_traces = self.parent_recording_segment.get_traces(start_frame, end_frame, channel_indices).astype( File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\spikeinterface\extractors\neoextractors\neobaseextractor.py", line 326, in get_traces raw_traces = self.neo_reader.get_analogsignal_chunk( File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\neo\rawio\baserawio.py", line 805, in get_analogsignal_chunk raw_chunk = self._get_analogsignal_chunk(block_index, seg_index, i_start, i_stop, stream_index, channel_indexes) File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\neo\rawio\openephysbinaryrawio.py", line 385, in _get_analogsignal_chunk sigs = sigs[i_start:i_stop, :] File "C:\Users\SNeurobiology\miniconda3\envs\ephys-env\lib\site-packages\numpy\core\memmap.py", line 335, in getitem res = super().getitem(index) TypeError: slice indices must be integers or None or have an index method

Spike sorting failed. You can inspect the runtime trace in E:\paola_temp\E1_M15\NPXData\2024-04-12_17-49-04\Record Node 101\experiment1\recording1\SpikeData_ProbeA\0/spikeinterface_log.json.

JoeZiminski commented 1 month ago

Hi @paolahydra, it seems to be that kilosort is calling the spikeinterface object here with indices i and j that are not the correct format. I am not sure why exactly this is, it is occuring during the preprocessing (whitening) stage (before sorting has property started).

Is your data large / are you able to upload it somewhere? I'd be happy to take a look. In the meantime some things you can try:

1) If you can find where kilosort is installed on your system (pip show kilosort) and navigate to line 948 in kilosort/io.py you can break in (if you are familiar with debugging) or put print(i) and print(j) to see what kilosort is trying to use to index. 2) It might be worth stripping your preprocessing back to basics and trying to run again. If sorting proceeds past the preprocessing stage you know the error is not occuring. It's possible one of the preprocessing steps in SI is interacting with kilosort to cause the error (e.g. removing channels).

jakeswann1 commented 1 month ago

Hi both! I had the same TypeError today using Kilosort 4.0.7 and Spikeinterface 101.0 (from source). The only Spikeinterface functions I used prior to running the sorter were se.read_openephys() and si.concatenate_recordings(). I solved it short-term by reverting to Kilosort 4.0.6, so it must be something in that most recent patch which has caused the error. Happy to post my full code and investigate further when I have time tomorrow, but hopefully this can help initially!

paolahydra commented 1 month ago

Hi @JoeZiminski (and@jakeswann1), I added the following to line 948 in kilosort/io.py as you suggested:

print(i)
print(type(i))
print(j)
print(type(j))
  1. With no preprocessing at all, sorting started just fine. code:

    recording_raw = se.read_openephys(folder_path=path, stream_id="{}".format(index), load_sync_channel=False) # this maps all streams even if you specify only one
    print(recording_raw)
    sorting_ks4_prop = ss.run_sorter_by_property("kilosort4", recording=recording_raw, grouping_property="group", working_folder=ks_folder, verbose=True)

    cropped output:

    OpenEphysBinaryRecordingExtractor: 384 channels - 30.0kHz - 1 segments - 113,098,735 samples 
                                   3,769.96s (1.05 hours) - int16 dtype - 80.89 GiB
    Starting KS4
    ========================================
    Loading recording with SpikeInterface...
    number of samples: 113098735
    number of channels: 192
    numbef of segments: 1
    sampling rate: 30000.0
    dtype: int16
    ========================================
    0
    <class 'int'>
    60061
    <class 'int'>
    1499939
    <class 'numpy.uint64'>
    1560061
    <class 'numpy.uint64'>
    2999939
    <class 'numpy.uint64'>
    3060061
    <class 'numpy.uint64'>
    4499939
  2. I then started to add one by one my preprocessing steps:

    • st.correct_lsb was OK.
    • st.bandpass_filter reproduced the error. The loop stopped at the second iteration when type is 'numpy.uint64' rather than 'int': This is it:
      0
      <class 'int'>
      60061
      <class 'int'>
      1499939
      <class 'numpy.uint64'>
      1560061
      <class 'numpy.uint64'>
      Error running kilosort4
      _(same as before)_

Is this insightful enough? Thanks, Paola

JoeZiminski commented 1 month ago

Thanks @paolahydra! also that's great @jakeswann1 that really helps narrow it down to the most recent version. I think this might be related to this update. @paolahydra could you try going to lines 515 - 517 and changing uint64 to int just to test if this fixes the problem?

(warning! this should just be done for debugging purposes, it may have unintended consequences. After messing with installed packages like this for debugging purposes it is useful to uninstall and reinstall to wipe any changes made to the package you may forget about later). The safest option to get it working for now is to roll-back to 4.0.6.

paolahydra commented 1 month ago

Yes, with those lines changed, it moved past preprocessing (bandpass_filter).

JoeZiminski commented 1 month ago

Thanks @paolahydra. It seems uint64 is being changed to float by addition with an integer, probably in get_chunk_with_margin e.g. here.

This is due to the strange behaviour of numpy unit64 as discussed here.

e.g.

x = np.uint64(0)
arr = np.array([1,2,3])
arr[x]  # okay
arr[x+1] # fail
type(x+1)  # np.float64

I wonder if it is best to see if kilosort are interesting in changing to python int here, or support for uint64 etc. is added to SI? @alejoe91 @samuelgarcia

alejoe91 commented 1 month ago

I think we should add some safe castnig in SI! Thanks for tracking this down @JoeZiminski @jakeswann1 @paolahydra

JoeZiminski commented 1 month ago

Sounds good @alejoe91 I'd be happy to open a PR. I wonder where to put this check. Initially I thought BaseRecordings get_traces()? so the check can only be implemented once. However by this time the uint64 will be cased to float so we don't know what happened by this stage (e.g. if the user did erroneously pass a float).

Alternatively they could go at the top of each get_traces() method, maybe using a generic input validator method on BaseRecording. but this is a bit of a pain as it's duplicated for every individual get_traces(). Alternatively some kind of validator like PyDantic could be used to check type of inputs to all get_traces with a decorator. In theory this is a great approach but I'm not sure how effective in practice.

Batter-Wang commented 1 month ago

Hi! I also try to do spikesorting by group using kilosort4 but spikeinterface raise ValueError('No spikes detected, cannot continue sorting.') ValueError: No spikes detected, cannot continue sorting.

aggregate_sorting = si.run_sorter_by_property(sorter_name='kilosort4', recording=rec,
                                           grouping_property='group',remove_existing_folder=True,
                                           working_folder=output_folder,  #engine='joblib', engine_kwargs={'n_jobs': 2}
                                           nblocks=0, dmin=50,dminx=75,torch_device='cuda')

if I delete #engine='joblib', engine_kwargs={'n_jobs': 2}, it runs well one by one.

Does anyone else have the same problem?

Batter-Wang commented 1 month ago

oh!joblib cannot connect GPU? Could I use spikeinterface run kilosort4 parallelly?

JoeZiminski commented 1 month ago

Hi @Batter-Wang thanks for sharing your bug report. Your problem seems to have a different cause to the one in this issue, would you be okay to make a new issue describing your problem? Thanks!

@alejoe91 I was thinking about this a bit more, and another (easier) approach would be to add a decorator to the get_traces functions that validates and possibly casts their inputs. This is basically the approach that most official validators take, but will be more customisable. The decorator can perform type checks and any other required checks on the key arguments to get_traces and cast if necessary, before passing to the decorated function. What do you think, is it worth adding an issue on this? Otherwise, maybe there is a more strategic place to add the casting.