mhhennig / HS2

Software for high density electrophysiology
GNU General Public License v3.0

save_shape=True doesn't save shapes to output file and save_shape=False raises KeyError: 'sample_ind' #81

Closed: litzj01 closed this issue 5 days ago

litzj01 commented 1 month ago

I am using the latest HS2 through SpikeInterface (SI) and can't get the waveform shapes to save in the output file.

Here is my code, with save_shape=True:

    sorter_params = {
        "chunk_size": None, "rescale": True, "rescale_value": -1280.0, "lowpass": True,
        "common_reference": "median", "spike_duration": 1.0, "amp_avg_duration": 0.4,
        "threshold": 8.0, "min_avg_amp": 1.0, "AHP_thr": 0.0,
        "neighbor_radius": 129, "inner_radius": 100, "peak_jitter": 0.25, "rise_duration": 0.26,
        "decay_filtering": False, "decay_ratio": 1.0, "localize": True,
        "save_shape": True, "out_file": "HS2_detected",
        "left_cutout_time": 0.3, "right_cutout_time": 1.8, "verbose": True,
        "clustering_bandwidth": 4.0, "clustering_alpha": 4.5, "clustering_n_jobs": -1,
        "clustering_bin_seeding": True, "clustering_min_bin_freq": 4, "clustering_subset": None,
        "pca_ncomponents": 2, "pca_whiten": True,
    }

    sorting = ss.run_sorter(sorter_name="herdingspikes", recording=recording1, **sorter_params)
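
As a quick sanity check, the parameters the wrapper actually accepts can be listed from SpikeInterface (assuming `ss` is `spikeinterface.sorters`, as in the call above):

```python
import spikeinterface.sorters as ss

# Print the parameter names, defaults and descriptions the herdingspikes
# wrapper exposes, to confirm that save_shape is among them.
print(ss.get_default_sorter_params("herdingspikes"))
print(ss.get_sorter_params_description("herdingspikes"))
```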

I have save_shape=True; however, the output HS2_sorted.hdf5 only contains these 7 fields: 'Sampling', 'centres', 'ch', 'cluster_id', 'data', 'exp_inds', 'times'.
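
The fields can be listed with h5py; a minimal sketch, with the file path as a placeholder for wherever the sorter wrote its output:

```python
import h5py

# Path is hypothetical; point it at the HS2_sorted.hdf5 produced by the run above.
with h5py.File("herdingspikes_output/HS2_sorted.hdf5", "r") as f:
    print(list(f.keys()))  # currently: Sampling, centres, ch, cluster_id, data, exp_inds, times
```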

I attempted to run the same code with save_shape=False to see if it fixed the issue; however, this caused the sort to fail:

SpikeSortingError                         Traceback (most recent call last)
Cell In[4], line 1
----> 1 sorting = ss.run_sorter(sorter_name="herdingspikes", recording=recording1,**sorter_params)

File c:\Users\litzj\miniconda\envs\spkInt\Lib\site-packages\spikeinterface\sorters\runsorter.py:216, in run_sorter(sorter_name, recording, folder, remove_existing_folder, delete_output_folder, verbose, raise_error, docker_image, singularity_image, delete_container_files, with_output, output_folder, **sorter_params)
    205             raise RuntimeError(
    206                 "The python `spython` package must be installed to "
    207                 "run singularity. Install with `pip install spython`"
    208             )
    210     return run_sorter_container(
    211         container_image=container_image,
    212         mode=mode,
    213         **common_kwargs,
    214     )
--> 216 return run_sorter_local(**common_kwargs)

File c:\Users\litzj\miniconda\envs\spkInt\Lib\site-packages\spikeinterface\sorters\runsorter.py:276, in run_sorter_local(sorter_name, recording, folder, remove_existing_folder, delete_output_folder, verbose, raise_error, with_output, output_folder, **sorter_params)
    274 SorterClass.set_params_to_folder(recording, folder, sorter_params, verbose)
    275 SorterClass.setup_recording(recording, folder, verbose=verbose)
--> 276 SorterClass.run_from_folder(folder, raise_error, verbose)
    277 if with_output:
    278     sorting = SorterClass.get_result_from_folder(folder, register_recording=True, sorting_info=True)

File c:\Users\litzj\miniconda\envs\spkInt\Lib\site-packages\spikeinterface\sorters\basesorter.py:301, in BaseSorter.run_from_folder(cls, output_folder, raise_error, verbose)
...
    sp[0]["spike_shape"] = np.zeros(len(sp[0]["sample_ind"]))
                                        ~~~~~^^^^^^^^^^^^^^
KeyError: 'sample_ind'

Spike sorting failed. You can inspect the runtime trace in d:\BrainwaveDataProcessingAttempts\Primate_040924_macula\Testing_params\herdingspikes_output/spikeinterface_log.json.

I have spython 0.3.13 installed in the environment, so I don't understand what the issue here is either. Any help would be appreciated!
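
For reference, the failing line in the traceback boils down to indexing a key that is not present in the detection output; a minimal sketch of the pattern (the dict contents are hypothetical, only the names come from the traceback):

```python
import numpy as np

# sp[0] stands for the per-segment detection result; with save_shape=False it
# apparently no longer contains a 'sample_ind' entry, hence the KeyError.
sp = [{"channel_ind": np.array([3, 7, 42])}]  # hypothetical contents
sp[0]["spike_shape"] = np.zeros(len(sp[0]["sample_ind"]))  # raises KeyError: 'sample_ind'
```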

litzj01 commented 3 weeks ago

Hello! Just following up to see if anyone has gotten a chance to look into this. Thanks!

mhhennig commented 3 weeks ago

In the new version I have not implemented saving shapes to the hdf5 file, but it seems you need this? save_shape=True is a keyword only for the spike detection, which can run without saving the shapes. I will add this again; it's an easy change, so hopefully I will get that done very soon!

litzj01 commented 3 weeks ago

We do use the hdf5 file waveform shapes for our analysis and would love to have that feature again. Thank you so much for your support!

b-grimaud commented 2 weeks ago

Seems to be commented out since this commit.

I personally save the spikes DataFrame to .csv out of convenience, but I assume the idea down the line would be to use the SpikeInterface WaveformExtractor?
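
For anyone who wants the same stopgap: assuming the clustering object exposes its spikes as a pandas DataFrame (e.g. `Clusters.spikes` in HS2), the CSV round trip is just:

```python
import pandas as pd

# `Clusters.spikes` is assumed to be the pandas DataFrame of detected spikes
# (channel, time, position, shape, cluster id) held by the HS2 clustering object.
Clusters.spikes.to_csv("HS2_detected_spikes.csv", index=False)

# Read it back later for downstream analysis.
spikes = pd.read_csv("HS2_detected_spikes.csv")
```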

TheSuperbohl commented 2 weeks ago

I have the same situation as litzj01. We use the Sorted.hdf5 file from HS, combine it with a stimulus timing file, and filter out subthreshold units between specific timing windows. The filtered dataset is put into MATLAB as a .h5, and all information about a single unit is grouped into individual structures for analysis.
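
A rough sketch of that filtering step, using only the fields listed earlier in this thread; the stimulus windows are placeholders, and 'times' is assumed to be stored in samples:

```python
import h5py
import numpy as np

# Hypothetical stimulus windows in seconds: (start, stop) pairs from the timing file.
windows = [(0.5, 1.0), (2.0, 2.5)]

with h5py.File("HS2_sorted.hdf5", "r") as f:
    fs = f["Sampling"][()]           # sampling rate in Hz
    times = f["times"][:] / fs       # spike times, assumed to be stored in samples
    cluster_id = f["cluster_id"][:]

# Keep only spikes that fall inside any stimulus window.
mask = np.zeros(len(times), dtype=bool)
for start, stop in windows:
    mask |= (times >= start) & (times < stop)

filtered_times = times[mask]
filtered_clusters = cluster_id[mask]
```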

litzj01 commented 2 weeks ago

@TheSuperbohl That is pretty much exactly our pipeline as well! May I ask which MEA type you are using? We are using a 3Brain 64x64 MEA.

mhhennig commented 2 weeks ago

Ok, so this should work again!

Once you have run the clustering, you can do Clusters.SaveHDF5('HS2_detected.hdf5') (where Clusters is the clustering object) to store everything. You can also load this again with hs.HSClustering('HS2_detected.hdf5'), which will then have the shapes as well. If you run this via SpikeInterface, it will for now always store the shapes. Will make a PR shortly to make the option accessible there as well.
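
Outside SpikeInterface, the round trip would then look roughly like this (import name assumed to be `herdingspikes`):

```python
import herdingspikes as hs

# `Clusters` is the HS2 clustering object obtained after detection and clustering.
Clusters.SaveHDF5("HS2_detected.hdf5")  # stores times, clusters and the shapes

# Later, reload everything, including the waveform shapes:
Clusters_reloaded = hs.HSClustering("HS2_detected.hdf5")
```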

To test this, you need to clone this repo; I want to be sure it all works as intended before uploading to PyPI again (for pip install). Please let me know if there are any problems.

As an aside, this code still tries to support the old legacy data format, as we still have some old recordings stored in this way ... it is getting increasingly unlikely we'll ever look at these again. The code would be cleaner if I could remove this at some point; would this be an issue?

TheSuperbohl commented 1 week ago

@litzj01 yes, we are using the same 3Brain chips. I would probably be fine if it gets removed at some point. I am currently trying to retool our pipeline so we can use just SpikeInterface. If I need to at some point, I can work out how to take extracted waveform shapes from SpikeInterface and fit them into the restructured .h5 file we make for the MATLAB pipeline.
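
If useful, writing per-unit data into a MATLAB-readable .h5 can stay very close to that structure; a sketch with placeholder arrays standing in for whatever is pulled out of SpikeInterface:

```python
import h5py
import numpy as np

# Placeholder data standing in for per-unit results extracted via SpikeInterface.
spike_times = {0: np.array([10, 250, 900]), 1: np.array([40, 410])}
mean_waveforms = {0: np.zeros(48), 1: np.zeros(48)}

with h5py.File("restructured_units.h5", "w") as f:
    for uid in spike_times:
        g = f.create_group(f"unit_{uid}")  # one group per unit, mirroring the MATLAB structs
        g.create_dataset("times", data=spike_times[uid])
        g.create_dataset("mean_waveform", data=mean_waveforms[uid])
```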

b-grimaud commented 1 week ago

> As an aside, this code still tries to support the old legacy data format, as we still have some old recordings stored in this way ... it is getting increasingly unlikely we'll ever look at these again. The code would be cleaner if I could remove this at some point; would this be an issue?

neo still supports the old recordings, so it's probably safe to remove direct support here as they will still be easily readable.

mhhennig commented 5 days ago

Great, will clean up the code soon to transition to the new method.