Unable to open file with phy for manual curation

JuanPimientoCaicedo commented 8 months ago

Hi, guys.

I was previously working with the waveform extractor object and now I updated the library to start working with the sorting analyzer and kilosort4. I was able to spikesort with Kilosort 4 and created the phy folder using the export_to_phy function. However, when trying to open either the sorter output directly or the phy folder created by using export_to_phy I get the following messages:

from the kilosort 4 output folder:

16:01:27.906 [E] init:62 An error has occurred (IndexError): index 580 is out of bounds for axis 0 with size 579 Traceback (most recent call last): File "", line 198, in _run_module_as_main File "", line 88, in _run_code File "C:\Users\Juan.conda\envs\phy2\Scripts\phy.exe__main.py", line 7, in sys.exit(phycli()) ^^^^^^^^ File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\click\core.py", line 1157, in call return self.main(*args, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\click\core.py", line 1078, in main rv = self.invoke(ctx) ^^^^^^^^^^^^^^^^ File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\click\core.py", line 1688, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\click\core.py", line 1434, in invoke return ctx.invoke(self.callback, ctx.params) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\click\core.py", line 783, in invoke return callback(*args, *kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\click\decorators.py", line 33, in new_func return f(get_current_context(), args, kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\phy\apps__init__.py", line 159, in cli_template_gui template_gui(params_path, kwargs) File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\phy\apps\template\gui.py", line 209, in template_gui model = load_model(params_path) ^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\phylib\io\model.py", line 1433, in load_model return TemplateModel(**get_template_params(params_path)) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\phylib\io\model.py", line 339, in init self._load_data() File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\phylib\io\model.py", line 419, in _load_data self.sparse_clusters = self.cluster_waveforms() ^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\phylib\io\model.py", line 1319, in cluster_waveforms mean_waveform = self.get_cluster_mean_waveforms(clust, unwhiten=False) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\phylib\io\model.py", line 1215, in get_cluster_mean_waveforms template = self.get_template(best_template, unwhiten=unwhiten) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\phylib\io\model.py", line 959, in get_template return self._get_template_dense( ^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\Users\Juan.conda\envs\phy2\Lib\site-packages\phylib\io\model.py", line 881, in _get_template_dense template_w = self.sparse_templates.data[template_id, ...]


  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\numpy\core\memmap.py", line 335, in __getitem__
    res = super().__getitem__(index)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
IndexError: index 580 is out of bounds for axis 0 with size 579

QWidget: Must construct a QApplication before a QWidget

from the export_to_phy folder:

An error has occurred (ValueError): attempt to get argmax of an empty sequence
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\Juan\.conda\envs\phy2\Scripts\phy.exe\__main__.py", line 7, in <module>
    sys.exit(phycli())
             ^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\click\core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\click\decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phy\apps\__init__.py", line 159, in cli_template_gui
    template_gui(params_path, **kwargs)
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phy\apps\template\gui.py", line 218, in template_gui
    gui = controller.create_gui()
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phy\apps\base.py", line 1700, in create_gui
    self.supervisor.attach(gui)
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phy\cluster\supervisor.py", line 950, in attach
    self._create_views(
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phy\cluster\supervisor.py", line 765, in _create_views
    gui, data=self.cluster_info, columns=self.columns, sort=sort)
              ^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phy\cluster\supervisor.py", line 925, in cluster_info
    return [self.get_cluster_info(cluster_id) for cluster_id in self.clustering.cluster_ids]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phy\cluster\supervisor.py", line 925, in <listcomp>
    return [self.get_cluster_info(cluster_id) for cluster_id in self.clustering.cluster_ids]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phy\cluster\supervisor.py", line 750, in get_cluster_info
    out[key] = func(cluster_id)
               ^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phy\apps\base.py", line 1194, in get_best_channel_label
    return self._get_channel_labels([self.get_best_channel(cluster_id)])[0]
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phy\utils\context.py", line 154, in memcached
    out = f(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phy\apps\base.py", line 1188, in get_best_channel
    channel_ids = self.get_best_channels(cluster_id)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phy\utils\context.py", line 154, in memcached
    out = f(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phy\apps\template\gui.py", line 151, in get_best_channels
    template = self.model.get_template(template_id)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phylib\io\model.py", line 957, in get_template
    return self._get_template_sparse(template_id, unwhiten=unwhiten)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\phylib\io\model.py", line 924, in _get_template_sparse
    best_channel = channel_ids[np.argmax(amplitude)]
                               ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\numpy\core\fromnumeric.py", line 1229, in argmax
    return _wrapfunc(a, 'argmax', axis=axis, out=out, **kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Juan\.conda\envs\phy2\Lib\site-packages\numpy\core\fromnumeric.py", line 59, in _wrapfunc
    return bound(*args, **kwds)
           ^^^^^^^^^^^^^^^^^^^^
ValueError: attempt to get argmax of an empty sequence

I do not know what is the problem here, for your information, I can still open the GUI when curating sessions sorted with Kilosort 3 and exported with the previous waveform Extractor object as an input for the export_to_phy function

alejoe91 commented 8 months ago

@JuanPimientoCaicedo thanks for reporting this! I'll look into this!

zm711 commented 8 months ago

@alejoe91 wasn't this the error with kilosort generating some empty units. There's a helper function for that, but I forget it's name (for the export_to_phy error-- the other error is appearing multiple times on the Kilosort4 issue tracker so I think that is a KS4 problem).

alejoe91 commented 8 months ago

@zm711 can you run Phy with the SortingAnalyzer?

if the problem is about empty units, @JuanPimientoCaicedo you can easily get rid of them with:

sorting_no_empty = sorting.remove_empty_units()

JuanPimientoCaicedo commented 8 months ago

Hi, I already did that and the problem was not solved.

zm711 commented 8 months ago

I haven't tried yet. I've been trying to do a multisorter consensus based approach and slowly get rid of manual curation.... but I can check with a dataset later. I haven't tested KS4 yet, so I'll do a dataset from KS3/MS5 and see what happens with the SortingAnalyzer.

zm711 commented 8 months ago

Actually, I remembered incorrectly I tracked this issue https://github.com/cortex-lab/phy/issues/1233. We discovered that this is a problem with sparsity settings during export. Also seen here in spikeinterface. Could you try the suggestion in #1945 and let us know what shows up there.

JuanPimientoCaicedo commented 8 months ago

Looks like a similar problem, thank you for your response. I will try again and I will let you know what the output is.

JuanPimientoCaicedo commented 8 months ago

Hey, guys. I have not checked the phy compatibility yet. however, when creating the sorting analyzer with a Kilosort 3 sorting object I got an assertion error where sorting._sampling_frequency == recording.sampling_frequency returned a False value. This was caused by the run_sorter function where the sampling frequency got rounded to two decimal places. I solved it with this line of code:

recording.sampling_frequency
Out[1]: 30000.17907949791
sorting._sampling_frequency
Out[2]: 30000.18

sorting._sampling_frequency = recording.sampling_frequency

This is because you take the ADC sampling rate, which is a good practice in my opinion. I am sharing this so you are aware.

samuelgarcia commented 8 months ago

On some serialization we can losse some a bit of precision on the sampling rate when running sorter. We should relax a bit this assertion in the code. Thanks for the feedback.

JuanPimientoCaicedo commented 8 months ago

Hi, I was running some tests in sections of my data (1 hour long Neuropixel recordings). I spike sorted with KS3 a 5 min sample of this data and the sparsity of the units is 20 channels, do you think this sparsity could cause phy to crash?:

JuanPimientoCaicedo commented 8 months ago

Hello, again. After trying different things I have some output:

I tried to see if the problem was Kilosort 4 related by sorting with KS3 and using the new sorting analyzer object. When trying to use the phy app with the KS3 output data directly, I am getting good units.

However, I noticed that when using the analyzer.compute("waveforms") extension. I am not getting real waveform traces. just random traces.

I don't know if I am doing something wrong, so I am sharing with you an HTML version of the written code + output.

Spikesorting.zip

At this step, I am using spikeinterface to do some precuration based on the amplitude of the units and the ISI violations (rp_violation in SI). That is why I am interested in the export_to_phy functionality of this new Spikeinterface version.

Thank you for your time!

samuelgarcia commented 8 months ago

Hi, thanks you for the feedback. The new SortingAnalyzer has a faster way to estimate the sparsity but maybe we have a bug in it!!!

When you do create_sorting_analyzer() you can play sparsity=True/False when True the sparsity is estimated. Note that you can bypass this by given you own etimated sparsity using sparsity=....

Could you try to estimated externally the sparsity using sparsity = estimate_sparsity(recording, sorting, num_spikes_for_sparsity=XXX) and using more spike for this and propagate it to create_sorting_analyzer(..., sparsity=sparsity)

alejoe91 commented 8 months ago

@JuanPimientoCaicedo

Can you try to plot different units? Could it be that the one you're plotting is particularly noisy?

alejoe91 commented 8 months ago

@JuanPimientoCaicedo I wasn't able to reproduce the issue.

Can you share the exported phy folder in drive?

zm711 commented 8 months ago

With regards to KS4 it seems like there are still some compatibility issues with phy that they are working on see this issue. This will need to be fixed at the KS4 level, but it seems like they are working on it.

JuanPimientoCaicedo commented 8 months ago

Hello, the odd templates in the sorting_analyzer were caused by me. I was using a slice of the recording to run the spike sorting, and I was trying to get the waveforms from the whole recording. If the slice of the recording is not obtained from the beginning of the main one, the spike times have a time offset that I did not take into account. Sorry for that.

Now I am getting good units.

Regarding the export_to_phy function I will eliminate the environment and clone the github repository again. the version that I have is from a few days ago, maybe that has something to do with it.

Thank you for all your help again. If I am not able to solve it, I have no problem in sharing some of the data for debugging purposes, does a Onedrive link work for you? I generally export the binary file as well with the argument: copy_binary=True. That causes the folder to have a size of 80 to 90 GB.

zm711 commented 8 months ago

@JuanPimientoCaicedo ,

I think what Sam was suggesting above was to try to also do this with sparse=False. The export_to_phy function hasn't been updated really, so redoing a spikeinterface environment shouldn't be necessary. You could also install in editable mode in the future and just pull changes. As far as KS4, they are doing tons of small updates so might also be beneficial to install in editable mode and pull their changes in the future.

Could you summarize what your current issue now that you've solved the waveforms in the sorting_analyzer?

Which versions of KS are you doing this for and whether or not they worked
Are you able to do phy on the native output vs the spikeinterface export_to_py

Once we are all up-to-date on what the problem areas are then we can try to help more here. As far as sharing data I think onedrive or googledrive would work for looking at stuff although you probably wouldn't need to share the binary (unless you want it @alejoe91). So you could share the folder without the binary.

JuanPimientoCaicedo commented 8 months ago

Hello, again. Sorry for the delay.

I have some new things to share.

I used the current docker images of KS3 and KS4 to do the spike sorting. I believe that the sparsity problem might be caused by the docker image of KS3. I tried to get the traces from the raw file directly with the spike times obtained and this is what I got.

KS3:

KS4:

My hypothesis here, is that the spike times obtained with the KS3 docker version are somehow shifted. I will try to run the spikesorting with the new version of KS3 that solves the spikes holes locally in my matlab.

alejoe91 commented 8 months ago

The docker image for KS3 has been updated after the spike hole bug fix

JuanPimientoCaicedo commented 8 months ago

Hello again. The final configuration that worked for me was:

locally run KS4 using spike interface run_sorter function. I was able to use to sorting analyzer option with this function:

job_kwargs = dict(n_jobs=-1, chunk_duration='1s', progress_bar=True)

analyzer = si.create_sorting_analyzer(sorting, recording, format='binary_folder',
                                      folder=base_folder / "analyzer", sparse=True, **job_kwargs)

and with this function I exported the output to phy

si.export_to_phy(analyzer, output_folder=base_folder / 'phy_KS4',
                 compute_amplitudes=True, compute_pc_features=True, copy_binary=True,
                 **job_kwargs)

I did not have problems running KS4 from Spikeinterface and a 1 hour long recording took about 40 minutes.

I did have to pull the KS4 repo from github today.

SpikeInterface / spikeinterface

Unable to open file with phy for manual curation #2561