Phy fails to load dataset after running export_to_phy due to misshapen amplitudes vector

nikhilchandra commented 2 weeks ago

I used SpikeInterface to run KS4 on a dataset and followed this with si.export_to_phy. I next tried to open the exported dataset in phy with this command:

phy template-gui params.py

But this gave me the following error:

13:30:42.751 [E] __init__:62          An error has occurred (AssertionError):
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\nikhil\miniconda3\envs\phy\Scripts\phy.exe\__main__.py", line 7, in <module>
    sys.exit(phycli())
             ^^^^^^^^
  File "C:\Users\nikhil\miniconda3\envs\phy\Lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\nikhil\miniconda3\envs\phy\Lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Users\nikhil\miniconda3\envs\phy\Lib\site-packages\click\core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\nikhil\miniconda3\envs\phy\Lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\nikhil\miniconda3\envs\phy\Lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\nikhil\miniconda3\envs\phy\Lib\site-packages\click\decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\nikhil\miniconda3\envs\phy\Lib\site-packages\phy\apps\__init__.py", line 159, in cli_template_gui
    template_gui(params_path, **kwargs)
  File "C:\Users\nikhil\miniconda3\envs\phy\Lib\site-packages\phy\apps\template\gui.py", line 209, in template_gui
    model = load_model(params_path)
            ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\nikhil\miniconda3\envs\phy\Lib\site-packages\phylib\io\model.py", line 1440, in load_model
    return TemplateModel(**get_template_params(params_path))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\nikhil\miniconda3\envs\phy\Lib\site-packages\phylib\io\model.py", line 339, in __init__
    self._load_data()
  File "C:\Users\nikhil\miniconda3\envs\phy\Lib\site-packages\phylib\io\model.py", line 358, in _load_data
    assert self.amplitudes.shape == (ns,)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

QWidget: Must construct a QApplication before a QWidget

I used numpy to load amplitudes.npy as output directly by KS4 and also as output by si.export_to_phy. The direct KS4 version has shape (4803961,) while the si.export_to_phy version has shape (4803961,1). Could this mean there is a bug in SpikeInterface's implementation of export_to_phy?

Thanks.

tagging @cheydrick

samuelgarcia commented 2 weeks ago

@alejoe91 can you look at this !,

alejoe91 commented 2 weeks ago

@nikhilchandra can you try to check I'd this is the issue by saving a squeezed version of the amplitudes?

zm711 commented 2 weeks ago

I'm also super behind on tweaking phy so if you installed phy recently we may have other problems. It's hard to fix the export function until I get phy a bit more stable for recent updates, but I really have no time for it right now.

nikhilchandra commented 2 weeks ago

@alejoe91 My apologies, I was mistaken.

KS4's direct output --> amplitudes.npy --> amplitudes.shape = (4803961,) Phy export --> amplitudes.npy --> amplitudes.shape = (4803947, 1)

So it's misshapen and has the wrong number of elements.

zm711 commented 2 weeks ago

That's weird because I always remember the old kilosort returning things with the extra dimension so Phy should account for this. If I have a few moments I can check one of my old phy/ks 2 or 3 files to see what shape they have. I'm just running something on my computer right now so can't check.

As far as number of spikes, we often see that there are spikes occurring before/after the actual recording time which we filter out. If you look at the Phy terminal output it also mentions there being unclustered spikes (which I think could be related). so the decreased number is likely due to our stricter cutoffs for what to include in the data--although I could be wrong since other users note outright errors if they don't run remove_excess_spikes in some phy data. So happy to hear other's opinions on this.

alejoe91 commented 2 weeks ago

Yeah I think that could be the issue. @nikhilchandra can you try to run spikeinterface.curation.remove_excess_spikes(sorting) on the KS output sorting object before exporting to phy? Do you run other curation steps?

nikhilchandra commented 2 weeks ago

@alejoe91 I will try this and get back to you. No, I did not run any further curation steps -- just tried to load it into Phy after running export_to_phy. It's a problem that has cropped up in recent weeks for us with multiple data sets.

@zm711 But the number of elements in each .NPY file should be the same, right? It looks like some have more than others.

zm711 commented 2 weeks ago

@zm711 But the number of elements in each .NPY file should be the same, right? It looks like some have more than others.

Not quite. Some files are per cluster and others are per spike. So not all files have the same shapes. So it really depends on which files you are comparing. All spike based files should be the same though. Which files are you comparing?

nikhilchandra commented 1 week ago

@zm711 I was just commenting on the assert that breaks, i.e., assert self.amplitudes.shape == (ns,)

@alejoe91 You were correct, running spikeinterface.curation.remove_excess_spikes(sorting) worked. Is this something that should be done automatically, rather than requiring the user to do it?

alejoe91 commented 1 week ago

For sure we should raise an exception if this happens to let users know they have to do this extra step and the sorter is returning out of range spikes

SpikeInterface / spikeinterface

Phy fails to load dataset after running export_to_phy due to misshapen amplitudes vector #3520