cortex-lab / phy

phy: interactive visualization and manual spike sorting of large-scale ephys data
BSD 3-Clause "New" or "Revised" License

Incorrect waveforms in Phy #1228

Closed neuronzoo closed 1 month ago

neuronzoo commented 8 months ago

Hi, I've seen a version of this issue before (#1218), but I tried the solution there and it didn't work for me. I am using Windows 11 and KS3. For some of the datasets that I am trying to load into phy, the GUI comes up but the waveforms are off: they are displayed, but they are clearly not the right ones. For a "good" unit, they're noisy and scrambled. The following error appears when I run the command phy template-gui params.py from within the phy2 environment:

QWindowsWindow::setGeometry: Unable to set geometry 1946x600+122+86 (frame: 1962x639+114+55) on QWidgetWindow/"TemplateGUIWindow" on "\\.\DISPLAY1". Resulting geometry: 1924x600+122+86 (frame: 1940x639+114+55) margins: 8, 31, 8, 8 minimum size: 1894x115 MINMAXINFO maxSize=0,0 maxpos=0,0 mintrack=1910,154 maxtrack=0,0)
QMimeDatabase: Error loading internal MIME data
An error has been encountered at line 1 of : Premature end of document.:

If I first try to extract the waveforms by using phy extract-waveforms params.py, this is what I get:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:...\anaconda3\envs\phy2\Scripts\phy.exe\__main__.py", line 7, in <module>
  File "C:...\anaconda3\envs\phy2\Lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:...\anaconda3\envs\phy2\Lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:...\anaconda3\envs\phy2\Lib\site-packages\click\core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:...\anaconda3\envs\phy2\Lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:...\anaconda3\envs\phy2\Lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:...\anaconda3\envs\phy2\Lib\site-packages\click\decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:...\anaconda3\envs\phy2\Lib\site-packages\phy\apps\__init__.py", line 245, in template_extract_waveforms
    model.save_spikes_subset_waveforms(
  File "C:...\anaconda3\envs\phy2\Lib\site-packages\phylib\io\model.py", line 1390, in save_spikes_subset_waveforms
    export_waveforms(
  File "C:...\anaconda3\envs\phy2\Lib\site-packages\phylib\io\traces.py", line 658, in export_waveforms
    writer = NpyWriter(path, shape, dtype)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:...\anaconda3\envs\phy2\Lib\site-packages\phylib\io\traces.py", line 559, in __init__
    self.fp = open(path, 'wb')
              ^^^^^^^^^^^^^^^^
OSError: [Errno 22] Invalid argument: 'PATH\TO\KILOSORT\OUTPUT\_phy_spikes_subset.waveforms.npy'

I tried to do the following, with no success:

  1. restarting
  2. deleting and reinstalling the phy env
  3. installing using the yaml file. Seems like I was missing a dependency named imp there..?

Thanks! Aviv

zm711 commented 8 months ago

@neuronzoo,

Yeah, this happens because Python 3.12 has been officially released and it removed the imp module. So you need to reinstall from the yaml, but after you download the yaml, open it and change the line that says python so that the file looks like this:

name: phy2
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - pip
  - git
  - numpy
  - matplotlib
  - scipy
  - h5py
  - pyqt
  - pyopengl
  - pyqtwebengine
  - pytest
  - qtconsole
  - requests
  - responses
  - traitlets
  - dask
  - cython
  - pillow
  - scikit-learn
  - joblib
  - pip:
    - git+https://github.com/cortex-lab/phy.git

I put in a PR to fix the imp deprecation but until it is merged you need to specify a python version less than 3.12. Could you try that instead and let me know how it goes?
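
If it helps, here is a minimal sketch for checking from inside the activated environment whether the interpreter still ships imp. The helper names has_imp_module and is_before_312 are made up for illustration; only importlib.util.find_spec and sys.version_info come from the standard library.

```python
import importlib.util
import sys


def has_imp_module() -> bool:
    """Return True if a module named `imp` can be located."""
    # find_spec only locates the module; it does not import it.
    return importlib.util.find_spec("imp") is not None


def is_before_312(version_info=sys.version_info) -> bool:
    """Return True for Python versions older than 3.12, which removed `imp`."""
    return tuple(version_info[:2]) < (3, 12)


if __name__ == "__main__":
    print("Python:", sys.version.split()[0])
    print("imp available:", has_imp_module())
    print("older than 3.12:", is_before_312())
```

If "imp available" prints False, the environment was most likely created with Python 3.12 or newer and needs to be rebuilt with the pinned version.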

neuronzoo commented 8 months ago

Hey, thanks for the reply! I actually tried it yesterday after reading #1226. The imp problem is gone, but WaveformView is still off: scrambled and weird spikes for all (good) units. The rest of the GUI opens fine.

Here is the output I get for phy template-gui params.py --debug: text1.txt

and this error appears:

[30648:30128:1103/093453.766:ERROR:cache_util_win.cc(20)] Unable to move the cache: Access is denied. (0x5)
[30648:30128:1103/093453.766:ERROR:cache_util.cc(140)] Unable to move cache folder C:\Users...\AppData\Local\python\QtWebEngine\Default\GPUCache to C:\Users...\AppData\Local\python\QtWebEngine\Default\old_GPUCache_000
[30648:30128:1103/093453.766:ERROR:disk_cache.cc(184)] Unable to create cache
[30648:30128:1103/093453.766:ERROR:shader_disk_cache.cc(606)] Shader Cache Creation failed: -2
QWindowsWindow::setGeometry: Unable to set geometry 2134x600+122+86 (frame: 2150x639+114+55) on QWidgetWindow/"TemplateGUIWindow" on "\\.\DISPLAY1". Resulting geometry: 1924x600+122+86 (frame: 1940x639+114+55) margins: 8, 31, 8, 8 minimum size: 1892x115 MINMAXINFO maxSize=0,0 maxpos=0,0 mintrack=1908,154 maxtrack=0,0)

And here is what I get for phy extract-waveforms params.py: text2.txt

Maybe relevant - From the Anaconda prompt, I can't get to my KS folder using just "cd path\to\ks\folder" so I do "cd /d path\to\ks\folder"

Thanks!

zm711 commented 8 months ago

@neuronzoo

> Maybe relevant - From the Anaconda prompt, I can't get to my KS folder using just "cd path\to\ks\folder" so I do "cd /d path\to\ks\folder"

This is a Windows cd issue. The /d switch tells cd to change the drive as well (C: to D:, etc.), so no worries there. Most instructions are not written for Windows, so they neglect that fact.
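
As a rough illustration of when /d is needed: the drive letters of the current folder and the target folder differ. Below is a small sketch using ntpath.splitdrive from the standard library; needs_drive_switch and the example paths are hypothetical.

```python
import ntpath  # Windows path rules, importable on any OS


def needs_drive_switch(current_dir: str, target_dir: str) -> bool:
    """Return True if target_dir is on a different drive than current_dir."""
    current_drive = ntpath.splitdrive(current_dir)[0].lower()
    target_drive = ntpath.splitdrive(target_dir)[0].lower()
    # A target without a drive letter is relative to the current drive.
    return bool(target_drive) and target_drive != current_drive


if __name__ == "__main__":
    # Hypothetical folders: prompt opened on C:, Kilosort output on D:
    print(needs_drive_switch(r"C:\Users\me", r"D:\data\ks"))
```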

So could you send a tiny screenshot of one of the "failed" waveform views? Also maybe a screenshot of what files are actually in the folder you're trying to analyze (it should contain something like params.py, raw_data.dat, and a bunch of numpy files).

And the contents of the params.py file. I'm trying to figure out if this is a file error or something else. I'm wondering mostly about what the dat path says for that.

neuronzoo commented 8 months ago

Hey, definitely. Here is a screenshot for a relatively good cell: 18K spikes, good separation from noise judging by the correlogram and amplitude views, but the waveforms look messed up: Phy Screen for a good unit

Kilosort folder content: KS folder

And the params.py file:

dat_path = 'D:/.../Analysis/Sleep/WT-R06/Sleep/TAS/2023-09-20/Record Node 101/experiment1/recording1/continuous/Neuropix-PXI-100.ProbeA-AP/temp_wh.dat'
n_channels_dat = 383
dtype = 'int16'
offset = 0
sample_rate = 30000.
hp_filtered = True

zm711 commented 8 months ago

I see. One thing you could try is using the raw file rather than the whitened file. Your raw file is the continuous.dat, right? You could edit the dat path to end with continuous.dat instead and use the raw waveforms rather than the whitened data. Could you try that and see if you think the waveforms look better?
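
When pointing dat_path at a different file, it may also be worth checking that n_channels_dat and dtype in params.py still describe that file, since a flat binary file is read with a fixed number of channels per sample. A minimal sketch of such a sanity check, assuming a flat file of interleaved samples; check_dat_layout and the ITEMSIZE table are hypothetical helpers, not part of phy.

```python
import os

# Bytes per sample for a few common dtypes (an assumption; extend as needed).
ITEMSIZE = {"int16": 2, "uint16": 2, "int32": 4, "float32": 4}


def check_dat_layout(dat_path, n_channels_dat, dtype="int16", offset=0):
    """Return (n_samples, leftover_bytes) for a flat binary recording."""
    bytes_per_frame = n_channels_dat * ITEMSIZE[dtype]
    n_bytes = os.path.getsize(dat_path) - offset
    # leftover_bytes != 0 suggests n_channels_dat, dtype or offset is wrong.
    return divmod(n_bytes, bytes_per_frame)
```

A leftover of zero does not prove the settings are right, but a non-zero leftover is a strong hint that the channel count does not match the file.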

neuronzoo commented 8 months ago

Hey, still the same result. For phy template-gui params.py --debug: text3.txt

And for phy extract-waveforms params.py: text4.txt

I tried a fresh new recording session from Kilosort, and it actually works well. Could it be that if I use phy to modify the original kilosort data, that by itself causes the waveform problem?

zm711 commented 8 months ago

Phy by itself won't modify that part of the data. The waveforms need to be pulled from the raw data file or the whitened data file.

To be honest the waveform pictures you sent me don't look completely terrible, but maybe you wanted something cleaner?

The other issue is that it only samples a subset of waveforms at a time, so it may have picked a bad subset to represent your data. There's not too much that can be done about that. Each time you run Phy (I believe) it picks a slightly different subset of waveforms, so it may look better or worse sometimes for a dataset.
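
To illustrate the idea (this is only a toy sketch, not phy's actual selection code): drawing a random subset of spike indices gives a different set on each run unless the random seed is fixed. pick_spike_subset and its parameters are made-up names.

```python
import random


def pick_spike_subset(n_spikes, n_subset, seed=None):
    """Return a sorted random subset of spike indices."""
    rng = random.Random(seed)  # seed=None draws a different subset each run
    n_subset = min(n_subset, n_spikes)
    return sorted(rng.sample(range(n_spikes), n_subset))


if __name__ == "__main__":
    # Two unseeded draws from 18000 spikes will almost always differ.
    print(pick_spike_subset(18000, 5))
    print(pick_spike_subset(18000, 5))
```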

I'm a bit more worried that extracting the waveforms is failing. It is giving an OS error, which makes me think that Phy isn't writing the path as a raw string, but as a plain string instead. This is a known issue on Windows due to spurious escaping. The writing happens in phylib, which is called from inside phy, but that part of the project hasn't been updated for a while. So if you want to use extract-waveforms you would have to name your files very carefully to prevent escaping.

neuronzoo commented 7 months ago

Hey,

  1. I really don't know - For the same unit, on the first time I ran Phy the waveforms were spike-like with very low noise, and then after doing some curation and reopening (>5 times) I always got low signal-to-noise, variable spikes. I do think there is something going on there..
  2. How would you recommend to name the files to prevent escaping?

zm711 commented 7 months ago

@neuronzoo,

> I really don't know - For the same unit, on the first time I ran Phy the waveforms were spike-like with very low noise, and then after doing some curation and reopening (>5 times) I always got low signal-to-noise, variable spikes. I do think there is something going on there..

For 1), yeah, sometimes that happens. If things are weird, I restart my computer. If they are super weird, I clear the phy caches, restart the computer and see if that helps. But the waveforms themselves should not be overwritten, since they are just pulled from the raw file. If you merged a noise cluster with a "good" cluster, though, then the next time you open it phy could pull the noise waveforms for that unit, since it is supposed to pull waveforms randomly. What type of curation have you been doing (i.e. a lot of merging of clusters)?
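
For reference, a minimal sketch of what I mean by clearing the cache. It assumes phy keeps its per-dataset cache in a .phy folder next to params.py, so please check that this is where it lives on your machine before deleting anything. clear_phy_cache is a hypothetical helper and it defaults to a dry run.

```python
import shutil
from pathlib import Path


def clear_phy_cache(data_dir, dry_run=True):
    """Remove the `.phy` folder inside data_dir; return its path, or None if absent."""
    cache_dir = Path(data_dir) / ".phy"  # assumed cache location
    if not cache_dir.is_dir():
        return None
    if not dry_run:
        shutil.rmtree(cache_dir)
    return cache_dir
```

Calling it once with the default dry_run=True shows which folder would be removed; passing dry_run=False actually deletes it.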

> How would you recommend to name the files to prevent escaping?

2). The easiest solution is to feed a raw string into Python, so r'd:\...\', but unfortunately I don't believe this is an option on the command line. (Feel free to try it though!) Otherwise, you would just need to google escape sequences in Python (like \t means tab, \n means newline, etc.).
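
To make the escaping problem concrete, here is a small sketch with a made-up path. In a plain string literal, \n and \t are turned into a newline and a tab, whereas a raw string or forward slashes keep the path intact.

```python
# Hypothetical path, chosen because it contains \n and \t.
plain = "D:\new\temp.dat"     # \n -> newline, \t -> tab
raw = r"D:\new\temp.dat"      # backslashes kept literally
forward = "D:/new/temp.dat"   # forward slashes need no escaping

print(repr(plain))
print(repr(raw))
print(len(plain), len(raw))
```

The plain version is two characters shorter because each escape sequence collapses into a single character, so it no longer names the same file.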

The other potential problem that just hit me is that Python has a limit on file name length of around ~100 characters or so. Your file path is ~138 characters. Could you try to shorten it by making it a little more shallow? An easy test would be something like:

D:/your name/Analysis/Record Node 101/experiment1/recording1/continuous/npx

This will allow us to test whether you are just reaching the limit on file name length in Python. (It's happened to me before and it's a pain to debug, because the errors are often OS errors that give no indication that the path length is the problem.)
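
A quick way to check is to count the characters of the full output path. A sketch follows; output_path_length and is_too_long are hypothetical helpers, and the threshold is left as a parameter because the exact limit depends on the system (260 is the classic Windows MAX_PATH value, used here only as a default assumption).

```python
import ntpath  # Windows path rules, importable on any OS


def output_path_length(folder, filename="_phy_spikes_subset.waveforms.npy"):
    """Return the number of characters in the full Windows-style output path."""
    return len(ntpath.join(folder, filename))


def is_too_long(folder, limit=260):
    """Return True if the output path reaches the given character limit."""
    return output_path_length(folder) >= limit
```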

himahuja commented 4 months ago

I have the same workflow, and the same error. But sadly the character limit is not the issue here, since my file location is <100 characters long.

File "C:\Users\ramirezlab\anaconda3\envs\phy2\Lib\site-packages\phylib\io\traces.py", line 559, in __init__
    self.fp = open(path, 'wb')
              ^^^^^^^^^^^^^^^^
OSError: [Errno 22] Invalid argument: 'C:\\tmp\\kilosort3\\sorter_output\\_phy_spikes_subset.waveforms.npy'

zm711 commented 4 months ago

Why do you have .waveforms.npy? I'm wondering if the double . is causing your problem, @himahuja.
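
One way to narrow this down outside of phy would be to try creating a file with a double-dot name directly in the output folder and look at the error, if any. A minimal sketch; try_create and the test file name are hypothetical, and the helper refuses to touch a file that already exists.

```python
import os


def try_create(folder, filename="_write_test.waveforms.npy"):
    """Try to create and remove an empty file; return None or the OSError raised."""
    path = os.path.join(folder, filename)
    if os.path.exists(path):
        raise FileExistsError(path)  # never overwrite an existing file
    try:
        with open(path, "wb"):
            pass
        os.remove(path)
    except OSError as exc:
        return exc
    return None
```

If this returns None for the sorter output folder, the double dot in the name is probably not the culprit and the problem lies elsewhere.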

zm711 commented 1 month ago

I'll close since the original issue was fixed. Feel free to open new issues if something comes up.