int-brain-lab / iblapps

pyqt5 dependent applications for IBL sessions
MIT License

ValueError: too many values to unpack (expected 2) #76

Closed Virginia9733 closed 1 year ago

Virginia9733 commented 1 year ago

Hi there, I tried to run extract_data(ks_path, ephys_path, out_path). However, I encountered this error:

ValueError: too many values to unpack (expected 2)

Screenshot 2022-11-25 195310

I have already run git pull to get the latest version.

I used the output from the SpikeGLX folder after concatenating two sessions.

Screenshot 2022-11-25 195436

Many thanks!

GaelleChapuis commented 1 year ago

Hello, it seems like there is an issue with your meta file - could you show us what is inside your meta file?

Here is an example of meta file where the alignment GUI works: https://ibl.flatironinstitute.org/public/steinmetzlab/Subjects/NR_0027/2022-08-23/001/raw_ephys_data/probe00/_spikeglx_ephysData_g0_t0.imec0.ap.185fa46a-b589-4303-9805-b9ea35c08fb7.meta
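For readers who want to inspect a meta file before sharing it, here is a minimal sketch that parses the key=value lines of a SpikeGLX-style .meta file using only the standard library. `parse_meta_text` is a hypothetical helper written for this illustration, not part of ibllib or ibl-neuropixel (the library reader used later in this thread is `spikeglx.read_meta_data`), and the sample text is made up, reusing only the numbers quoted in this thread.

```python
def parse_meta_text(text):
    """Parse key=value lines into a dict of strings (hypothetical helper)."""
    meta = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        # Split on the first "=" only, so a value containing "=" stays in one piece
        key, value = line.split("=", 1)
        meta[key.lstrip("~")] = value  # some keys are written with a leading "~"
    return meta


# Made-up sample content for illustration only
sample = """nSavedChans=385
imSampRate=30000
fileSizeBytes=117844292720
"""

meta = parse_meta_text(sample)
print(meta["nSavedChans"], meta["imSampRate"], meta["fileSizeBytes"])
```

Splitting with a maximum of one split avoids getting more than two parts from a single line. Whether that is what triggered the original ValueError is not confirmed in this thread.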

Virginia9733 commented 1 year ago

Thank you for your reply! I concatenated two sessions together using CatGT. The ap.meta is this: Screenshot 2022-11-26 113027

and the lf.meta is this: Screenshot 2022-11-26 113103

Many thanks!

mayofaulkner commented 1 year ago

Hi, would you mind attaching the file itself so we can load it on our end and adapt our code to work with it? Thank you

Virginia9733 commented 1 year ago

Hi there, GitHub does not support uploading .meta files. Would you mind downloading them from this Google Drive URL? https://drive.google.com/drive/folders/1IP3SGU6H32oE7BCGKiTfgYAfF2bqvNPc?usp=share_link I have uploaded the .meta files for both ap and lf. Many thanks!

oliche commented 1 year ago

Hi,

Thanks for sharing the file ! The metadata contains an entry we had never seen, so I've fixed the interpreter. Please update this module in the terminal, with your environment activated:

pip install -U ibl-neuropixel

You should have version 4.0.1
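A small sketch for confirming which version ended up installed in the active environment, using only the standard library (Python 3.8+). `installed_version` is a hypothetical helper name; the distribution name is the one from the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if the distribution is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


print("ibl-neuropixel:", installed_version("ibl-neuropixel"))
```

If this prints None, the package is not installed in the environment that is currently running Python, which usually means the wrong environment is activated.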

I also took the liberty to include your meta file in our unit tests here: https://github.com/int-brain-lab/ibl-neuropixel/blob/06710d411d0ce6f06a9ae39f2532cc352e9a2d83/src/tests/unit/cpu/fixtures/sample3B_catgt.ap.meta

Let me know if this is not OK, and I'll remove it and create another one !

Virginia9733 commented 1 year ago

Hi there,

Thank you so much for your reply! I retried after updating the ibl-neuropixel module in the activated ibl_env; however, I encountered a new error message:

AssertionError: Inconsistent number of channels between the params file and the binary dat file

Screenshot 2022-11-30 021608

The log with the details of the error is attached: Error.txt

Virginia9733 commented 1 year ago

As for the unit test, since the data is owned by someone else, I am afraid it might not be appropriate to make it public. I am sorry!

oliche commented 1 year ago

It's fine, I'll create a blank one.

It seems your binary file doesn't match the meta-data file channel definition. This can happen if the binary file is corrupt or not entirely copied.

Do you know the exact number of bytes of the binary file ? Has the sync channel been removed ?

Virginia9733 commented 1 year ago

Thank you for your reply. The .bin for ap is 117,844,292,720 bytes and the .bin for lf is 9,820,358,240 bytes. We did not modify the .bin or the .meta files; the only processing was using CatGT to concatenate the two sessions.

Screenshot 2022-12-01 003536

Screenshot 2022-12-01 003517

Screenshot 2022-12-01 003413

oliche commented 1 year ago

OK, so it checks out: 117_844_292_720 / 385 / 2 is an integer (385 channels, 2 bytes per sample) and corresponds to 5101.484533333333 seconds at 30 kHz, so the meta-data file and the binary are consistent.
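The check above can be reproduced with a few lines of Python. This is a sketch, not code from ibllib: `samples_and_duration` is a hypothetical helper, the 2 bytes per sample corresponds to the division by 2 above, and the 2500 Hz rate used for the lf file is an assumption that is not stated in this thread.

```python
def samples_and_duration(n_bytes, n_channels, fs, bytes_per_sample=2):
    """Return (n_samples, leftover_bytes, duration_s) for a flat binary file."""
    frame_bytes = n_channels * bytes_per_sample  # bytes for one time sample
    n_samples, leftover = divmod(n_bytes, frame_bytes)
    return n_samples, leftover, n_samples / fs


# ap file: size reported in this thread, 385 channels, 30 kHz
ap_samples, ap_leftover, ap_duration = samples_and_duration(117_844_292_720, 385, 30000)
print(ap_samples, ap_leftover, ap_duration)

# lf file: size reported in this thread; 2500 Hz is an assumed sampling rate
lf_samples, lf_leftover, lf_duration = samples_and_duration(9_820_358_240, 385, 2500)
print(lf_samples, lf_leftover, lf_duration)

# With 384 channels the ap file size no longer divides evenly
print(samples_and_duration(117_844_292_720, 384, 30000)[1])
```

A leftover of 0 means the file size is a whole number of 385-channel samples. A non-zero leftover, as with 384 channels here, indicates that the channel count does not match the file.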

It seems it is the phy reader that has an issue, especially regarding the number of channels.

How many channels are in the kilosortChanMap.mat ?

It is hard to debug this without the dataset, I think I can fake the binary file, but I would need the spike sorting output.

Tomorrow is a bank holiday where I am. Sorry for the delay.

Virginia9733 commented 1 year ago

Thank you for your reply! I am not sure how to work out 117_844_292_720 / 385 / 2... As for "How many channels are in the kilosortChanMap.mat ?": I think it is 257. I have enclosed the kilosortChanMap.mat:

Screenshot 2022-12-05 221943

https://drive.google.com/file/d/1V_IQZ_PEgQP5SUorcfXwnid-yHanNid_/view?usp=share_link

DenisPolygalov commented 1 year ago

Dear @oliche, I am the author of the code used to generate these bin/meta files.

It seems it is the phy reader that has an issue, especially regarding the number of channels.

The indirect evidence that phylib is not the issue here is the fact that these files can be opened with phy GUI here on our side.

How many channels are in the kilosortChanMap.mat ?

This is also not relevant as far as I can see. The call that fails is this one: https://github.com/int-brain-lab/ibllib/blob/00e91a2df93e46235fe689d4dc69682bf2bcaea7/ibllib/ephys/ephysqc.py#L445

I found that the call will succeed with our data if n_channels_dat is set to 385. The files we use contain 385 channels (from 0 to 384); the last channel is SYNC.

Are you sure that this is correct? https://github.com/int-brain-lab/ibllib/blob/00e91a2df93e46235fe689d4dc69682bf2bcaea7/ibllib/ephys/ephysqc.py#L440

Here is a code snippet that demonstrates the issue:

from pathlib import Path
# from atlaselectrophysiology.extract_files import extract_data
import spikeglx
from phylib.io import model

NCH_WAVEFORMS = 32  # number of channels to be saved in templates.waveforms and channels.waveforms

if __name__ == '__main__':
    ks_path    = Path(r'F:\Neuropixel10_SteveDisk2_CatGT\MUT\181Ms23C\Day08_2021-10-13\supercat_sleep1_g0\sleep1_g0_imec0\ks3_1a1fd3a')
    ephys_path = Path(r'F:\Neuropixel10_SteveDisk2_CatGT\MUT\181Ms23C\Day08_2021-10-13\supercat_sleep1_g0\sleep1_g0_imec0')
    out_path   = Path(r'.')
    # extract_data(ks_path, ephys_path, out_path)

    bin_file = next(ephys_path.rglob('*.ap.*bin'), None)
    meta_file = next(ephys_path.rglob('*.ap.meta'), None)
    meta = spikeglx.read_meta_data(meta_file)

    print(bin_file)
    print(meta_file)

    fs = spikeglx._get_fs_from_meta(meta)
    print("fs =", fs)

    nchannels_from_meta = spikeglx._get_nchannels_from_meta(meta)
    print("nchannels_from_meta =", nchannels_from_meta)

    sync_trace_indices_from_meta = spikeglx._get_sync_trace_indices_from_meta(meta)
    print("sync_trace_indices_from_meta =", sync_trace_indices_from_meta)

    nch = (spikeglx._get_nchannels_from_meta(meta) - len(spikeglx._get_sync_trace_indices_from_meta(meta)))
    print("nch =", nch)

    # this call will succeed (n_channels_dat is equal to 385)
    m = model.TemplateModel(dir_path=ks_path,
                            dat_path=bin_file,  # this assumes the raw data is in the same folder
                            sample_rate=fs,
                            n_channels_dat=nchannels_from_meta,
                            n_closest_channels=NCH_WAVEFORMS)
    m.describe()

    # this call will fail (n_channels_dat set to nch which is calculated above as equal to 384)
    print("The next call will fail:")
    m_doomed = model.TemplateModel(dir_path=ks_path,
                                   dat_path=bin_file,  # this assumes the raw data is in the same folder
                                   sample_rate=fs,
                                   n_channels_dat=nch,
                                   n_closest_channels=NCH_WAVEFORMS)

DenisPolygalov commented 1 year ago

Here is console output from the snippet above:

Capture

mayofaulkner commented 1 year ago

Hi, if you pull the latest version of phylib (v2.4.3), we have relaxed this assertion, so it should now be possible to extract the files necessary for the alignment GUI.

DenisPolygalov commented 1 year ago

After updating phylib to 2.4.3 the following code:

from pathlib import Path
from atlaselectrophysiology.extract_files import extract_data
ks_path    = Path(r'F:\Neuropixel10_SteveDisk2_CatGT\MUT\181Ms23C\Day08_2021-10-13\supercat_sleep1_g0\sleep1_g0_imec0\ks3_1a1fd3a')
ephys_path = Path(r'F:\Neuropixel10_SteveDisk2_CatGT\MUT\181Ms23C\Day08_2021-10-13\supercat_sleep1_g0\sleep1_g0_imec0')
out_path   = Path(r'.')
extract_data(ks_path, ephys_path, out_path)

gives me this:

Traceback (most recent call last):
  File "D:\repos\Neuropixels\analysis\int-brain-lab\ibltest1.py", line 16, in <module>
    extract_data(ks_path, ephys_path, out_path)
  File "d:\repos\neuropixels\analysis\int-brain-lab\iblapps\atlaselectrophysiology\extract_files.py", line 134, in extract_data
    ks2_to_alf(ks_path, ephys_path, out_path, bin_file=efile.ap,
  File "d:\repos\neuropixels\analysis\int-brain-lab\iblapps\atlaselectrophysiology\extract_files.py", line 124, in ks2_to_alf
    ephysqc.spike_sorting_metrics_ks2(ks_path, m, save=True, save_path=out_path)
  File "C:\opt\Miniconda3-x86_64\envs\iblenv\lib\site-packages\ibllib\ephys\ephysqc.py", line 410, in spike_sorting_metrics_ks2
    c, drift = spike_sorting_metrics(m.spike_times, m.spike_clusters, m.amplitudes, m.depths)
  File "C:\opt\Miniconda3-x86_64\envs\iblenv\lib\site-packages\brainbox\metrics\single_units.py", line 846, in spike_sorting_metrics
    df_units = quick_unit_metrics(
  File "C:\opt\Miniconda3-x86_64\envs\iblenv\lib\site-packages\brainbox\metrics\single_units.py", line 985, in quick_unit_metrics
    r.slidingRP_viol[srp['cidx']] = srp['value']
IndexError: index 371 is out of bounds for axis 0 with size 371

Attempting to acquire more details about the problem:

from pathlib import Path
import brainbox as bb
from ibllib.ephys.ephysqc import phy_model_from_ks2_path
from slidingRP import metrics

if __name__ == '__main__':
    ks_path    = Path(r'F:\Neuropixel10_SteveDisk2_CatGT\MUT\181Ms23C\Day08_2021-10-13\supercat_sleep1_g0\sleep1_g0_imec0\ks3_1a1fd3a')
    ephys_path = Path(r'F:\Neuropixel10_SteveDisk2_CatGT\MUT\181Ms23C\Day08_2021-10-13\supercat_sleep1_g0\sleep1_g0_imec0')
    out_path   = Path(r'.')

    m = phy_model_from_ks2_path(ks_path, ephys_path)
    cluster_ids = m.spike_clusters
    ts = m.spike_times
    amps = m.amplitudes
    depths = m.depths

    print("cluster_ids:", cluster_ids.shape, cluster_ids.dtype)
    print("ts:", ts.shape, ts.dtype)
    print("amps:", amps.shape, amps.dtype)
    print("depths:", depths)

    srp = metrics.slidingRP_all(spikeTimes=ts, spikeClusters=cluster_ids, **{'sampleRate': 30000, 'binSizeCorr': 1 / 30000})
    print("len(cidx)=", len(srp['cidx']))
    print("len(value)=", len(srp['value']))

Output of the code above:

cluster_ids: (17506987,) int32
ts: (17506987,) float64
amps: (17506987,) float64
depths: None
len(cidx)= 371
len(value)= 371

Note that depths is None because we use Kilosort3, and the output of Kilosort3 does not contain the features required to calculate the depths. These features were present in Kilosort2 but were explicitly removed from the output of Kilosort3 by its author.

Any ideas what is going on here?

oliche commented 1 year ago

Oh yes I remember this.

Phylib then has functions to compute the depths from individual spike waveforms or their principal components. Does KS3 output any of them ?

For reference the phylib function is here: https://github.com/cortex-lab/phylib/blob/efbf8e6e31755321b7add7199fa98a7241dc7bab/phylib/io/model.py#L1063

DenisPolygalov commented 1 year ago

@oliche The phylib method you mention cannot compute the depths because self.sparse_features is None, so the method returns None here:

https://github.com/cortex-lab/phylib/blob/efbf8e6e31755321b7add7199fa98a7241dc7bab/phylib/io/model.py#L1072

self.sparse_features is None because _load_features() cannot find the pc_features.npy file:

https://github.com/cortex-lab/phylib/blob/efbf8e6e31755321b7add7199fa98a7241dc7bab/phylib/io/model.py#L762

pc_features.npy is absent because the file was removed from the output of Kilosort3, for the following reason:

I know, it's because the clustering is done very differently now. The feature view is going to come back in an upgrade, soon ish.

https://github.com/MouseLand/Kilosort/issues/317#issuecomment-771826325

mayofaulkner commented 1 year ago

@DenisPolygalov in the meantime, until the features are available in the new KS3 upgrade, I have fixed the code in the conversion script to load the computed spike depths (which are simply derived from the depths of the clusters) before computing the metrics. (https://github.com/int-brain-lab/iblapps/blob/master/atlaselectrophysiology/extract_files.py#L123-L129)

Let me know if this fixes the problem in computing the metrics that you are seeing. The individual spike depths won't be as accurate as with the features but you can still get an estimate of their location.
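To illustrate the idea of deriving spike depths from cluster depths, here is a minimal sketch with made-up toy values. It is not the code in extract_files.py; the variable names and numbers are hypothetical, and it only shows the lookup: every spike inherits the depth of the cluster it is assigned to.

```python
# Hypothetical toy data: one depth (in um) per cluster id
cluster_depths = {0: 100.0, 3: 250.0, 7: 400.0}

# Hypothetical cluster assignment for five spikes
spike_clusters = [0, 7, 7, 3, 0]

# Each spike takes the depth of its cluster
spike_depths = [cluster_depths[c] for c in spike_clusters]
print(spike_depths)
```

With NumPy arrays and contiguous cluster ids, the same lookup is usually written as array indexing (cluster_depths[spike_clusters]). Using a mapping as above also works when cluster ids have gaps.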

DenisPolygalov commented 1 year ago

@mayofaulkner thank you, but now we have a new problem (or an extension of the old one):

Capture1

The file in question does exist and can be opened for reading:

Capture2

The file was created by my code so that this data can be opened with the phy GUI. Here is the snippet of my code that was used to generate this file:

...
from phylib.io.model import load_model
...
    s_spk_waveforms = os.path.join(s_ks3_out_dir, '_phy_spikes_subset.waveforms.npy')
    if os.path.isfile(s_spk_waveforms):
        print("WARNING: skip generation of _phy_spikes_subset.*.npy files!")
    else:
        print("INFO: generating _phy_spikes_subset.*.npy files...")
        oc_phy_model = load_model(s_params_py)
        oc_phy_model.save_spikes_subset_waveforms(max_n_spikes_per_template=500)
        oc_phy_model.close()

    if not os.path.isfile(s_spk_waveforms):
        print("ERROR: file not found:", s_spk_waveforms)
        sys.exit(-1)

DenisPolygalov commented 1 year ago

Here is what happens if I move these previously generated _phy_spikes_subset.*.npy files to a different temporary location that is invisible to the iblapps Python code:

Capture

mayofaulkner commented 1 year ago

Would you be able to give me access to the spikes. and clusters. files that you have so I can take a look at the problem? One confirmation question, is this after you have done manual curation in phy?

mayofaulkner commented 1 year ago

Hi, update, I managed to replicate the error on my side, so don't worry about the data. We should have the changes released to ibllib by early next week but if you need it sooner you can install ibllib like so and rerun the metrics computation

pip install git+https://github.com/int-brain-lab/ibllib.git@develop

mayofaulkner commented 1 year ago

Update, this has now been released so you can install ibllib normally

pip install ibllib --upgrade

DenisPolygalov commented 1 year ago

Thanks, I can confirm that extract_data() succeeds with our data, but only if ks_path does not contain the file _phy_spikes_subset.waveforms.npy. The file can be generated either by the phy GUI or by a previous execution of the extract_data() function.

The error in such cases is OSError: [Errno 22] Invalid argument: thrown from phylib\io\traces.py:559

Also, _phy_spikes_subset.*.npy files are generated in ks_path AND copied into out_path (few Gbs of stuff). Not sure if this is a bug or a feature, just reporting.
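As a stopgap, the workaround described above (moving the _phy_spikes_subset.*.npy files out of ks_path before calling extract_data()) could be scripted roughly as follows. This is a sketch using only the standard library: `stash_phy_subset_files` is a hypothetical helper, not part of iblapps or phylib, and the demonstration runs on empty placeholder files in a temporary directory rather than on real data.

```python
import shutil
import tempfile
from pathlib import Path


def stash_phy_subset_files(ks_path, stash_dir):
    """Move _phy_spikes_subset.*.npy out of ks_path and return the new paths."""
    ks_path, stash_dir = Path(ks_path), Path(stash_dir)
    stash_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for f in sorted(ks_path.glob("_phy_spikes_subset.*.npy")):
        target = stash_dir / f.name
        shutil.move(str(f), str(target))
        moved.append(target)
    return moved


# Demonstration on placeholder files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    ks = Path(tmp) / "ks"
    ks.mkdir()
    (ks / "_phy_spikes_subset.waveforms.npy").touch()
    (ks / "spike_times.npy").touch()

    moved = stash_phy_subset_files(ks, Path(tmp) / "stash")
    moved_names = [p.name for p in moved]
    remaining = sorted(p.name for p in ks.iterdir())
    print(moved_names, remaining)
```

On real data, ks_path would be the Kilosort output folder and stash_dir any folder outside both ks_path and out_path. The files can be moved back afterwards if the phy GUI needs them.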

Please feel free to close this issue.

Virginia9733 commented 1 year ago

Thanks, I appreciate all the discussion and debugging!