Closed Virginia9733 closed 1 year ago
Hello, it seems like there is an issue with your meta file - could you show us what is inside your meta file?
Here is an example of meta file where the alignment GUI works: https://ibl.flatironinstitute.org/public/steinmetzlab/Subjects/NR_0027/2022-08-23/001/raw_ephys_data/probe00/_spikeglx_ephysData_g0_t0.imec0.ap.185fa46a-b589-4303-9805-b9ea35c08fb7.meta
Thank you for your reply! I concatenated two sessions together using CatGT. The ap.meta is this:
and the lf.meta is this:
Many thanks!
Hi, would you mind attaching the file itself so we can load it in on our end and adapt our code to work. Thank you
Hi there, GitHub does not support uploading .meta files. Would you mind downloading them from this Google Drive URL? https://drive.google.com/drive/folders/1IP3SGU6H32oE7BCGKiTfgYAfF2bqvNPc?usp=share_link I have uploaded both .meta files, for ap and lf. Many thanks!
Hi,
Thanks for sharing the file ! The metadata contains an entry we had never seen, so I've fixed the interpreter. Please update this module in the terminal, with your environment activated:
pip install -U ibl-neuropixel
You should have version 4.0.1
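If you want to confirm which version you ended up with, a minimal sketch (assuming the distribution name `ibl-neuropixel`, as used in the pip command above) is:

```python
# Sketch: query the installed version of a pip distribution without
# importing the library itself. 'ibl-neuropixel' is the distribution
# name used in the pip command above.
from importlib.metadata import version, PackageNotFoundError

try:
    print(version('ibl-neuropixel'))  # should report 4.0.1 or later
except PackageNotFoundError:
    print('ibl-neuropixel is not installed')
```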
I also took the liberty to include your meta file in our unit tests here: https://github.com/int-brain-lab/ibl-neuropixel/blob/06710d411d0ce6f06a9ae39f2532cc352e9a2d83/src/tests/unit/cpu/fixtures/sample3B_catgt.ap.meta
Let me know if this is not OK, and I'll remove it and create another one !
Hi there,
Thank you so much for your reply! I have re-tried after updating the ibl-neuropixel module in the activated ibl_env; however, I encountered a new error message saying:
AssertionError: Inconsistent number of channels between the params file and the binary dat file
Enclosed is the log reporting the details of the error: Error.txt
As for the unit test, since the data is owned by someone else, I am afraid it might not be appropriate to make it public, I am sorry!
It's fine, I'll create a blank one.
It seems your binary file doesn't match the meta-data file channel definition. This can happen if the binary file is corrupt or not entirely copied.
Do you know the exact number of bytes of the binary file ? Has the sync channel been removed ?
Thank you for your reply. The .bin for ap is 117,844,292,720 bytes; the .bin for lf is 9,820,358,240 bytes. We didn't make any modifications to the .bin or the .meta. The only thing we did was use CatGT to concatenate two sessions together.
Ok, so it checks out: 117_844_292_720 / 385 / 2 is an integer (153_044_536 samples), which corresponds to 5101.484533333333 seconds at 30 kHz, so the meta-data file and the binary are consistent.
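The consistency check above can be sketched in a few lines; the byte count, channel count, and sampling rate are the values reported earlier in this thread:

```python
# Sketch: verify that a SpikeGLX .ap.bin file size is consistent with the
# channel count from the .meta file. Values are those reported in this thread.
n_bytes = 117_844_292_720   # size of the .ap.bin file on disk
n_channels = 385            # 384 AP channels + 1 SYNC channel
bytes_per_sample = 2        # int16 samples
fs = 30_000                 # AP-band sampling rate (Hz)

n_samples, remainder = divmod(n_bytes, n_channels * bytes_per_sample)
assert remainder == 0, "file size is not a whole number of samples"
duration_s = n_samples / fs
print(n_samples)    # 153044536 samples
print(duration_s)   # ~5101.48 seconds
```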
It seems it is the phy reader that has an issue, especially regarding the number of channels.
How many channels are in the kilosortChanMap.mat ?
It is hard to debug this without the dataset, I think I can fake the binary file, but I would need the spike sorting output.
Tomorrow is a bank holiday where I am. Sorry for the delay.
Thank you for your reply! I am not sure where I could find 117_844_292_720 / 385 / 2... As for how many channels are in the kilosortChanMap.mat: I think it is 257. Enclosed is the kilosortChanMap.mat:
https://drive.google.com/file/d/1V_IQZ_PEgQP5SUorcfXwnid-yHanNid_/view?usp=share_link
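For anyone wanting to count the channels themselves, a minimal sketch using scipy is below. It assumes the standard Kilosort channel-map convention (a `chanMap` field inside the .mat file); here a synthetic file with 257 entries stands in for the real one:

```python
# Sketch: count the channels stored in a Kilosort chanMap.mat.
# The field name 'chanMap' follows the standard Kilosort convention;
# adjust if your file differs.
import numpy as np
from scipy.io import loadmat, savemat

# For illustration, write a synthetic channel map with 257 entries
# (the count reported in this thread) instead of using the real file.
savemat('kilosortChanMap.mat', {'chanMap': np.arange(1, 258)})

mat = loadmat('kilosortChanMap.mat')
n_channels = mat['chanMap'].size
print('channels in chanMap:', n_channels)  # 257
```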
Dear @oliche, I am the author of the code used for the generation of these bin/meta files.
It seems it is the phy reader that has an issue, especially regarding the number of channels.
The indirect evidence that phylib is not the issue here is the fact that these files can be opened with the phy GUI here on our side.
How many channels are in the kilosortChanMap.mat ?
This is also not relevant as far as I can see. The call that fails is this one: https://github.com/int-brain-lab/ibllib/blob/00e91a2df93e46235fe689d4dc69682bf2bcaea7/ibllib/ephys/ephysqc.py#L445
I found that the call will succeed with our data if n_channels_dat is set to 385. The files we use contain 385 channels (from 0 to 384); the last channel is SYNC.
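The channel layout just described can be sketched as follows (illustrative array shapes, not the actual reader code):

```python
# Sketch of the channel layout described above: 385 saved channels,
# where indices 0..383 are AP channels and index 384 is the SYNC line.
import numpy as np

n_channels = 385
raw = np.zeros((1000, n_channels), dtype=np.int16)  # (samples, channels) stand-in
ap = raw[:, :384]      # neural channels
sync = raw[:, 384]     # digital sync channel
print(ap.shape, sync.shape)  # (1000, 384) (1000,)
```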
Are you sure that this is correct? https://github.com/int-brain-lab/ibllib/blob/00e91a2df93e46235fe689d4dc69682bf2bcaea7/ibllib/ephys/ephysqc.py#L440
Here is a code snippet that demonstrates the issue:
from pathlib import Path
# from atlaselectrophysiology.extract_files import extract_data
import spikeglx
from phylib.io import model

NCH_WAVEFORMS = 32  # number of channels to be saved in templates.waveforms and channels.waveforms

if __name__ == '__main__':
    ks_path = Path(r'F:\Neuropixel10_SteveDisk2_CatGT\MUT\181Ms23C\Day08_2021-10-13\supercat_sleep1_g0\sleep1_g0_imec0\ks3_1a1fd3a')
    ephys_path = Path(r'F:\Neuropixel10_SteveDisk2_CatGT\MUT\181Ms23C\Day08_2021-10-13\supercat_sleep1_g0\sleep1_g0_imec0')
    out_path = Path(r'.')
    # extract_data(ks_path, ephys_path, out_path)

    bin_file = next(ephys_path.rglob('*.ap.*bin'), None)
    meta_file = next(ephys_path.rglob('*.ap.meta'), None)
    meta = spikeglx.read_meta_data(meta_file)
    print(bin_file)
    print(meta_file)

    fs = spikeglx._get_fs_from_meta(meta)
    print("fs =", fs)
    nchannels_from_meta = spikeglx._get_nchannels_from_meta(meta)
    print("nchannels_from_meta =", nchannels_from_meta)
    sync_trace_indices_from_meta = spikeglx._get_sync_trace_indices_from_meta(meta)
    print("sync_trace_indices_from_meta =", sync_trace_indices_from_meta)
    nch = nchannels_from_meta - len(sync_trace_indices_from_meta)
    print("nch =", nch)

    # this call will succeed (n_channels_dat is equal to 385)
    m = model.TemplateModel(dir_path=ks_path,
                            dat_path=bin_file,  # this assumes the raw data is in the same folder
                            sample_rate=fs,
                            n_channels_dat=nchannels_from_meta,
                            n_closest_channels=NCH_WAVEFORMS)
    m.describe()

    # this call will fail (n_channels_dat set to nch, which is calculated above as equal to 384)
    print("The next call will fail:")
    m_doomed = model.TemplateModel(dir_path=ks_path,
                                   dat_path=bin_file,  # this assumes the raw data is in the same folder
                                   sample_rate=fs,
                                   n_channels_dat=nch,
                                   n_closest_channels=NCH_WAVEFORMS)
Here is the console output from the snippet above:
Hi, if you pull the latest version of phylib (v2.4.3), we have relaxed this assertion, so it should now be possible to extract the files necessary for the alignment GUI.
After updating phylib to 2.4.3 the following code:
from pathlib import Path
from atlaselectrophysiology.extract_files import extract_data
ks_path = Path(r'F:\Neuropixel10_SteveDisk2_CatGT\MUT\181Ms23C\Day08_2021-10-13\supercat_sleep1_g0\sleep1_g0_imec0\ks3_1a1fd3a')
ephys_path = Path(r'F:\Neuropixel10_SteveDisk2_CatGT\MUT\181Ms23C\Day08_2021-10-13\supercat_sleep1_g0\sleep1_g0_imec0')
out_path = Path(r'.')
extract_data(ks_path, ephys_path, out_path)
gives me this:
Traceback (most recent call last):
  File "D:\repos\Neuropixels\analysis\int-brain-lab\ibltest1.py", line 16, in <module>
    extract_data(ks_path, ephys_path, out_path)
  File "d:\repos\neuropixels\analysis\int-brain-lab\iblapps\atlaselectrophysiology\extract_files.py", line 134, in extract_data
    ks2_to_alf(ks_path, ephys_path, out_path, bin_file=efile.ap,
  File "d:\repos\neuropixels\analysis\int-brain-lab\iblapps\atlaselectrophysiology\extract_files.py", line 124, in ks2_to_alf
    ephysqc.spike_sorting_metrics_ks2(ks_path, m, save=True, save_path=out_path)
  File "C:\opt\Miniconda3-x86_64\envs\iblenv\lib\site-packages\ibllib\ephys\ephysqc.py", line 410, in spike_sorting_metrics_ks2
    c, drift = spike_sorting_metrics(m.spike_times, m.spike_clusters, m.amplitudes, m.depths)
  File "C:\opt\Miniconda3-x86_64\envs\iblenv\lib\site-packages\brainbox\metrics\single_units.py", line 846, in spike_sorting_metrics
    df_units = quick_unit_metrics(
  File "C:\opt\Miniconda3-x86_64\envs\iblenv\lib\site-packages\brainbox\metrics\single_units.py", line 985, in quick_unit_metrics
    r.slidingRP_viol[srp['cidx']] = srp['value']
IndexError: index 371 is out of bounds for axis 0 with size 371
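As an aside, the failure mode behind this kind of IndexError can be reproduced in isolation: if the cluster ids used as indices are not a contiguous 0..N-1 range, indexing a per-unit array of length N with the raw ids overflows. The numbers below are illustrative, not taken from the actual dataset:

```python
# Sketch of the failure mode behind the IndexError above: a per-unit
# array of length 371 indexed with a cluster id equal to 371.
import numpy as np

n_units = 371
slidingRP_viol = np.zeros(n_units)
cidx = np.array([0, 5, 371])          # cluster ids, one of which == n_units
value = np.array([0.1, 0.2, 0.3])

try:
    slidingRP_viol[cidx] = value
except IndexError as e:
    print(e)  # index 371 is out of bounds for axis 0 with size 371
```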
Attempting to acquire more details about the problem:
from pathlib import Path
import brainbox as bb
from ibllib.ephys.ephysqc import phy_model_from_ks2_path
from slidingRP import metrics

if __name__ == '__main__':
    ks_path = Path(r'F:\Neuropixel10_SteveDisk2_CatGT\MUT\181Ms23C\Day08_2021-10-13\supercat_sleep1_g0\sleep1_g0_imec0\ks3_1a1fd3a')
    ephys_path = Path(r'F:\Neuropixel10_SteveDisk2_CatGT\MUT\181Ms23C\Day08_2021-10-13\supercat_sleep1_g0\sleep1_g0_imec0')
    out_path = Path(r'.')

    m = phy_model_from_ks2_path(ks_path, ephys_path)
    cluster_ids = m.spike_clusters
    ts = m.spike_times
    amps = m.amplitudes
    depths = m.depths
    print("cluster_ids:", cluster_ids.shape, cluster_ids.dtype)
    print("ts:", ts.shape, ts.dtype)
    print("amps:", amps.shape, amps.dtype)
    print("depths:", depths)

    srp = metrics.slidingRP_all(spikeTimes=ts, spikeClusters=cluster_ids,
                                **{'sampleRate': 30000, 'binSizeCorr': 1 / 30000})
    print("len(cidx)=", len(srp['cidx']))
    print("len(value)=", len(srp['value']))
Output of the code above:
cluster_ids: (17506987,) int32
ts: (17506987,) float64
amps: (17506987,) float64
depths: None
len(cidx)= 371
len(value)= 371
Note that depths == None because we use Kilosort3, and the output of Kilosort3 does not contain the features required to calculate the depth. These features were present in Kilosort2 but were explicitly removed from the output of Kilosort3 by its author.
Any ideas what is going on here?
Oh yes I remember this.
Phylib then has functions to compute the depths from individual spike waveforms or their principal components. Does KS3 output any of them ?
For reference the phylib function is here: https://github.com/cortex-lab/phylib/blob/efbf8e6e31755321b7add7199fa98a7241dc7bab/phylib/io/model.py#L1063
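For readers following along, the general idea behind such a depth estimate (in the spirit of what phylib computes from sparse features, though not its actual implementation) is an amplitude-weighted average of channel y-positions; the numbers below are illustrative:

```python
# Sketch: estimate a spike's depth as a weighted average of channel
# y-positions, weighting each channel by its squared feature amplitude.
# Illustrative values, not the phylib implementation.
import numpy as np

ycoords = np.array([0., 20., 40., 60.])   # channel depths along the probe (um)
amps = np.array([0.1, 0.8, 0.9, 0.2])     # per-channel feature amplitude
w = amps ** 2
depth = np.sum(w * ycoords) / np.sum(w)
print(round(depth, 2))  # 31.73
```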
@oliche The phylib method you mention cannot compute the depth because self.sparse_features is None, so the method returns None here:
The reason why self.sparse_features is None is that _load_features() cannot find the pc_features.npy file:
The reason why pc_features.npy is absent is that the file was removed from the output of Kilosort3 because:
I know, it's because the clustering is done very differently now. The feature view is going to come back in an upgrade, soon ish.
https://github.com/MouseLand/Kilosort/issues/317#issuecomment-771826325
@DenisPolygalov in the meantime until the features are in the new KS3 upgrade I have fixed the code in the conversion script to load in the computed spike depths (which are just computed from the depth of the clusters) before computing the metrics. (https://github.com/int-brain-lab/iblapps/blob/master/atlaselectrophysiology/extract_files.py#L123-L129)
Let me know if this fixes the problem in computing the metrics that you are seeing. The individual spike depths won't be as accurate as with the features but you can still get an estimate of their location.
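The workaround described above amounts to broadcasting one depth per cluster out to every spike via the spike_clusters vector; a minimal sketch (illustrative arrays, not the actual iblapps code):

```python
# Sketch: assign each spike the depth of its cluster by indexing the
# per-cluster depths with the spike_clusters vector.
import numpy as np

cluster_depths = np.array([100., 250., 410.])   # one depth per cluster (um)
spike_clusters = np.array([0, 2, 1, 1, 0])      # cluster id for each spike
spike_depths = cluster_depths[spike_clusters]
print(spike_depths)  # [100. 410. 250. 250. 100.]
```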
@mayofaulkner thank you, but now we have a new problem (or an extension of the old one):
The file in question does exist and can be opened for reading:
The file was created by my code, in order to be able to open this data using the phy GUI. Here is the snippet of my code that was used to generate this file:
...
from phylib.io.model import load_model
...
s_spk_waveforms = os.path.join(s_ks3_out_dir, '_phy_spikes_subset.waveforms.npy')
if os.path.isfile(s_spk_waveforms):
    print("WARNING: skip generation of _phy_spikes_subset.*.npy files!")
else:
    print("INFO: generating _phy_spikes_subset.*.npy files...")
    oc_phy_model = load_model(s_params_py)
    oc_phy_model.save_spikes_subset_waveforms(max_n_spikes_per_template=500)
    oc_phy_model.close()
    if not os.path.isfile(s_spk_waveforms):
        print("ERROR: file not found:", s_spk_waveforms)
        sys.exit(-1)
Here is what happens if I move these previously generated _phy_spikes_subset.*.npy files into a different temporary location invisible to the iblapps Python code:
Would you be able to give me access to the spikes. and clusters. files that you have so I can take a look at the problem? One confirmation question, is this after you have done manual curation in phy?
Hi, update, I managed to replicate the error on my side, so don't worry about the data. We should have the changes released to ibllib by early next week but if you need it sooner you can install ibllib like so and rerun the metrics computation
pip install git+https://github.com/int-brain-lab/ibllib.git@develop
Update, this has now been released so you can install ibllib normally
pip install ibllib --upgrade
Thanks, I can confirm that extract_data() succeeds with our data, but only if ks_path does not contain the file _phy_spikes_subset.waveforms.npy. The file can be generated by either the phy GUI or a previous execution of the extract_data() function.
The error in such cases is OSError: [Errno 22] Invalid argument: thrown from phylib\io\traces.py:559.
Also, the _phy_spikes_subset.*.npy files are generated in ks_path AND copied into out_path (a few GBs of data). Not sure if this is a bug or a feature, just reporting.
Please feel free to close this issue.
Thanks, I appreciate all the discussion and debugging!
Hi there, I tried to run extract_data(ks_path, ephys_path, out_path). However, I encountered this error:
ValueError: too many values to unpack (expected 2)
I have already done a git pull to the latest version.
I used the output from the SpikeGLX folder after concatenating two sessions.
Many thanks!