SpikeInterface / spikeinterface

A Python-based module for creating flexible and robust spike sorting pipelines.
https://spikeinterface.readthedocs.io
MIT License

Assert key not in self.signals_info_dict in se.read_spikeglx in spikeinterface 0.98.1 #2455

Open Shrh627 opened 9 months ago

Shrh627 commented 9 months ago

Hi, I am currently using the SpikeGLX extractor for subsequent Kilosort3 processing. The spikeinterface version is 0.98.1. The read_spikeglx function throws the following error.

In [6]: ls /mnt/boninlab/boninlabwip2024/data/ephys/shahriar/Raw/Chronic/SH050/230911/trof/
sh0123br200bB_g0_tcat.imec0.ap.bin*   sh0123br200_bigopenfield_dim_g0_tcat.imec0.ap.bin*
sh0123br200bB_g0_tcat.imec0.ap.meta*  sh0123br200_bigopenfield_dim_g0_tcat.imec0.ap.meta*

 In [7]: spikeinterface.__version__
Out[7]: '0.98.1'

In [8]: recordings = se.read_spikeglx( Path("/mnt/boninlab/boninlabwip2024/data/ephys/shahriar/Raw/Chronic/SH050/230911/trof/") , stream_id = "imec0.ap")

---------------------------------------------------------------------------

AssertionError                            Traceback (most recent call last)

Cell In[8], line 1

----> 1 recordings = se.read_spikeglx( Path("/mnt/boninlab/boninlabwip2024/data/ephys/shahriar/Raw/Chronic/SH050/230911/trof/") , stream_id = "imec0.ap")

File /opt/anaconda/anaconda3/envs/spikeinterface_0.98.1/lib/python3.10/site-packages/spikeinterface/extractors/neoextractors/spikeglx.py:51, in SpikeGLXRecordingExtractor.__init__(self, folder_path, load_sync_channel, stream_id, stream_name, all_annotations)

49 def __init__(self, folder_path, load_sync_channel=False, stream_id=None, stream_name=None, all_annotations=False):

50     neo_kwargs = self.map_to_neo_kwargs(folder_path, load_sync_channel=load_sync_channel)
---> 51     NeoBaseRecordingExtractor.__init__(
     52         self, stream_id=stream_id, stream_name=stream_name, all_annotations=all_annotations, **neo_kwargs
     53     )
     55     # open the corresponding stream probe for LF and AP
     56     # if load_sync_channel=False
     57     if "nidq" not in self.stream_id and not load_sync_channel:
File /opt/anaconda/anaconda3/envs/spikeinterface_0.98.1/lib/python3.10/site-packages/spikeinterface/extractors/neoextractors/neobaseextractor.py:185, in NeoBaseRecordingExtractor.__init__(self, stream_id, stream_name, block_index, all_annotations, use_names_as_ids, **neo_kwargs)
    156 def __init__(
    157     self,
    158     stream_id: Optional[str] = None,
   (...)
    163     **neo_kwargs: Dict[str, Any],
    164 ) -> None:
    165     """
    166     Initialize a NeoBaseRecordingExtractor instance.
    167 
   (...)
    182 
    183     """
--> 185     _NeoBaseExtractor.__init__(self, block_index, **neo_kwargs)
    187     kwargs = dict(all_annotations=all_annotations)
    188     if block_index is not None:
File /opt/anaconda/anaconda3/envs/spikeinterface_0.98.1/lib/python3.10/site-packages/spikeinterface/extractors/neoextractors/neobaseextractor.py:25, in _NeoBaseExtractor.__init__(self, block_index, **neo_kwargs)
     23 def __init__(self, block_index, **neo_kwargs):
     24     if not hasattr(self, "neo_reader"):  # Avoid double initialization
---> 25         self.neo_reader = self.get_neo_io_reader(self.NeoRawIOClass, **neo_kwargs)
     27     if self.neo_reader.block_count() > 1 and block_index is None:
     28         raise Exception(
     29             "This dataset is multi-block. Spikeinterface can load one block at a time. "
     30             "Use 'block_index' to select the block to be loaded."
     31         )
File /opt/anaconda/anaconda3/envs/spikeinterface_0.98.1/lib/python3.10/site-packages/spikeinterface/extractors/neoextractors/neobaseextractor.py:64, in _NeoBaseExtractor.get_neo_io_reader(cls, raw_class, **neo_kwargs)
     62 neoIOclass = getattr(rawio_module, raw_class)
     63 neo_reader = neoIOclass(**neo_kwargs)
---> 64 neo_reader.parse_header()
     66 return neo_reader
File /opt/anaconda/anaconda3/envs/spikeinterface_0.98.1/lib/python3.10/site-packages/neo/rawio/baserawio.py:179, in BaseRawIO.parse_header(self)
    166 def parse_header(self):
    167     """
    168     This must parse the file header to get all stuff for fast use later on.
    169 
   (...)
    177 
    178     """
--> 179     self._parse_header()
    180     self._check_stream_signal_channel_characteristics()
File /opt/anaconda/anaconda3/envs/spikeinterface_0.98.1/lib/python3.10/site-packages/neo/rawio/spikeglxrawio.py:94, in SpikeGLXRawIO._parse_header(self)
     91 for info in self.signals_info_list:
     92     # key is (seg_index, stream_name)
     93     key = (info['seg_index'], info['stream_name'])
---> 94     assert key not in self.signals_info_dict
     95     self.signals_info_dict[key] = info
     97     # create memmap
AssertionError: 

As shown in the ls output above, there are two data files and two meta files in the input directory, which I aim to concatenate afterwards via spikeinterface.concatenate_recordings.

Could you please help me understand what the problem is? I've also attached the meta files.

Your assistance is greatly appreciated.

Thank you, Shahriar. sh0123br200bB_g0_tcat.imec0.ap.meta.txt sh0123br200_bigopenfield_dim_g0_tcat.imec0.ap.meta.txt
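To make the failure mode concrete, here is a minimal sketch of why the assertion fires. This is an assumed, simplified version of neo's filename keying, not its actual code: each file is keyed by (seg_index, stream_name), so two different tcat runs in one folder map to the same key.

```python
# Simplified sketch (an assumption, not neo's actual parser): neo keys
# each SpikeGLX file by (seg_index, stream_name). Two distinct tcat runs
# in the same folder collide on the same key.
import re

def key_for(filename):
    # hypothetical parser for names like name_g0_t0.imec0.ap.bin
    m = re.match(r".*_g(\d+)_t(\w+)\.(imec\d+\.(?:ap|lf))\.bin$", filename)
    gate, trigger, stream = m.groups()
    seg_index = 0 if trigger == "cat" else int(trigger)  # assumed mapping
    return (seg_index, stream)

files = [
    "sh0123br200bB_g0_tcat.imec0.ap.bin",
    "sh0123br200_bigopenfield_dim_g0_tcat.imec0.ap.bin",
]
signals_info_dict = {}
for f in files:
    key = key_for(f)
    if key in signals_info_dict:
        # this is the condition the `assert key not in ...` trips on
        print(f"key {key} is already in the signals_info_dict")
    signals_info_dict[key] = f
```

Both filenames reduce to the key (0, 'imec0.ap'), which is exactly the duplicate the `assert key not in self.signals_info_dict` line rejects.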

alejoe91 commented 9 months ago

Hi,

This problem is related to NEO. Can you open the same issue there? It might be a parsing problem.

Cheers, Alessio

alejoe91 commented 9 months ago

Yeah I think that the problem is the tcat notation. The parser expects a t{integer} in the name. How did you generate those files? Did you rename them manually?

Shrh627 commented 9 months ago

These files are the output of CatGT, and I didn't rename them manually. I also tried manually removing the tcat notation from the file names, but I encountered the same error.

alejoe91 commented 9 months ago

Ok, we'll try to push a fix on our side. In the meantime (to make it work), you can just replace tcat with t0.
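A minimal sketch of that rename, using a throwaway temp folder with dummy filenames as stand-ins for the real CatGT output directory:

```python
# Sketch of the suggested workaround: rename *_tcat.* files to *_t0.*
# so the t{integer} trigger parser accepts them. The temp folder and
# dummy files here are stand-ins for the real CatGT output directory.
import tempfile
from pathlib import Path

folder = Path(tempfile.mkdtemp())  # replace with the real run folder
(folder / "run_g0_tcat.imec0.ap.bin").touch()
(folder / "run_g0_tcat.imec0.ap.meta").touch()

for f in sorted(folder.glob("*_tcat.*")):
    f.rename(f.with_name(f.name.replace("_tcat.", "_t0.", 1)))

print(sorted(p.name for p in folder.iterdir()))
# → ['run_g0_t0.imec0.ap.bin', 'run_g0_t0.imec0.ap.meta']
```

Note that in this thread the rename alone did not resolve the original case, since two separate runs still shared a key.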

Shrh627 commented 9 months ago

I replaced tcat with t0, but the error persists unchanged.

alejoe91 commented 9 months ago

I see. Let's move the discussion to NEO. Please open an issue there.

cfshang commented 5 months ago

Hi, I am running into the same problem and hope yours has been resolved. My stream_id is "Pt01.imec0.lf", but the warning turns out to be "key (0, 'imec0.ap') is already in the signals_info_dict". Do you have any suggestions?

tabedzki commented 5 months ago

@cfshang can you provide the names of all the files in your case to provide a better understanding of your situation?

a-zmz commented 5 months ago

Hi, I got the same issue.

Basically, in the same base folder I have four raw files: name_g0_t0.imec0.ap.bin, name_g0_t0.imec0.lf.bin, and the same for imec1. I then have a CatGT output folder, catgt_name_g0, containing a bunch of files named name_g0_tcat.imec0.ap.bin, name_g0_tcat.imec0.lf.bin, and again the same for imec1.

However, the only file I want to read with read_spikeglx is name_g0_t0.imec0.ap.bin, and even when I pass this name to stream_name, the problem persists:

KeyError: (1, 'imec0.lf')

*running spikeinterface 0.101.0

cfshang commented 4 months ago

@tabedzki @a-zmz Sorry for the late reply. I got it resolved by following someone's suggestion to point to the parent folder, i.e. removing the \Pt01\ in MainDir below. You may give it a try.

MainDir='E:\xxxxx\Pt01\';

FileLoad= dir([MainDir,'*','Pt01.imec0.ap.bin*']); %AP

martuser commented 4 months ago

Hi, I had the exact same error: key (0, 'imec0.ap') is already in the signals_info_dict. While debugging the code, I noticed that the SpikeGLX reader looks not only in the specified folder but also in its subfolders. If you have more ap or lf files in those subfolders, it throws that error. What I did was place my imec0.ap and imec0.lf data in a folder called RawData with no other files or folders in it. This way, it doesn't give an error.

Hope this helps!