Open KristijanArmeni opened 1 week ago
Hi there, and thank you for the detailed issue! (and for the example data too!)
We'd be happy to propagate more options to downstream readers; but I would like to understand a bit more why neo
doesn't detect these as different streams - usually, the conditions you describe here, should count as the definition of a distinct stream right @h-mayorquin?
The current workaround is to simply do conversion separately for groups of channels with different sampling rates (i.e. put them in separate folders).
Glad to hear there is at least some workaround available
If i understand correctly, would lead to several *.nwb files (one for each stream).
It could, but it doesn't have to - the overwrite
argument of .run_conversion
on all data interfaces and converters is False
by default, meaning it tries to append the existing file at the nwbfile_path
However, to avoid collisions in object names being written to the same file, you might have to carefully control the metadata, such as the name
, in each metadata["Ecephys"][es_key]
(with es_key
being specified when you initialize each recording interface)
I don't know if that's a desired behavior or if the NWB standard allows different sampling rates per dataset.
NWB can easily support multiple neurodata objects with different sampling rates; these simply become different ElectricalSeries
objects in this case, each with their own rate
and the data corresponding to that group of channels
@CodyCBakerPhD I will take a closer look but it looks like it differentiates them. Thanks a lot for sharing the data.
Thanks @CodyCBakerPhD for feedback!
but I would like to understand a bit more why neo doesn't detect these as different streams
I don't know Neo and their idea of streams well enough, but it indeed seems reading data only works on a single stream. At least that's what I got from this issue in the neo repo: https://github.com/NeuralEnsemble/python-neo/issues/1292
If I understand things correctly, that would also mean that the stream_name
param in NeuralynxRecordingInterface
(and any of the .get_streams*
methods that use it) might not be useful if this information is ultimately obtained via neo_reader.parse_header()
which always expects a single stream.
NWB can easily support multiple neurodata objects
Thanks! That's awesome and good to know.
I don't know Neo and their idea of streams well enough, but it indeed seems reading data only works on a single stream. At least that's what I got from this issue in the neo repo: https://github.com/NeuralEnsemble/python-neo/issues/1292
If I understand things correctly, that would also mean that the stream_name param in NeuralynxRecordingInterface (and any of the .get_streams* methods that use it) might not be useful if this information is ultimately obtained via neo_reader.parse_header() which always expects a single stream.
That's all accurate and in line with our usage here - what I mean to say is that we would then associate a single instance of a DataInterface
with each individual stream that is detected by neo
(and which you get see via the classmethod
NeuralynxRecordingInterface.get_stream_names
)
So do include all streams for a multi-stream instance of Neuralynx in your NWB conversion, just make several such interface, one for each stream, then bundle them in a converter object to create a single NWB file (+/- naming of things via metadata to avoid certain collisions)
The impression I was getting however is that neo
wasn't properly detecting some of your channel groups as different streams and you needed the additional argument to handle that - is that the case or does neo
identify all the streams you need?
The impression I was getting however is that neo wasn't properly detecting some of your channel groups as different streams and you needed the additional argument to handle that - is that the case or does neo identify all the streams you need?
Yes, I think that's true. Well, from the error message (see below) it seems that neo does identify two streams, but complains if the two streams have different sampling rates. I am not sure if a stream always implies a specific sampling rate or if different streams can also have the same sampling rate (in which case this might have worked).
Say, if I try the same folder as above:
./nwb_bug
├── LAHC1.ncs #2kHz
└── LAHCu1.ncs #32kHz
Then doing this:
from neo.rawio import NeuralynxRawIO
reader = NeuralynxRawIO(dirname="/new_bug") # reader.header attr is None
reader.parse_header() # i think this has to be called to get stream info
Will give the same error message:
Cell In[9], line 1
----> 1 reader.parse_header()
File ~/miniconda3/envs/core/lib/python3.11/site-packages/neo/rawio/baserawio.py:189, in BaseRawIO.parse_header(self)
176 """
177 Parses the header of the file(s) to allow for faster computations
178 for all other functions
179
180 """
181 # this must create
182 # self.header['nb_block']
183 # self.header['nb_segment']
(...)
186 # self.header['spike_channels']
187 # self.header['event_channels']
--> 189 self._parse_header()
190 self._check_stream_signal_channel_characteristics()
191 self.is_header_parsed = True
File ~/miniconda3/envs/core/lib/python3.11/site-packages/neo/rawio/neuralynxrawio/neuralynxrawio.py:392, in NeuralynxRawIO._parse_header(self)
389 ref_chan_ids = signal_channels[signal_channels["stream_id"] == ref_stream_id]["name"]
390 chan_ids = signal_channels[signal_channels["stream_id"] == stream_id]["name"]
--> 392 raise ValueError(
393 "Incompatible section structures across streams: "
394 f"Stream id {ref_stream_id}:{ref_chan_ids} and "
395 f"{stream_id}:{chan_ids}."
396 )
398 if ref_sec_structure is not None:
399 self._nb_segment = len(ref_sec_structure.sects)
ValueError: Incompatible section structures across streams: Stream id 0:['LAHC1'] and 1:['LAHCu1'].
Hmm that sounds like a neo bug to me then, have you raised an issue over there?
Ideally their tools would work directly on source data as it came off the acquisition system unmodified by any subsequent opreations
I haven't raised an issue myself, but I saw this: https://github.com/NeuralEnsemble/python-neo/issues/1292
Rereading it, it seems it might not have to do with different sampling rates, but differences in data segments across two streams (for whatever reason). So, exclude_filename
arg is there to deal with this.
I'd go ahead and let them know about it since you have example data so readily available; it might be a segment issue or it might not? Either way sounds like they need reasons to adjust the header parsing to allow stream detection
We try not to use backdoors too often unless the alternative is very costly
What happened?
When trying to convert a Neuralynx dataset following the snippet from the conversion gallery, the initialization of
NeuralynxRecordingInterface()
fails if the dataset contains channels (.ncs
files) from multiple streams, that is groups of recording channels that have different sampling rates.To my understanding,
NeuralynxRecordingInterface
relies on the_NeoBaseExtractor.get_neo_io_reader(**neo_kwargs)
from thespikeinterface
package which propagates**neo_kwargs
to the dedicated neo reader. And neo readers allow theexclude_filename
argument (see NeuralynxRawIO) where the user can choose to exclude offending recording channels. However,NeuralynxRecordingInterface()
does not expect theexclude_filename
parameter.Currently,
NeuralynxRecordingInterface
allows thestream_name
parameter, but that will fail too since it goes via the above.get_neo_io_reader()
method of the spikeinterface package.Possible fix
Allow
NeuralynxRecordingInterface
to acceptexclude_filename
keyword argument.The current workaround is to simply do conversion separately for groups of channels with different sampling rates (i.e. put them in separate folders). If i understand correctly, would lead to several *.nwb files (one for each stream). I don't know if that's a desired behavior or if the NWB standard allows different sampling rates per dataset.
Steps to Reproduce
For example, in MNE-Python testing data we have a small (< 1MB) Neuralynx testing dataset with one channel sampled at 2kHz (
LAHC1.ncs
) and another one at 32kHz (LAHCu1.ncs
).If I place those two channels into the same folder and run this code I get the traceback pasted below:
Traceback
Operating System
Linux
Python Executable
Python
Python Version
3.11
Package Versions
requirements.txt
Code of Conduct