spikeinterface.extractors.read_nwb() disrespecting the channel map

nwatters01 commented 8 months ago

I'm running into what seems to be a bug with spikeinterface's spikeinterface.extractors.read_nwb() function:

I'd like to load a neuropixel recording from our NWB ecephys files.
Our NWB ecephys file has several ElectricalSeries: A VP ElectricalSeries for each v-probe, and AP and LF for the neuropixel.
I can load the neuropixels recording with this: neuropixels_recording = se.read_nwb(nwb_filepath, load_recording=True, electrical_series_path='acquisition/ElectricalSeriesAP')
However, when I do that the channel locations and IDs are wrong, namely neuropixels_recording.channel_ids and neuropixels_recording.get_channel_locations() are not correct. Specifically, the IDs and locations start with 64 V-probe electrodes and then follow with 320 neuropixel electrodes, instead of reading the 384 neuropixel electrodes. I'm guessing read_nwb reads the channel locations and IDs from the top of the electrodes table, and doesn't take into account which ElectricalSeries is being loaded.

To reproduce this, first download an ...ecephys.nwb file that has both neuropixels and V-Probe recordings from our dandiset (dandiset ID 000620), for example this asset for session Perle/2022-06-01: https://api.dandiarchive.org/api/assets/c7854da2-1b96-44ff-b9a4-de9eee207535. Then run the following:

import spikeinterface.extractors as se
recording = se.read_nwb(
    $path_to_nwb_ecephys_file,
    load_recording=True,
    electrical_series_path='acquisition/ElectricalSeriesAP',
)
print(recording.channel_ids)

You will see that the channel ids mix across different probes, despite only reading the electrical series from one probe. Similarly for recording.get_channel_locations() and other channel attributes.

By the way, I've tried the following work-arounds which don't work:

Load the entire recording and then use .split_by() to extract only the neuropixels recording. That hits an error ValueError: Multiple ElectricalSeries found in the file.
After loading, use recording.rename_channels() to rename the channels. That hits this asset error.
Reset the probe after loading with recording.set_probe(get_neuropixels_probe.get_neuropixels_probe()). That unfortunately does not change the channel ids or channel locations.

The only workaround I've found that successfully changes the channel locations is to include this block after loading the recording:

new_channel_ids = np.array(
    [str(i) for i in range(len(recording.channel_ids))],
    dtype='<U5'
)
recording = recording.set_probe(get_neuropixels_probe.get_neuropixels_probe())
recording._main_ids = new_channel_ids

This works, but is a pretty hacky solution and doesn't address the main issue.

Thank you!

alejoe91 commented 8 months ago

@nwatters01 can you specify what SI version you're using? Is it 0.100.0?

alejoe91 commented 8 months ago

Ok, I was able to reproduce the issue.

Unfortunately, I don't think it's a reading issue, but rather a writing problem. The way SpikeInterface reads the electrodes associated to the traces is by reading the electrical_series.electrodes, which should be a reference to the electrodes the electrical series corresponds to in thenelectrodes table.

In the file you mentioned, the electrical_series.electrodes points to the wrong electrodes, containing both Vprobe and NP ones.

@CodyCBakerPhD we should check where things are off on the conversion side.

nwatters01 commented 8 months ago

Thanks for looking into this!

There's a chance this may be due to something I did. I split the NWB files into spike sorting data and task/behavior data. I think I copied everything faithfully, but I may messed something up.

Either way, now that I know the issue I can fix it post-hoc. I'll go ahead and do that for my ecephys nwb files.

alejoe91 commented 8 months ago

@nwatters01 a heads up. We discovered that there was indeed a bug in neuroconv, see https://github.com/catalystneuro/neuroconv/pull/784 Anyways, I don't think this is the reason of the wrong electrodes here. The bug that we fixed was writing a wrong ElectricalSeries.electrodes table only when you had the same channel names with different group names (e.g., AP1, AP2 .. for two different probes). In your case, the channel names are all unique (0, 1, 2, ... - AP1, AP2, ... - LF1, LF2, ...), so I think it must be something else.

I'll close the issue here, since it's a problem with the NWB file and not with the SpikeInterface parsing.

h-mayorquin commented 8 months ago

I think that the error here was related to the previous bug in neuroconv where the metadata was not mapped properly to electrodes with the SpikeGLXConverterPipe.

So If you:

Used SpikeGLXConverterPipe instead of the interface directly, and.
You ran the conversion before December 2023 when this was merged

Then that's probably the case. Using the latest version of the SpikeInterface should sort the issue.

SpikeInterface / spikeinterface

spikeinterface.extractors.read_nwb() disrespecting the channel map #2588