Open loew opened 1 year ago
I might be wrong, but it seems that the channels in the raw file (.snirf) and the channels.tsv file are in different order. This causes _handle_channels_reading to detect mismatches and all names get messed up, ending in unequal numbers of unique channel names.
To tell if this is an MNE-Python or MNE-BIDS problem (it probably is not an MNE-NIRS problem actually!), can you try just mne.io.read_raw_snirf on the file first? If that fails, it's a problem with MNE-Python. If it succeeds but read_raw_bids fails, it's (probably) a problem with MNE-BIDS. My guess is that it's an MNE-BIDS problem since it happens in rename_channels
...
mne.io.read_raw_snirf succeeds.
I believe the problem is that channel names are in different order in the snirf file and the channels.tsv file:
from mne_bids import BIDSPath, read_raw_bids
from mne_nirs.datasets import fnirs_motor_group
from mne_bids.tsv_handler import _from_tsv
from mne.io import read_raw_snirf
# Locate the SNIRF recording of subject 01 in the sample dataset
bpath = BIDSPath(subject='01', task="tapping", root=fnirs_motor_group.data_path(),
                 datatype="nirs", suffix="nirs", extension=".snirf")
# Channel order as read from the raw file
raw_snirf = read_raw_snirf(bpath, verbose=False)
channels_snirf = raw_snirf.ch_names
# Channel order as listed in the accompanying channels.tsv
bpath.update(suffix='channels', extension='.tsv')
channels_tsv = _from_tsv(bpath)
print('snirf:', channels_snirf[:5])
print('tsv :', channels_tsv['name'][:5])
snirf: ['S1_D1 760', 'S1_D2 760', 'S1_D3 760', 'S1_D9 760', 'S2_D1 760']
tsv : ['S1_D1 760', 'S1_D1 850', 'S1_D2 760', 'S1_D2 850', 'S1_D3 760']
FYI @rob-luke @sappelhoff I've transferred this here because I think it's probably an issue with how MNE-BIDS handles channel names. My guess is that the dataset natively in SNIRF has some order that mne.io.read_raw_snirf reorders (to be in a standard paired-channel order), but then MNE-BIDS somehow does not handle the mismatch that is thereby created with the .tsv. But this is speculative on my part, TBD if it's actually the case...
Thanks for triaging, @larsoner
I believe the problem is that channel names are in different order in the snirf file and the channels.tsv file:
Yes, that's unfortunate ... for EEG, iEEG, and MEG it's a RECOMMENDATION that the order of channels in the raw file and in channels.tsv is identical, see:
yet for NIRS, I don't even see this recommendation. Was this deliberately omitted @rob-luke, or is that something we should add post-hoc?
Not that the recommendation would really cut any work for us, as it's not a REQUIREMENT and currently we seem to be handling this as if it were, leading to this issue :thinking:
@sappelhoff do you expect it's "just" a matter of sorting the dataframe (or list or whatever) we get from reading the TSV by inst.ch_names.index(ch_name) for ch_name in tsv['ch_name'] or so?
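For illustration, a minimal sketch of that kind of reordering, assuming the TSV has been read into a dict mapping column names to lists (similar in shape to what _from_tsv returns). The helper name reorder_tsv_to_raw and the sample values are made up for this example and are not part of MNE-BIDS:

```python
from collections import OrderedDict


def reorder_tsv_to_raw(tsv, raw_ch_names, name_key="name"):
    """Return a copy of the TSV columns with rows sorted into raw channel order.

    Hypothetical helper for illustration only; not an MNE-BIDS function.
    """
    tsv_names = list(tsv[name_key])
    raw_ch_names = list(raw_ch_names)
    if len(set(tsv_names)) != len(tsv_names):
        raise ValueError("Duplicate channel names in the TSV.")
    if sorted(tsv_names) != sorted(raw_ch_names):
        raise ValueError("Channel name sets differ; reordering alone cannot fix this.")
    # Row number of each channel name within the TSV
    row_of = {name: row for row, name in enumerate(tsv_names)}
    # For every raw channel (in raw order), the TSV row that describes it
    order = [row_of[name] for name in raw_ch_names]
    return OrderedDict(
        (column, [values[i] for i in order]) for column, values in tsv.items()
    )


# Made-up sample values mimicking the mismatch reported above
raw_ch_names = ["S1_D1 760", "S1_D2 760", "S1_D1 850", "S1_D2 850"]
channels_tsv = OrderedDict(
    name=["S1_D1 760", "S1_D1 850", "S1_D2 760", "S1_D2 850"],
    description=["row0", "row1", "row2", "row3"],
)
reordered = reorder_tsv_to_raw(channels_tsv, raw_ch_names)
print(reordered["name"])
print(reordered["description"])
```

Every column is permuted with the same index list, so each row's metadata stays attached to its channel name.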
That's what I am thinking right now, but it's been a long time since I've looked into that specific code.
I explored this issue a bit more using my own data. I went from the data in original NIRx format to MNE raw, to snirf and finally to BIDS format. It looks all fine with regard to the channel names.
It seems that the channels.tsv file provided with the sample data is the problem. When I read MNE-NIRS sample data with read_raw_snirf and then use write_raw_bids to export the data, then the channels.tsv is different from the channels.tsv file that comes with the sample data. The newly created channels.tsv file corresponds exactly to channel names in the data imported using read_raw_snirf. Maybe @rob-luke could check how he created the BIDS files from his original nirx files?
Below the code to create a new channels.tsv from a snirf file.
from pathlib import Path
import mne_nirs
from mne.io import read_raw_snirf
from mne_bids import BIDSPath, read_raw_bids, write_raw_bids
datapath = mne_nirs.datasets.block_speech_noise.data_path()
bpath = BIDSPath(root=datapath, subject='01', session="01", suffix="nirs", extension=".snirf", task="AudioSpeechNoise", datatype="nirs")
raw_snirf = read_raw_snirf(bpath, verbose=False)
bpath_out = BIDSPath(root=Path.cwd() / 'testchannelstsv', subject='01', session="01", suffix="nirs", extension=".snirf", task="AudioSpeechNoise", datatype="nirs")
write_raw_bids(raw_snirf, bpath_out, overwrite=True)
Hi @loew, thanks for reporting this issue. I believe this is an unintended consequence of https://github.com/mne-tools/mne-python/pull/10642, when MNE changed the way it ordered channels.
Can you post a link to the dataset you are testing so I can verify if this is indeed the issue and how we can fix it? Thanks
Hi @rob-luke, which dataset are you referring to? The ones that have the issue are part of the MNE-NIRS sample data. The test with my own data did not show the issue.
@rob-luke I believe you can find the MWE in the original post, your help in resolving this would be appreciated!
To recap, the problem is that you may have channels 'foo' and 'bar' in your raw data, but the channel order is 'bar', 'foo' in channels.tsv.
When reading, MNE-BIDS tries to rename each channel to meet the channels.tsv definition, but renaming 'foo' to 'bar' will fail because 'bar' already exists at this stage.
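The collision described in this recap can be reproduced with a toy model. This is only a sketch of the idea of renaming channels one at a time by position; it is not the actual MNE-BIDS or MNE-Python code, and rename_one_at_a_time is a made-up name:

```python
def rename_one_at_a_time(ch_names, new_names):
    """Toy model: rename channels sequentially by position, refusing duplicates."""
    names = list(ch_names)
    for idx, new in enumerate(new_names):
        # A new name that is already used by another channel is a collision
        if new != names[idx] and new in names:
            raise ValueError(
                f"Cannot rename {names[idx]!r} to {new!r}: name already in use"
            )
        names[idx] = new
    return names


raw_names = ["foo", "bar"]   # order in the raw data
tsv_names = ["bar", "foo"]   # order in channels.tsv

message = None
try:
    rename_one_at_a_time(raw_names, tsv_names)
except ValueError as err:
    message = str(err)
print(message)
```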
Proposal for a fix:
Do two passes when populating the BIDS channel names
'1 foo', '2 bar'
Assign the raw data channels the names from channels.tsv
By this do you mean some raw.rename_channels(...)? I don't think that's quite the issue here. IIUC for these data the set(raw.ch_names) == set(channels_tsv['names']) or whatever, just the order differs. So renaming would do the wrong thing -- either the channels_tsv data needs to be reordered or the raw instance needs its channels reordered. (Or the people need to update their datasets to have consistent ordering.)
@larsoner I edited my above comment to add more clarity as to how I understand the issue, could you please check if this description makes sense to you?
I think we have an ambiguous situation here in case we have an order mismatch between raw and channels.tsv.
In my mind we would keep the original order in raw and assign the names stored in channels.tsv.
But another possible approach could be to re-order the channel data in raw such that it matches channels.tsv.
In my mind we would keep the original order in raw and assign the names stored in channels.tsv
I still don't quite follow what you mean by "assign" here. You could mean raw.rename_channels({raw_ch_name: csv_ch_name for raw_ch_name, csv_ch_name in zip(raw.ch_names, csv['ch_names'])}) which will do the wrong thing. If you mean figuring out an index correspondence, then using that to rename the raw.ch_names to be csv['full_ch_names'][reorder_idx] or something, this would make sense to me, but I'm not sure what the 'full_ch_names' or whatever would be (don't know that much about BIDS...).
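To make the difference between the two readings concrete, here is a small sketch in plain Python. The channel names follow the fNIRS naming seen earlier in the thread, while the "signal" strings and the variable names are made up for illustration:

```python
# Raw data: each channel name is attached to its recorded signal
raw_names = ["S1_D1 760", "S1_D2 760", "S1_D1 850"]
raw_data = {
    "S1_D1 760": "signal of S1_D1 760",
    "S1_D2 760": "signal of S1_D2 760",
    "S1_D1 850": "signal of S1_D1 850",
}
# Same set of names, different order, as listed in channels.tsv
tsv_names = ["S1_D1 760", "S1_D1 850", "S1_D2 760"]

# Reading 1: positional rename (zip). Signals end up under the wrong names.
renamed = {new: raw_data[old] for old, new in zip(raw_names, tsv_names)}
print(renamed["S1_D1 850"])

# Reading 2: index correspondence. For each raw channel, find its TSV row.
reorder_idx = [tsv_names.index(name) for name in raw_names]
tsv_in_raw_order = [tsv_names[i] for i in reorder_idx]
print(reorder_idx, tsv_in_raw_order == raw_names)
```

With the positional rename, the entry labelled 'S1_D1 850' holds the signal that was recorded as 'S1_D2 760'. With the index correspondence, names and signals stay paired and only the TSV rows are permuted.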
You could mean raw.rename_channels({raw_ch_name: csv_ch_name for raw_ch_name, csv_ch_name in zip(raw.ch_names, csv['ch_names'])}) which will do the wrong thing.
Would it? I think it's one of two valid approaches. In fact, this was actually what I expected would happen when I first thought about this mismatch.
Would it? I think it's one of two valid approaches.
Thinking about raw.ch_names = ['MEG0111', ..., 'EEG001', ...] but for whatever reason the person in their channels.tsv ordered EEG before MEG, the rename_channels approach would be wrong. I think that's the sort of situation we're in with this fNIRS data.
Thinking about raw.ch_names = ['MEG0111', ..., 'EEG001', ...] but for whatever reason the person in their channels.tsv ordered EEG before MEG, the rename_channels approach would be wrong.
Yes, but think they have named the channels in raw
Fpz
TP9
TP10
and realized they swapped TP9 and TP10, they may want to fix the problem by editing channels.tsv:
Fpz
TP10
TP9
(this is what I believe happened in the fNIRS example, but I might be mistaken)
But as I said, it's ambiguous what to do here.
(this is what I believe happened in the fNIRS example, but I might be mistaken)
I don't think so. A while ago we used to enforce writing fNIRS channels in a specific order. I'm guessing they wrote the channels.tsv back when we did that. Then at some point updated their raw data by rewriting when we didn't force the order to change. Hence the mismatch...
Yes, but think they have named the channels in raw ... and realized they swapped TP9 and TP10, they may want to fix the problem by editing channels.tsv
Coming back to the spec as posted above, this at least mentions the idea of an order mismatch:
Is there somewhere in the spec that says channels.tsv can be used to do a renaming? I did a quick search in that URL and didn't see anything about a renaming being allowed at least.
Is there somewhere in the spec that says channels.tsv can be used to do a renaming? I did a quick search in that URL and didn't see anything about a renaming being allowed at least.
Unfortunately there is still no official BIDS policy on whether the raw data or the BIDS metadata should be preferred in case of a mismatch. This is the relevant issue:
Also, I am very unhappy that this sentence from the spec as linked by Eric above is a "SHOULD" and not a "MUST" 😕
To avoid confusion, the channels SHOULD be listed in the order they appear in the EEG data file.
Mismatch in order of names, or in set of names? I see in the spec where the former could happen (the SHOULD stuff), but not where the sets of names could differ.
my first comment pertains to mismatch in set names, my second comment pertains to the order only (where a mismatch is apparently "fine but not recommended")
Yikes so there's no real way to know. Maybe we should raise an error but allow people to pass ch_name_mismatch='raise' (default) | 'reorder' | 'rename'. We could even just start with the reorder case as that's the one we actually need here, unless people already have seen datasets that need to rename one.
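A rough sketch of what such an option could look like, written against plain lists of names. The function resolve_ch_names and its return value are hypothetical and only illustrate the three proposed modes; nothing here is an existing MNE-BIDS API:

```python
def resolve_ch_names(raw_names, tsv_names, ch_name_mismatch="raise"):
    """Hypothetical helper: decide how to reconcile raw and TSV channel names.

    Returns (names_for_raw, tsv_row_order), where tsv_row_order[i] is the
    TSV row that describes the i-th raw channel.
    """
    raw_names, tsv_names = list(raw_names), list(tsv_names)
    identity = list(range(len(tsv_names)))
    if raw_names == tsv_names:
        return raw_names, identity
    if ch_name_mismatch == "raise":
        raise ValueError("Channel names in raw and channels.tsv do not match.")
    if ch_name_mismatch == "reorder":
        # Only valid when both sides hold the same unique names
        if len(set(raw_names)) != len(raw_names) or sorted(raw_names) != sorted(tsv_names):
            raise ValueError("Cannot reorder: the sets of channel names differ.")
        row_of = {name: row for row, name in enumerate(tsv_names)}
        return raw_names, [row_of[name] for name in raw_names]
    if ch_name_mismatch == "rename":
        # Trust the TSV names positionally; the raw order is kept
        if len(raw_names) != len(tsv_names):
            raise ValueError("Cannot rename: different number of channels.")
        return tsv_names, identity
    raise ValueError(f"Unknown ch_name_mismatch: {ch_name_mismatch!r}")


print(resolve_ch_names(["foo", "bar"], ["bar", "foo"], "reorder"))
print(resolve_ch_names(["foo", "bar"], ["bar", "foo"], "rename"))
```

Keeping 'raise' as the default means an ambiguous dataset is never silently reinterpreted; the caller has to state which of the two readings applies.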
Yikes so there's no real way to know. Maybe we should raise an error but allow people to pass ch_name_mismatch='raise' (default) | 'reorder' | 'rename'. We could even just start with the reorder case as that's the one we actually need here, unless people already have seen datasets that need to rename one.
+1
Describe the bug
Sample NIRS data in the BIDS format cannot be read due to some problems with channel names. This applies to the data set in the example below, but can also be replicated with the "Auditory Speech and Noise" dataset. Upgrading mne, mne_bids and mne_nirs from the latest official versions to the dev versions did not resolve the issue.
If the channels.tsv file in the subject's directory is renamed (channels.tsvxxx), the data can be read.
Steps to reproduce
Expected results
I expect the data to be read.
Actual results
Additional information