EGI .raw file import - ch_names/montage file incompatibility

espressofiend commented 7 years ago

When I try to import EGI .raw files using mne.io.read_raw_egi, I encounter a problem reading the GSN-HydroCel montage files. The problem arises because, on line 252 of egi.py, ch_names are assigned as 'EEG %03d' % (i + 1), so you get EEG 001, EEG 002, ...

However, in the GSN-HydroCel montage files, the electrodes are listed in the form E1, E2,... - so MNE doesn't find any labels in the montage file that match the labels in the data.
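
To make the mismatch concrete, here is a minimal sketch in plain Python (no MNE required). It assumes a 129-channel net and that the montage labels run exactly E1 through E129; the variable names are illustrative only and are not taken from egi.py.

```python
# Names as produced by the 'EEG %03d' pattern quoted above.
n_channels = 129  # assumed channel count, for illustration only
data_names = ['EEG %03d' % (i + 1) for i in range(n_channels)]

# Labels in the style used by the GSN-HydroCel montage files (E1, E2, ...).
montage_names = ['E%d' % (i + 1) for i in range(n_channels)]

# No label is shared, so nothing in the montage can be matched to the data.
shared = set(data_names) & set(montage_names)
print(data_names[:2], montage_names[:2], len(shared))
```

Because the intersection is empty, a name-based lookup has nothing to match against, which is consistent with the behaviour described here.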

I tried changing the montage files to EEG 001, ... but this still doesn't work, because the montage routine apparently does not like spaces.

So, the solution I have found is to modify egi.py to assign ch_names without spaces (simply removing the space in line 252 so you have 'EEG%03d' % (i + 1)), and also modify the GSN-HydroCel montage files to use this same labeling scheme.

It's a pretty minor fix, which I would suggest making. I'm also curious whether those who developed this tested it and somehow had it work, or whether no one encountered the problem because no one tried it. I don't see how it could be anything about how the channels are specified in EGI's .raw files, since the ch_names are assigned in egi.py.

I've consulted with a colleague in another lab and get the exact same problem trying to import her data, with the same solution working.

Thanks in advance for your thoughts!

agramfort commented 7 years ago

please share a file so we can easily come up with a bug fix.

I personally think that renaming the electrodes is a bad idea in the end...

jaeilepp commented 7 years ago

AFAIK the egi file format does not contain the channel names, so the names are 'guessed' anyway. Maybe changing this line to _check_update_montage(info, montage, update_ch_names=True) would fix it. @espressofiend can you try?

mmagnuski commented 7 years ago

I also always have to change channel names to the correct ones (E1, E2, etc.) to be able to apply a montage. Another thing I noticed is that the digital input (DIN) LPT events are not read correctly into the stim channel (but the DIN1, DIN2, DIN4, etc. channels are OK). I would have to dive into this a little more, and then I'll open a separate issue.
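
For reference, the renaming workaround can be sketched roughly as follows. The helper name egi_rename_map is hypothetical, and the sketch assumes the imported names follow the 'EEG 001' pattern discussed in this thread; any other names (such as the stim channel) are left alone.

```python
import re


def egi_rename_map(ch_names):
    """Map names like 'EEG 001' to 'E1'; names that do not match are skipped."""
    mapping = {}
    for name in ch_names:
        match = re.fullmatch(r'EEG (\d+)', name)
        if match:
            # int() strips the leading zeros: '001' -> 1
            mapping[name] = 'E%d' % int(match.group(1))
    return mapping


mapping = egi_rename_map(['EEG 001', 'EEG 002', 'EEG 129', 'STI 014'])
print(mapping)
# With an MNE Raw object, the mapping would then be applied with
# raw.rename_channels(mapping) before setting the montage.
```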

espressofiend commented 7 years ago

Thanks everyone. @jaeilepp unfortunately your suggestion did not work. It returns:

/Users/aaron/dev/mne-python/mne/io/utils.py in _mult_cal_one(data_view, one, idx, cals, mult)
     34     else:
     35         if isinstance(idx, slice):
---> 36             data_view[:] = one[idx]
     37         else:
     38             # faster than doing one = one[idx]

ValueError: could not broadcast input array from shape (793,63131) into shape (132,63131)

...which stems from the fact that each unique event code gets imported as a separate channel. These are all combined into STI 014, which gets added as the very last channel in the array, but you do get a bunch of "extra" channels which seem to interfere with mapping the montage using @jaeilepp's suggestion. Even beyond that, I would anticipate problems because my data sets have 129 channels, but the montage file specifies three fiducial channels as well (hence the 132 in the above error message).

On the channel front, I will note that E1, E2,... is the EGI convention for labelling their channels. So it does make sense to retain that naming scheme; a simpler fix would be to just change line 252 of egi.py to ch_names = ['E%d' % (i + 1) for i in ...]. I had kept the "EEG" plus leading zeros format really just for visual elegance (and because, initially, I didn't know whether there was some independent reason internal to MNE for using this naming; I don't see that there is). I've tried this 'E%d' fix and it works fine with the existing montage files.

@agramfort I'm happy to share the files, but the individual .raw files are > 100 MB and collectively the data from one subject are over 1 GB. GitHub won't allow me to attach them - is there another preferred way to share them?

There's actually a related issue with the codes that I might as well bring up now, since @mmagnuski raised this and it is relevant to channel mapping (but doesn't interfere with the montage assignment). In this study, we used the TCP/IP mode of sending codes, so the codes show up as 4-character strings, not DINx (which I believe is what you get using TTL triggers). As I noted, each code gets imported as a separate channel. In our study, we plan on doing item-level analyses with linguistic stimuli, so we have hundreds of unique event codes. This is not really an issue, except that different segments of data will have different codes, and even different numbers of codes. Thus each .raw file segment has a different number of channels (the same 129 EEG channels, but then different numbers/identities of event code channels). So prior to running mne.concatenate_raws, I have to run

for raw in raw_files:
    # Drop the event-code channels (index 129 up to, but not including, the last),
    # keeping the 129 EEG channels and the final STI 014 channel.
    raw.drop_channels(raw.ch_names[129:-1])

This preserves the last channel, STI 014, which has all the event codes in it (apparently contra @mmagnuski's experience; all I can say there is that it works for me :) ). This works absolutely fine, so I'm not sure there's a particular need for changes to MNE in this case.
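
The channel selection described above can be checked without any data files. This is only a sketch on plain lists of names: the event-code labels ('wrd1' and so on) are invented for illustration, and channels_to_drop is a hypothetical helper, not an MNE function.

```python
N_EEG = 129  # number of EEG channels in these recordings, per the comment above


def channels_to_drop(ch_names, n_eeg=N_EEG):
    """Return the event-code channels between the EEG block and the final stim channel."""
    return ch_names[n_eeg:-1]


eeg = ['EEG %03d' % (i + 1) for i in range(N_EEG)]

# Two hypothetical segments with different numbers of 4-character event codes.
segment_a = eeg + ['wrd1', 'wrd2', 'wrd3'] + ['STI 014']
segment_b = eeg + ['wrd7'] + ['STI 014']

kept_a = [n for n in segment_a if n not in channels_to_drop(segment_a)]
kept_b = [n for n in segment_b if n not in channels_to_drop(segment_b)]
print(kept_a == kept_b, len(kept_a))
```

After the drop, both segments carry the same 130 names (129 EEG channels plus STI 014), which is the condition needed before concatenating them.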

larsoner commented 7 years ago

> On the channel front, I will note that E1, E2,... is the EGI convention for labelling their channels.

It sounds like it would be a bugfix, then, to use these channel names. But I wonder if it would break a lot of people's code. We could make channel_naming='E%d' the new default (a backward-incompatible bugfix), and tell people they can pass channel_naming='EEG %03d' to get the old behavior back. WDYT?
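
A rough sketch of how such an option might behave is below. The make_ch_names helper is hypothetical and stands in for the naming line in egi.py; it is not the actual reader code, and only the channel_naming argument and its two example values come from the proposal above.

```python
def make_ch_names(n_channels, channel_naming='E%d'):
    """Build channel names from a printf-style template (hypothetical helper)."""
    return [channel_naming % (i + 1) for i in range(n_channels)]


# Proposed new default: names that match the montage files.
new_default = make_ch_names(3)

# Opting back in to the old naming scheme.
old_style = make_ch_names(3, channel_naming='EEG %03d')

print(new_default, old_style)
```

Keeping the template as an argument means existing scripts that rely on the 'EEG 001' style would only need to pass one extra keyword.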

agramfort commented 7 years ago

Aaron, please put the file on Dropbox / Google Drive and share the link with us.

espressofiend commented 7 years ago

@Eric89GXL I think that's a suitable bugfix - anyone who made it work before was necessarily doing some sort of workaround. I guess it's hard to speculate how it would break people's code - since the labels being assigned are incompatible with the montage files, I imagine that people are either relabelling the channels after import and before assigning the montage (as @mmagnuski indicated), or have hacked the egi.py code or the montage files. At any rate, it seems kind to preserve the option of having the old behaviour, at least for a while.

Here's the link to the data: https://drive.google.com/open?id=0B_DSuQ5M_LbXNV9uUDZ0UHRfazQ

mne-tools / mne-python

EGI .raw file import - ch_names/montage file incompatibility #4076