NDCLab / pepper-pipeline

tool | Python Easy Pre-Processing EEG Reproducible Pipeline
GNU Affero General Public License v3.0
3 stars 3 forks source link

Modify dev-feature-io to write electrodes.tsv and coordsystem.jason files #18

Closed georgebuzzell closed 3 years ago

georgebuzzell commented 3 years ago

Currently, dev-feature-io writes all bids files except for the electrodes.tsv and coordsystem.jason files. we need to modify the section of code writing to bids to also write these two files. This may or may not be an option in mne-bids, if not, we may have to manually create these. Please comment on here with suggestions, etc.

image

EEG-bids specification paper: https://www.nature.com/articles/s41597-019-0104-8

georgebuzzell commented 3 years ago

This issue is still relevant, but to implement these changes, please refer to the latest comments (and code) in https://github.com/NDCLab/baseEEG/issues/16

@SDOsmany @DMRoberts @yanbin-niu

georgebuzzell commented 3 years ago

This issue is no "merged" with this other primary issue: https://github.com/NDCLab/baseEEG/issues/16

No need to work on the current issue (#18) independently of #16

@DMRoberts

yanbin-niu commented 3 years ago

@DMRoberts I just read this (https://mne.tools/dev/auto_tutorials/intro/plot_40_sensor_locations.html#sphx-glr-auto-tutorials-intro-plot-40-sensor-locations-py), so I think the following code should work :

montage_file = mne.channels.make_standard_montage('GSN-HydroCel-129')
raw_129 = raw.copy().set_montage(montage_file)

I run into an error "ValueError: DigMontage is only a subset of info. There are 1 channel position not present in the DigMontage. The required channels are:['E129']."

I double checked the GSN-HydroCel-129.sfp, and last electrode name is "Cz". But, our data has "E129". I think this is why it threw the error. I tried either editing "GSN-HydroCel-129.sfp" or changing Channel name in our raw data. But it just did not work.

I also tried to use on_missing parameter: raw_129 = raw_129 = raw.copy().set_montage(montage_file, on_missing = 'warn') it did write electrodes.tsv and coordsystem.jason. But it missed the E129 info.

image image

Do you have any thoughts on this?

DMRoberts commented 3 years ago

@yanbin-niu , good question.

@georgebuzzell Do you know the model of cap used for the experiment? As noted by @yanbin-niu , the ‘GSN 129’ montage expects electrode Cz (which we don't have) and does not expect electrode E129 (which we do have). Is Cz labelled E129 here, or is that not the correct montage? GSN 129 is described as:

GSN-HydroCel-129 | HydroCel Geodesic Sensor Net and Cz (129+3 locations)

I see there is also a 'EGI_256' montage file, described as:

EGI_256 | Geodesic Sensor Net (256 locations)

Which seems to load without complaining that we only have 129 channels, however you mentioned that there are several EGI caps which use the same numbers in different locations, so I'm not sure if either of these are the correct cap used.

DMRoberts commented 3 years ago

@georgebuzzell I saw this comment in the MNE BIDS documentation. I think based on that, we probably should not write electrodes.tsv and coordsystem.json for this dataset, since the electrodes aren't digitized? We would still want to indicate the montage as @yanbin-niu noted above, though:

https://mne.tools/mne-bids/dev/auto_examples/convert_eeg_to_bids.html#convert-to-bids

Note

The electrodes.tsv and coordsystem.json files in BIDS are intended to carry information about digitized (i.e., measured) electrode positions on the scalp of the research subject. Do not (!) use these files to store “template” or “idealized” electrode positions, like those that can be obtained from mne.channels.make_standard_montage()!

georgebuzzell commented 3 years ago

@DMRoberts I am about 95% sure that the correct montage is indeed the GSN-129. What we need to do is simply change the name of the last channel from "CZ" to "E129". Does that make sense?

georgebuzzell commented 3 years ago

@DMRoberts that is great regarding the electrodes.tsv and coordsystem.json files. Yes, I agree that we should not write those, in this case. I also agree that we still want to indicate the montage file, as you and @yanbin-niu indicated. Please do include it, as @yanbin-niu indicated, with the slight edited that I suggested (edit the GSN-129 montage file such that the last electrode is labeled E129 in the montage file, as opposed to Cz.

DMRoberts commented 3 years ago

@georgebuzzell I looked inside the .MFF directory, and the layout is listed there as 'HydroCel GSN 128 1.0' so that must be it. Rather than creating a new montage file with Cz changed to E129, maybe we should just rename the channel E129 to be indicated as Cz, so we can use what seems to be the standard montage file for this setup?

georgebuzzell commented 3 years ago

@DMRoberts that works too! Thanks!

yanbin-niu commented 3 years ago

@DMRoberts @georgebuzzell Sorry for being delayed on this (had a family day today). Yesterday I tried both: 1. changing montage file (Cz->E129); 2. changing channel name in the raw data (E129->Cz). Neither worked, so I left a comment. Sorry for not making this point clear.

However, just now I relooked at this issue. Actually both works: 1. we can change the montage file (I had multi mne installed, yesterday I changed a montage file that was not being used by my mne environment); 2. changing channel name in the raw data. To do this, we have to change two places:

raw.info['chs'][128]['ch_name']='Cz'
raw.ch_names[128]='Cz'

Just changing one of the places will not work (this is why I did not get it right yesterday).

I personally prefer the first way, since I think the raw data is complicated and I am worried that changing the channel names in some places will have unknown influences on other data. I am not sure whether this makes sense.

DMRoberts commented 3 years ago

@yanbin-niu no problem!

MNE has a rename_channels method to change a channel name, which should take care of changing the name in all the necessary places. After that we can use the montage you identified. It also looks like when we use a standard montage like the GSN Hydrogel 129, we can just specify it's name to the set_montage method, and don't need to do the intermediate step of create_montage:

# associate montage
raw.rename_channels({'E129': 'Cz'})
raw.set_montage('GSN-HydroCel-129')

I can add these lines to a branch I'll push later today, which also incorporates the other MFF to FIF to BIDS changes.

yanbin-niu commented 3 years ago

@DMRoberts I did not realize there is a rename_channels method! Then I think it looks great and makes sense!

yanbin-niu commented 3 years ago

@DMRoberts @georgebuzzell A little bit more thoughts on the point that Dan mentioned (about digitized electrodes). I did read that comments in the MNE BIDS documentation early on, but I did not quite understand it at that time. Now I think it just says that Do not use the electrodes.tsv and coordsystem.jason to indicate the template montage or standard electrode positions, rather, those files should indicate the "real" information about electrode positions on the scalp of the research subject.

Then I think our data may have the "real" information, and set_montage method would help us adjust / "standardize" those information based on the montage being used with our cap.

Before we set_montage, the chs data looks like this, for example for E1:

image

After we set_montage, the chs data looks like this, still for E1:

image

which is the data in the electrodes.tsv.

image

So I think we are good. (I just want to make sense of the comments in the mne BIDS documentation)

georgebuzzell commented 3 years ago

@DMRoberts thank you! Yes, please do add the lines of code as you suggest!

@yanbin-niu thank you for looking into this! One comment/clarification: it sounds like you are saying that we do have the real (individual) digitized locations for each participant and that we should therefore create a coordinate system file and an electrodes file? However, I dont think we actually have those (but please point out where they are if I missed them!). Thus, I actually agree with @DMRoberts that we do not need to write an electrodes file, nor a coordinate system file. We only need the montage file is all. However, please follow up if there is confusion on my end, or if you have additional questions.

Thanks again, to both of you!!

yanbin-niu commented 3 years ago

@georgebuzzell Thanks for the clarification ! I still do not quite understand.... I put them into several questions:

  1. Does "digitized locations" mean the actual, measured locations for each participant?

  2. If like you said, we do not have those "digitized locations". In mne.info documentation, it says 'loc' parameter refers to "Channel location. For MEG this is the position plus the normal given by a 3x3 rotation matrix. For EEG this is the position followed by reference position (with 6 unused). The values are specified in device coordinates for MEG and in head coordinates for EEG channels, respectively". so, do you mean the data in the following image is the reference position rather than the "digitized location"? Every participant has the exact same value? image

  3. After we set_montage, two files (electrodes.tsv and coordsystem.jason) will be automatically generated in our BIDS files (under eeg folder). So, I do not understand what did you mean by "we do not need to write an electrodes file, nor a coordinate system file".

No pressure to respond, or you can point out something I should read first.

DMRoberts commented 3 years ago

@yanbin-niu on

1, Digitized locations would mean actual measured locations on the participant, vs. just using the standard montage positions that would be constant across participants with the same montage.

3, I also see that mne-bids is outputting electrodes.tsv and coordsystem.json, after we added the montage information. This is confusing since the MNE BIDS documentation says those two files should only exist if a non-standard montage is used: https://mne.tools/mne-bids/dev/auto_examples/convert_eeg_to_bids.html#convert-to-bids . Maybe these datasets do have digitized electrodes? Or else we may be specifying something incorrectly.

georgebuzzell commented 3 years ago

@yanbin-niu

For (1) I agree with Dan here. When I say "digitized locations" I mean actually measuring the 3-dimensional coordinates of the electrodes, placed on an individual subject, relative to individually-measured head landmark locations. This is different than an estimate of where the electrodes are, which is what is given by the montage file.

For (2) I need to look into the mne.info documentation further to make sure I am saying the right thing here! :)

For (3) I agree that is confusing that the electrodes and coordinate system files are being generated. @DMRoberts and @yanbin-niu I would lean toward the assumption that we are specifying something incorrectly, as even if there were digitzed locatio0ns (whcih I dont think there are) I dont recall anyone referencing them in some way, so not sure how mne bids would have access to that information. I will look closer at both the MNE.info and MNE BIDS documentation.

@yanbin-niu Thank you for these questions! Encouraging us all to understand MNE at a deeper level, which is really helpful!!

yanbin-niu commented 3 years ago

@georgebuzzell @DMRoberts Thanks for the response !! I really learned a lot from it.

A little bit more thoughts on this:

  1. I think I got why the electrodes and coordinate system files are being generated. The "dig" attribute in the raw data was set after set_montage(before set_montage, the raw data did not have this attribute), and it refers to the digitized electrodes. When MNE can read values from the "dig" attribute, MNE would think those values are digitized electrodes and put them into the electrodes and coordinate files. That is being said, when we use set_montage to set a standard montage, we actually wrote standard values for digitized electrodes.

    image
  2. did you remember how we add .mff extension to the folder? If we open that folder, there are many .xml files. One of files is coordinates.xml. Do you think, by any chance, this might be digitized electrode positions? (if so, we can read this xml and set it to our raw data, according to https://mne.tools/stable/generated/mne.channels.read_dig_egi.html#mne.channels.read_dig_egi)

    image
DMRoberts commented 3 years ago

@yanbin-niu that's interesting - one thing that's strange is in that coordinates.xml file, at the bottom, there is:

    <acqTime>2006-03-01T09:33:00.000000000-05:00</acqTime>
    <acqMethod>An Average of Many Data Sets</acqMethod>
    <defaultSubject>true</defaultSubject>

I wonder if this is some sort of default EGI montage that is being interpreted by MNE as digitally scanned electrode positions? @georgebuzzell would a EGI recording without electrode digitization usually have that file?

georgebuzzell commented 3 years ago

@yanbin-niu and @DMRoberts Very helpful comments and insights from both of you here! Here is what I will do... I will double-check the documentation for the CMI data we are working with, as well as check an EGI file that I know does not have any digitized coordinates. I think that @DMRoberts is correct that this coordinates file is likely a default montage, however, I will confirm this.

georgebuzzell commented 3 years ago

@DMRoberts and @yanbin-niu I checked a separate mff file that I am certain did not have digitized electrodes and I can confirm that the same coordinates.xml file is present. It appears that this is indeed just a default digitization file, which is the same for all participants, and not actual digitizations for this individual participant.

https://docs.exbuilder.org/project-components/analyses

I think we now need to figure out why mne is writing the coordinate system and electrode files when we add the montage. I am not sure if it is pulling from this coordinates.xml file, or, if it is something we are doing incorrectly in mne.

yanbin-niu commented 3 years ago

@georgebuzzell From my understanding, mne wrote the coordinate system and electrode files because we used the set_montage method, which is supposed to be a method to set up actual digitizations. So the current coordinate system and electrode files are pulling from standard montage file - 'GSN-HydroCel-129', which is different from the default digitization file in the .mff.

If we do not want mne to generate the coordinate system and electrode files, we just do not set_montage for it. My question is,

  1. why we want to set up a standard montage file? Is this because we did not have digitized electrodes but we need this coordinate data later in the processing? so we set up a standard montage for it.
  2. So, I assume our problem is : we want to use the data from standard montage to do later processing when needed, BUT we do not want mne to write this standard montage into the coordinate system and electrode files, because this does not follow the MNE_BIDS standard way? (the coordinate system and electrode files are supposed to be actual digitizations)
DMRoberts commented 3 years ago

@yanbin-niu @georgebuzzell

Usually (for example in other software like EEGLAB), setting the montage associates the electrode names with electrode spatial (x,y,z) positions, so the data can be analyzed / plotted topographically.

I think set_montage is for more than just digitized montages, because the function accepts either a DigMontage object with digitized locations, or a string name that indicates the standard montage to use (ie 'GSN-HydroCel-129'), which is internally passed to get_standard_montage. It seems like it wouldn't make sense to allow people to request standard montages positions if the function was only for digitized montages: https://mne.tools/stable/generated/mne.io.Raw.html#mne.io.Raw.set_montage

The problem may be that once the montage is created, MNE doesn't seem to distinguish between standard and digitized montages. For example even creating a standard 10-20 system montage returns a DigMontage object:

montage = mne.channels.make_standard_montage('standard_1020')
<DigMontage | 0 extras (headshape), 0 HPIs, 3 fiducials, 94 channels>

So perhaps once the montage is created, the MNE-BIDS exporter can't tell if the montage is standard or digitized (unique to a participant)?

I do see now that the channel locations have been read in from the MFF on data import, before we even set the montage. So I think as @yanbin-niu mentioned we can probably skip setting the montage, which would also then skip the generation of electrodes.tsv and coordsystem.json . We could always add the montage at a later processing stage if another processing step requires it.

raw = mne.io.read_raw_egi(input_path, preload=True, verbose=True, exclude=[])
raw.plot_sensors()

Figure_1

yanbin-niu commented 3 years ago

@DMRoberts I agree that set_montage accepts a string name that indicates the standard montage, and should be for more than just digitized montages.

Why I guess set_montage is supposed to write digitized montages is because, after we use set_montage, the "dig" attribute will be written and this is why mne thinks it gets the actual digitization data and decides to generate the coordinate system and electrode files. The "dig" attribute is for actual digitization. https://mne.tools/stable/generated/mne.Info.html?highlight=info#mne.Info

So, I agree we can skip setting the montage and add the montage later if needed.

georgebuzzell commented 3 years ago

@DMRoberts and @yanbin-niu Perfect! Thank you for figuring this out! Yes, I definitely agree now, as well, that we can skip "set montage" for now, given that way, the coordsystem.json and electrodes.tsv files will NOT be automatically written.

@DMRoberts or @yanbin-niu can one of you please update dev-feature-io-dan to: 1) no longer use "set montage" and 2) to reference the new pheno file that I added? Once those changes are done, if you can please do a pull request to dev-feature-io I can test it out to confirm.

I do think we should move forward, at least for now, as suggested above. However, I still have a lack of clarity around where the standard montage is stored for the bids data. I checked the meta data files (vhdr, as well as eeg.jason and channels.tsv) and none of these list the montage. I can understand if the actual montage coordinates are not listed, but should there not at least be a location, somewhere in the BIDS metadata to at least specify the name of the appropriate standard montage file?

georgebuzzell commented 3 years ago

@DMRoberts and @yanbin-niu I just looked in the example file that we are using for the other features currently under development (the "matching pennies data set). It looks like the location of where we should store the NAME of the location file is in the _eeg.json file. See here:

image

So, to update the request for my prior comment, and to "finish" dev-feature-io can one of you please do the following:

update dev-feature-io-dan to: 1) no longer use "set montage" 2) to reference the new pheno file that I added 3) to write the montage name into the _eeg.json file

Once those changes are done, if you can please do a pull request to dev-feature-io I can test it out to confirm.

Thank you, thank you, thank you!!