NDCLab / pepper-pipeline

tool | Python Easy Pre-Processing EEG Reproducible Pipeline
GNU Affero General Public License v3.0
3 stars 3 forks source link

Discussion on coordinate values in the MNE #260

Open yanbin-niu opened 3 years ago

yanbin-niu commented 3 years ago

First, coordinate values are used to compute distances between electrodes and identify electrodes within the frontal and postural areas, which matter for both FASTER and ADJUST;

Second, eeglab uses values it reads from the montage file;

Third, MNE does extra conversion on values read from the montage file: 1) scaling:

image

2) transforming:

image

In sum, MNE converts coordinate values that are read from the montage file (looked everywhere for the reasons, but could not figure it out). So, my question is: should we consider to read coordinate values from the montage file or should we stick to the values provided by the MNE?

Appreciate any thoughts!

@DMRoberts @georgebuzzell

DMRoberts commented 3 years ago

@yanbin-niu when you mention that

eeglab uses values it reads from the montage file

Is the montage the "GSN_HydroCel_129.sfp" file?

I believe the section of code you highlighted in MNE is scaling the coordinate positions in the coordinate file relative to the participant head size, but there is a default value that is used if it's not provided.

Beyond the scaling, do the values seem the same between the two? I believe EEGLAB can use a few coordinate systems (polar, spherical, cartesian) though I'll have to check again tomorrow which is the default internally.

yanbin-niu commented 3 years ago

@DMRoberts Thanks for taking a look at this!

Yes, I mean, in eeglab, I can set channel locations by selecting the "GSN_HydroCel_129.sfp" file. I think this is how to set channel locations? Eeglab uses values in the "GSN_HydroCel_129.sfp" file as x, y and z (the cartesian coordinate), and calculate the polar, spherical coordinate values based on these.

When you said

I believe the section of code you highlighted in MNE is scaling the coordinate positions in the coordinate file relative to the participant head size

do you mean the first one I highlighted or the second one? For the first one, MNE uses 95mm as the headsize and scale = 95mm / np.median(np.linalg.norm(pos, axis=-1)). From what I can see, it seems that MNE does not read the participant's head size or pass the participant's head size when we set up the standard montage file (I think we probably can control this argument in other places?).

image image

So, for the first one, 95mm is used to divide np.median(np.linalg.norm(pos, axis=-1)). From here, the coordinate values already look different.

georgebuzzell commented 3 years ago

Thanks @DMRoberts

@yanbin-niu yes, as @DMRoberts stated, the values in mne are produced by scaling the montage file. However, also as noted. It appears that default head parameters are used for all participants unless otherwise provided.

I am not sure if eeglab scales the data at all or just uses the raw montage values.

Ultimately, we want to use the values that mne computes after scaling the montage. In the future, we can further test whether we obtain better results when actually providing realistic head parameter values. E.g. this may help adjust perform better for child data. But for now. Default are fine.

In sum, please move forward with using the scaled values that mne computes.

yanbin-niu commented 3 years ago

@georgebuzzell Thanks for commenting!

I am not sure if eeglab scales the data at all or just uses the raw montage values.

-- Eeglab does not scale values that are read from the "GSN_HydroCel_129.sfp" file (from what I can see).

In sum, please move forward with using the scaled values that mne computes.

-- MNE only provides x, y and z (the cartesian coordinate). I tried to calculate the polar coordinates based on the x, y and z provided, since we want to use the values for theta and radius to identify the channel areas. But, the results are different (since the x, y and z are already different.). This means, when we implement ADJUST, the channels that would be identified as falling in right eye, left eye areas, frontal and back areas are slightly different .

georgebuzzell commented 3 years ago

@yanbin-niu yes, they will be slightly different because the standard montage has been scaled to the head parameters. This is correct behavior.

I think I see your point though... ultimately, we WILL most likely want to use these scaled values (converted back to polar). But, for now, to allow ease of comparison to the prior matlab implementation, I agree that we should actually just use the raw values to match. Once we have faster working this way, with the raw values, we should seek to optimize our implementation further via use of the scaled values (and scaling for different head sizes).

@yanbin-niu does that all make sense? Do you agree?

DMRoberts commented 3 years ago

@yanbin-niu @georgebuzzell Looking at this a bit more:

The EGI .sfp location file contains the electrode positions in cartesian (X,Y,Z) coordinates: http://fcon_1000.projects.nitrc.org/indi/cmi_healthy_brain_network/File/_eeg/GSN_HydroCel_129.sfp

EEGLAB also displays cartesian coordinates, however the frame of reference for the cartesian coodinates differs between the EGI file and EEGLAB, so EEGLAB transforms them on import from the SFP. From the readlocs.m documentation: https://github.com/sccn/eeglab/blob/develop/functions/sigprocfunc/readlocs.m

%   '.xyz': 
%               Matlab/EEGLAB Cartesian coordinates. Here. x is towards the nose, 
%               y is towards the left ear, and z towards the vertex. Note that the first
%               column (x) is -Y in a Matlab 3-D plot, the second column (y) is X in a 
%               matlab 3-D plot, and the third column (z) is Z.
%               Fields:   channum   x           y         z     label
%               Sample:   1       .950        .308     -.035     Fp1
%                         2       .950       -.308     -.035     Fp2
%                         3        0           .719      .695    C3
%                         4        0          -.719      .695    C4
%                           ...
%   '.sfp': 
%               BESA/EGI-xyz Cartesian coordinates. Notes: For EGI, x is toward right ear, 
%               y is toward the nose, z is toward the vertex. EEGLAB converts EGI 
%               Cartesian coordinates to Matlab/EEGLAB xyz coordinates. 
%               Fields:   label   x           y          z
%               Sample:   Fp1    -.308        .950      -.035    
%                         Fp2     .308        .950      -.035  
%                         C3     -.719        0          .695  
%                         C4      .719        0          .695  

I believe MNE also transforms the values to a new frame of reference, so neither EEGLAB nor MNE would match the .SFP once imported. Since the coordinate systems don't match between EEGLAB and MNE, it seems that we might want to check how the coordinates are being used in Adjust or FASTER, and change them accordingly?

yanbin-niu commented 3 years ago

@DMRoberts Thanks for bringing this up! Sorry I totally forgot about this (it's been a while). You are right that the EEGlab changes signs and exchanges the x and y values. But before your explanation, I did not know the reason. Thanks Dan.

I am pretty sure the coordinates being used in Adjust and FASTER are EEG.chanlocs. So, if the user set channel locations up by using pop_chanedit, the coordinates used in Adjust and FASTER are "EEGlab way". And I think, in most cases, people would set channel locations up during preprocessing?

yanbin-niu commented 3 years ago

Comments from Slack for future reference:

Coordinates values and channel locations from MNE (MNE only provides x, y and z. theta and radius are computed by me): image image

Coordinates values and channel locations from eeglab: image image

George and Yanbin's chat history:

George Buzzell 2 hours ago Oh, this is because it is mapping the montage file onto the reference frame fo mne, and different reference frames have different conventions for left vs right. Regardless. Unless u disagree, I think we should do now ignore the internal values and just use the raw values. This allows easier comparison with matlab based adjust and faster. Then, once we are at least performing as well, we can revisit use of the internal scaled values. Agree?

Yanbin Niu 2 hours ago @George Buzzell if using the raw values is just for the comparison. I think we do not have to. since I have already done the comparsion. I think the algorithms now are exactly same.

Yanbin Niu 2 hours ago I saved the activation matrix, mixing matrix and channel locations as raw data. and work from those exact same data from both python and matlab. I got exactly same results for every step. (forget to mention there might be a very minor issue in eeglab adjust. - if you are interested, I can send you later)

George Buzzell 2 hours ago Ok, great! Basically, more testing is needed before updating to include the internal values. So, I would suggest "finishing" a version .1 of your implementation of adjust, and fully integrate into the pipeline and document, all using the raw values. Then, work on the other "critical" updates we still have to make. Then. Only once we have all parts of the pipeline at .1, and all the testing suite set up, etc, then we can start upgrading and testing improvements for particular features, one would be upgrading to the internal values

yanbin-niu commented 3 years ago

With @DMRoberts 's comments, now I realize that if we decide to go with eeglab's frame of reference, we might need sightly different mapping codes for different EEG systems. But sounds doable since it does not change a lot from the original montage file at least for EGI .sfp file (only change directions). So we want to do this for EGI .sfp file for now, and add more (other systems) in the future? @georgebuzzell

DMRoberts commented 3 years ago

@yanbin-niu I think that it might be confusing in the long run to try to map everything back to the EEGLAB style coordinate. If possible we could just make a clean break from the EEGLAB coordinates and re-implement the logic that FASTER or Adjust use for electrode distance, but with the MNE frame of reference? I think despite the change of reference, the relative position between electrodes should be the same or close to the same.

Would you post a snippet of what Adjust and/or FASTER do with the EEGLAB location data?

yanbin-niu commented 3 years ago

@DMRoberts Sure. Basically, Ajust and FASTER use distances between electrodes by sqrt(x**2+y**2+z**2). Plus, Adjust uses theta and radius values (the polar coordinate) that are computed from x, y and z to find electrodes falling in Frontal Area, Posterior Area, Left and Right area. See below:

image image

I agree that ideally, we want to use the MNE style coordinate. Since EEGlab only changes directions, computing distances and the polar coordinate should not be different (compared with the EGI .sfp file). But MNE scales x, y and z values, so, distances computing would be different (compared with the EGI .sfp file). Is there a way to adapt to the MNE style coordinate? Or MNE style coordinate is actually more accurate/preferable?

DMRoberts commented 3 years ago

@yanbin-niu as you mentioned the swapping of axes and axis direction between the .sfp and EEGLAB coordinates doesn't affect the measure of distance.

The measure of distance itself is the euclidean norm, so after subtracting x,y,z from x,y,z we can use this numpy function for the euclidean norm which should be faster than squaring, summing, and taking the square root as done in the FASTER code:

np.linalg.norm()

The scaling also doesn't effect the sorting of distances, since that is all relative.

I think all that we need to do is adjust the 'topografie' weight matrix to be equivalent in units between the two. In MNE, I see that with the default head radius ('head_size') of 9.5 centimeters (.095 meters), the .sfp values are all scaled by a constant value of 0.010827. That scaling factor is computed via:

  scale = head_size / np.median(np.linalg.norm(pos, axis=-1))
        for value in ch_pos.values():
            value *= scale

From https://github.com/mne-tools/mne-python/blob/977397f575f8f37177bb62a05d272f2aa1874a6e/mne/channels/_standard_montage_utils.py#L172

I think what is happening here is that np.median(np.linalg.norm(pos, axis=-1) finds the radius of the head in the SFP coordinates, and the scaling factor is then calculated that will adjust the radius to be equal to .095 meters. I believe the scaling factor is so small (0.010827) mainly because MNE coordinates are represented in meters, but I believe the SFP file is in centimeters. If both were in the same units, the scaling factor would only be 1.08.

If this is the case, then I think the only adjustment we might have to make is adjusting the weights in 'topografie' to be in meters, rather than centimeters, by dividing them by 100? I haven't looked at what the 'topografie' values actually look like.

@georgebuzzell do you happen to know what the units of the channel locations in EEGLAB are? I had never had to deal w/ that, but I thought you might have when integrating EEG with fMRI. Is my assumption that the channel locations in EEGLAB are in cm correct?

DMRoberts commented 3 years ago

On the identification of where the electrodes are, it looks like ADJUST is using the 2D polar representation of the electrodes (theta and radius). In MNE there is a function to convert the cartesian coordinates to the (3D) spherical polar representation. I think we could either identify a method to convert to the 2D polar representation as well, or identify where the various points of interest are in spherical polar space.

An example of converting to spherical polar space (there may be a more efficient way to do this):

import mne
import numpy as np

montage = mne.channels.make_standard_montage('GSN-HydroCel-129')
channel_names = montage.ch_names
num_channels = len(channel_names)
cartesian_locations = np.empty([num_channels, 3])

for idx, c_name in enumerate(channel_names):
    cartesian_locations[idx, :] = montage.get_positions()['ch_pos'][c_name]

spherical_locations = mne.transforms._cart_to_sph(cartesian_locations)
yanbin-niu commented 3 years ago

@DMRoberts

  1. Using np.linalg.norm() is a great idea to compute distances;

  2. Huge thanks for bringing up reading values from the montage instance. I was using the values from raw.info['chs'] and I thought it has same values with the values from montage.get_positions()['ch_pos'] (but in reality, no). Values in raw.info['chs'] go through a further transforming step besides the scaling code you mention. Now, I agree we can use the x, y and z values from MNE to compute distances (after dividing them by 100).

  3. Big thanks for providing the _cart_to_sph method in MNE. Then I think it should not be an issue to get theta and radius in 2D polar system. Since technically speaking, azimuth should be the theta and radius can be computed from sqrt(x^2 + y^2), right?

However, further steps are need to implement the logic that Adjust uses to identify where the electrodes are. Specifically, in eeglab, x is toward nose; y is toward left ear; z is toward vertex. But if we use the montage instance, x is toward right ear; y is toward nose; z is toward vertex. So,

a) for frontal area (in green), the original angular range is 0 < |a| < 60 in eeglab. In MNE, we should change to 30 < a < 150, right? b) for posterior are (in blue), the original angular range is 110 < |a| < 180 in eeglab. In MNE, we should change to -150< a < -30, right? c) for left eye are (in yellow), the original angular range is -61 < a < -29 in eeglab. In MNE, we should change to 119 < a < 151, right? d) for right eye are (in purple), the original angular range is 29 < a < 61 in eeglab. In MNE, we should change to 29 < a < 61, right?

image

Furthermore, radius ranges are also needed to change. I am confused by the unit being used in the ADJUST paper, for example:

in the frontal area (FA) (radial range: 0.4 < r < 1; angular range from medial line: 0 < |y| < 60

I checked the reallocs (https://github.com/sccn/eeglab/blob/develop/functions/sigprocfunc/readlocs.m). it says:

head disk radius is 0.5

So, I assume the unit mentioned in the paper, say 0.4, should be same with the 0.5 that is mentioned in the readlocs? Does this mean, we can use, approximately, 9.5 cm * 0.4/0.5 as the range? (just approximate, since 9.5 is the spherical radius not the disc radius)

DMRoberts commented 3 years ago

@yanbin-niu , on:

2. Huge thanks for bringing up reading values from the montage instance. I was using the values from raw.info['chs'] and I thought it has same values with the values from montage.get_positions()['ch_pos'] (but in reality, no). Values in raw.info['chs'] go through a further transforming step besides the scaling code you mention. Now, I agree we can use the x, y and z values from MNE to compute distances (after dividing them by 100).

After you mentioned raw.info['chs'], I checked there as well, and what is strange is that some of the coordinates, such as X, and Y for many of the electrodes, have nearly identical values as in the montage object. But the Z values are different. I imagine the reference point is different between the two, but I don't quite understand the distinction between raw.info['chs'] and the montage yet. You might be correct that the raw.info['chs'] are the ones to use ( I saw you can get the coordinates from info['chs'] via raw._get_channel_positions() ).

I'll look into this some more, and also the question about identifying the electrode areas of interest.

yanbin-niu commented 3 years ago

Thank you @DMRoberts ! It seems related to converting into the neuroimage head coordinate system. The values in the raw.info['chs'] go through another transformation matrix that maps an RAS coordinate system to the MNE head coordinate system, in which the x axis passes through the two preauricular points, and the y axis passes through the nasion)

https://github.com/mne-tools/mne-python/blob/maint/0.23/mne/channels/montage.py#L420

Not really understand why they do this and how this neuroimage head coordinate system works. But, as you mentioned early on, I think we can use montage.get_positions()['ch_pos'] for now? (rather than raw.info['chs'])

DMRoberts commented 3 years ago

@yanbin-niu I think for the distance measure, it won't matter if we used the montage coordinates or the raw.info coordinates, since the relative positions between electrodes will be the same. It might be easier to use the raw.info coordinates since they can be easily accessed a numpy array via raw._get_channel_positions(). I believe then multiplying those distance values by 100 to transform from meters to centimeters should give us distance values equivalent to ADJUST.

yanbin-niu commented 3 years ago

@DMRoberts That's a good point!

DMRoberts commented 3 years ago

@yanbin-niu on your question about the units of the EEGLAB 2D radius, I'm not sure why 0.5 is used for the head disk radius. I tried to think about the conversions above, however I thought actually maybe it is best after all to recreate the EEGLAB coordinates first, and then simplify the process later.

In terms of locating the regions of interest, I believe in EEGLAB we have:

X,Y,Z: Cartesian coordinates of electrodes in 3D space theta, radius: Spherical coordinates of electrodes in 2D topographic space sph_theta, sph_radius, sph_phi: Spherical coordinates of electrodes in 3D space.

In EEGLAB, I believe the 2D topographic coordinates are generates from the spherical coordinates via the sph2topo function: https://github.com/sccn/eeglab/blob/develop/functions/sigprocfunc/sph2topo.m , namely:

  angle  = -horiz;
  radius = 0.5 - az/180;

I attempted to recreate these values from the MNE arrays:

import mne
import numpy as np
import matplotlib.pyplot as plt

# simulate an EEG dataset
channel_names = [f'E{c}' for c in range(1, 129)]
channel_names.append('Cz')
num_channels = len(channel_names)
num_samples = 10000
eeg_data = np.random.randn(num_channels, num_samples)
info = mne.create_info(channel_names, ch_types='eeg', sfreq=500)
info['line_freq'] = 60
raw = mne.io.RawArray(eeg_data, info)
raw.set_montage('GSN-HydroCel-129')
raw.set_eeg_reference(ref_channels=['Cz'], ch_type='eeg')

# get channel positions, convert to EEGLAB style cartesian coordinates
cartesian_locations = raw._get_channel_positions()
cartesian_locations[:,[0,1]] = cartesian_locations[:,[1,0]]
cartesian_locations[:,1] = cartesian_locations[:,1] * -1

# convert to spherical coordinates
spherical_locations = mne.transforms._cart_to_sph(cartesian_locations)

mne_theta = -1 * np.rad2deg(spherical_locations[:,1])
mne_radius = ((np.rad2deg(spherical_locations[:,2])) / 180)

For the radius, it seemed that we didn't have to subtract the value from 0.5, perhaps due to a coordinate difference between the EEGLAB and MNE spherical coordinates.

The results are close but not exactly the same as the EEGLAB values (here EEGLAB_Chanlocs.csv is an export of the EEGLAB chanlocs structure from EEGLAB).

eeglab_data = pd.read_csv('EEGLAB_Chanlocs.csv')

fig = plt.figure()
ax = plt.subplot(1, 2, 1)
plt.scatter(eeglab_data['radius'], mne_radius)
plt.axis('square')
plt.xlabel('EEGLAB Radius')
plt.ylabel('MNE Computed Radius')
plt.title('Radius')
ax.grid()

ax2 = plt.subplot(1, 2, 2)
plt.scatter(eeglab_data['theta'], mne_theta)
plt.axis('square')
plt.xlabel('EEGLAB Theta')
plt.ylabel('MNE Computed Theta')
plt.title('Theta')
ax2.grid()

Theta_Radius