Is it because there is a missing electrode position in src/config.py

FriedaSmith commented 1 year ago

Hi, I use the code below to load the EEG data and get a dimension of 105, but src/config.py in your repository lists 104 electrode positions. I'm confused: is an electrode position missing from src/config.py?

    from scipy import io

    file_name = "./data/task3-TSR/Matlab_files/resultsZAB_TSR.mat"
    data = io.loadmat(file_name, squeeze_me=True, struct_as_record=False)['sentenceData']

    for j in range(len(data)):
        arr = data[j].mean_g2  # EEG feature vector for sentence j (105 values here)

src/config.py:

# EEG information
chanlocs = ['E2', 'E3', 'E4', 'E5', 'E6', 'E7', 'E9', 'E10', 'E11', 'E12', 'E13', 'E15', 'E16', 'E18', 'E19', 'E20',
            'E22',
            'E23', 'E24', 'E26', 'E27', 'E28', 'E29', 'E30', 'E31', 'E33', 'E34', 'E35', 'E36', 'E37', 'E38', 'E39',
            'E40',
            'E41', 'E42', 'E43', 'E44', 'E45', 'E46', 'E47', 'E50', 'E51', 'E52', 'E53', 'E54', 'E55', 'E57', 'E58',
            'E59',
            'E60', 'E61', 'E62', 'E64', 'E65', 'E66', 'E67', 'E69', 'E70', 'E71', 'E72', 'E74', 'E75', 'E76', 'E77',
            'E78',
            'E79', 'E80', 'E82', 'E83', 'E84', 'E85', 'E86', 'E87', 'E89', 'E90', 'E91', 'E92', 'E93', 'E95', 'E96',
            'E97',
            'E98', 'E100', 'E101', 'E102', 'E103', 'E104', 'E105', 'E106', 'E108', 'E109', 'E110', 'E111', 'E112',
            'E114',
            'E115', 'E116', 'E117', 'E118', 'E120', 'E121', 'E122', 'E123', 'E124']

norahollenstein commented 1 year ago

If I remember correctly, we decided to remove the values from the reference electrodes. However, this did not have a big impact on the results. @samuki could you confirm this?

FriedaSmith commented 1 year ago

I read the paper "Zurich Cognitive Language Processing Corpus: A simultaneous EEG and eye-tracking resource for analyzing the human reading process", which states: "One hundred and five EEG channels were used for scalp recordings and nine EOG channels were used for artifact removal. The rest of the channels lying mainly on the neck and face were discarded before data analysis." Shouldn't 105 EEG channels correspond to 105 electrode positions? @norahollenstein

samuki commented 1 year ago

Hi @FriedaSmith, thanks a lot for reaching out! I think the electrode labels in the config file are the correct labels used for the analysis. As described in "The ZuCo benchmark on cross-subject reading task classification with EEG and eye-tracking data", the following 24 electrode labels were removed: E1, E8, E14, E17, E21, E25, E32, E48, E49, E56, E63, E68, E73, E81, E88, E94, E99, E107, E113, E119, E125, E126, E127, and E128, leading to the electrode labels specified in the config. This is different from the paper "Zurich Cognitive Language Processing Corpus: A simultaneous EEG and eye-tracking resource for analyzing the human reading process".
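
For anyone who wants to check the count, here is a minimal sketch that rebuilds the label list by dropping the 24 removed labels from E1 to E128. The names `removed` and `kept` are illustrative only and do not come from the repository.

```python
# Illustrative check, not code from the repository: drop the 24 removed
# labels from the 128 cap labels E1..E128 and count what is left.
removed = {1, 8, 14, 17, 21, 25, 32, 48, 49, 56, 63, 68, 73, 81, 88, 94,
           99, 107, 113, 119, 125, 126, 127, 128}

kept = [f"E{i}" for i in range(1, 129) if i not in removed]

print(len(removed))        # 24
print(len(kept))           # 104
print(kept[0], kept[-1])   # E2 E124
```

The result should match the `chanlocs` list in src/config.py, which also starts at E2 and ends at E124.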

FriedaSmith commented 1 year ago

Thank you for your patient answer, @samuki. But I am still a bit confused. In the paper "Zurich Cognitive Language Processing Corpus: A simultaneous EEG and eye-tracking resource for analyzing the human reading process", there is a statement:

The discarded electrode labels were E1, E8, E14, E17, E21, E25, E32, E48, E49, E56, E63, E68, E73, E81, E88, E94, E99, E107, E113, E119, E125, E126, E127, and E128. Additionally, 10 EOG electrodes were separated from the data and not used for further analysis, yielding a total number of 105 EEG electrodes. Subsequently, the data was converted to a common average reference.

...

Finally, for each sentence as well as for each word within each sentence, and for each frequency band, the EEG features consist of a vector of 105 dimensions (one value for each EEG channel).

Why can there still be 105 electrodes after removing 24 electrodes from 128 channels?

norahollenstein commented 1 year ago

Hi again, there seems to be a confusion between papers. The paper you are referring to has a different title than the one you are quoting in your last comment. You are citing from "The ZuCo benchmark on cross-subject reading task classification with EEG and eye-tracking data". And yes, you are correct, there is a mistake in the paper. It should be 104 electrodes, as stated here in the code. The list of discarded electrodes is correct (24 discarded electrodes). Thanks for catching this mistake!

I hope this clears it up @FriedaSmith ?

FriedaSmith commented 1 year ago

Thank you very much @norahollenstein, I cited the wrong paper in the question above.

In the paper "The ZuCo benchmark on cross-subject reading task classification with EEG and eye-tracking data", there is a statement:

The discarded electrode labels were E1, E8, E14, E17, E21, E25, E32, E48, E49, E56, E63, E68, E73, E81, E88, E94, E99, E107, E113, E119, E125, E126, E127, and E128. Additionally, 10 EOG electrodes were separated from the data and not used for further analysis, yielding a total number of 105 EEG electrodes. Subsequently, the data was converted to a common average reference.
...

Finally, for each sentence as well as for each word within each sentence, and for each frequency band, the EEG features consist of a vector of 105 dimensions (one value for each EEG channel).

That is to say, in the paper "ZuCo, a simultaneous EEG and eye-tracking resource for natural sentence reading", 105 electrodes were used, and in "The ZuCo benchmark on cross-subject reading task classification with EEG and eye-tracking data", 104 electrodes were used. Are the electrodes used in "ZuCo, a simultaneous EEG and eye-tracking resource for natural sentence reading" and in "ZuCo 2.0: A dataset of physiological recordings during natural reading and annotation" the same? Can you provide the positions of the electrodes used in ZuCo 1.0 and ZuCo 2.0?

FriedaSmith commented 1 year ago

In addition, I have another question. I loaded the resultsXDT.mat file (which is the data used in the paper "The ZuCo benchmark on cross-subject reading task classification with EEG and eye-tracking data") and found that the dimension is still 105.

    import h5py
    import numpy as np

    file_name = "./data/ZuCo_helout/dropbox/resultsXDT.mat"
    # data = io.loadmat(file_name, squeeze_me=True, struct_as_record=False)['sentenceData']

    # Open the .mat file with h5py instead of scipy.io.loadmat
    f = h5py.File(file_name, 'r')
    # print('keys in f:', list(f.keys()))
    sentence_data = f['sentenceData']
    mean_g2_objs = sentence_data['mean_g2']

    rawData = sentence_data['rawData']
    print("rawData len:", len(rawData))
    for idx in range(len(rawData)):
        # Look up the dataset stored for sentence idx and drop singleton dimensions
        mean_g2 = np.squeeze(f[mean_g2_objs[idx][0]][()])

mean_g2

norahollenstein commented 1 year ago

Hi @FriedaSmith , we just noticed that, too. We will look into it and get back to you!

About ZuCo 1.0 and 2.0: Yes, they are the same. You can find the channel positions by searching for the 128-channel HydroCel Geodesic Sensor Net.

FriedaSmith commented 1 year ago

Thank you very much @norahollenstein. I would like to know which electrode positions were removed in ZuCo 1.0 and 2.0. Can you tell me?

norahollenstein commented 1 year ago

Hi again @FriedaSmith ,

The following 24 electrode positions were removed in both ZuCo 1.0 and 2.0: E1, E8, E14, E17, E21, E25, E32, E48, E49, E56, E63, E68, E73, E81, E88, E94, E99, E107, E113, E119, E125, E126, E127, and E128. However, the recorded data contains 128 electrodes from the cap plus a Cz reference electrode. The data was then average referenced, so we have data values in a total of 129 channels. The 24 channels listed above were then removed. Hence, the 105 electrodes we used are 128 + 1 - 24.
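
The arithmetic can be written out as a short sketch. The label "Cz" for the reference channel and its placement at the end of the list are assumptions made for illustration; this thread does not say where the reference channel sits in the 105-dimensional vectors.

```python
# Illustrative only: count channels as described above
# (128 cap electrodes + 1 reference - 24 removed).
# ASSUMPTION: the reference is labelled "Cz" and appended last; neither the
# label nor the position is confirmed by the data files.
removed = {f"E{i}" for i in (1, 8, 14, 17, 21, 25, 32, 48, 49, 56, 63, 68,
                             73, 81, 88, 94, 99, 107, 113, 119, 125, 126,
                             127, 128)}

all_channels = [f"E{i}" for i in range(1, 129)] + ["Cz"]
kept = [ch for ch in all_channels if ch not in removed]
cap_only = [ch for ch in kept if ch != "Cz"]

print(len(all_channels))  # 129
print(len(kept))          # 105
print(len(cap_only))      # 104
```

Under these assumptions, the only difference between the 105 values in the data and the 104 labels in src/config.py would be the reference channel.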

FriedaSmith commented 1 year ago

Thank you so much for answering my questions! @norahollenstein

norahollenstein / zuco-benchmark, issue #1