vlawhern / arl-eegmodels

This is the Army Research Laboratory (ARL) EEGModels Project: a collection of Convolutional Neural Network (CNN) models for EEG signal classification, using Keras and TensorFlow.

Problem with the BCI IV 2a dataset: results don't match the paper's #7

Closed — mouadriyad closed this issue 5 years ago

mouadriyad commented 5 years ago

Hi,

I tried to reproduce the EEGNet-4,1 result to use it as a baseline in the cross-subject case. The problem is that I can't get the same result: my simulations reach a higher accuracy, around 50-60% (the paper reports about 40%).

I reused the preprocessing code from braindecode entirely and added the resampling step described in the paper, but it is not working.

It would be great if someone could check the code below for anything I missed.

The Preprocessing part:

import os.path
from collections import OrderedDict

import numpy as np

from braindecode.datasets.bcic_iv_2a import BCICompetition4Set2A
from braindecode.mne_ext.signalproc import mne_apply, resample_cnt
from braindecode.datautil.signalproc import (bandpass_cnt,
                                             exponential_running_standardize)
from braindecode.datautil.trial_segment import create_signal_target_from_raw_mne

path = '/home/user'  # folder containing the A0xT/A0xE .gdf and .mat files
X_train_set=list()
y_train_set=list()
X_test_set=list()
y_test_set=list()

for subject_id in list(range(1,10)):
    data_folder= path
    low_cut_hz=4
    ival = [500, 2500]
    high_cut_hz = 40
    factor_new = 1e-3
    init_block_size = 1000

    train_filename = 'A{:02d}T.gdf'.format(subject_id)
    test_filename = 'A{:02d}E.gdf'.format(subject_id)
    train_filepath = os.path.join(data_folder, train_filename)
    test_filepath = os.path.join(data_folder, test_filename)
    train_label_filepath = train_filepath.replace('.gdf', '.mat')
    test_label_filepath = test_filepath.replace('.gdf', '.mat')

    train_loader = BCICompetition4Set2A(
        train_filepath, labels_filename=train_label_filepath)
    test_loader = BCICompetition4Set2A(
        test_filepath, labels_filename=test_label_filepath)
    train_cnt = train_loader.load()
    test_cnt = test_loader.load()

    # Preprocessing

    train_cnt = train_cnt.drop_channels(['STI 014', 'EOG-left',
                                         'EOG-central', 'EOG-right'])
    assert len(train_cnt.ch_names) == 22
    train_cnt = resample_cnt(train_cnt, 128)  # added: resample to 128 Hz as suggested in the paper
    # convert to microvolts for numerical stability of the next operations
    train_cnt = mne_apply(lambda a: a * 1e6, train_cnt)
    train_cnt = mne_apply(
        lambda a: bandpass_cnt(a, low_cut_hz, high_cut_hz, train_cnt.info['sfreq'],
                               filt_order=3,
                               axis=1), train_cnt)
    train_cnt = mne_apply(
        lambda a: exponential_running_standardize(a.T, factor_new=factor_new,
                                                  init_block_size=init_block_size,
                                                  eps=1e-4).T,
        train_cnt)

    test_cnt = test_cnt.drop_channels(['STI 014', 'EOG-left',
                                       'EOG-central', 'EOG-right'])
    assert len(test_cnt.ch_names) == 22
    test_cnt = resample_cnt(test_cnt, 128)  # added: resample to 128 Hz as suggested in the paper
    test_cnt = mne_apply(lambda a: a * 1e6, test_cnt)
    test_cnt = mne_apply(
        lambda a: bandpass_cnt(a, low_cut_hz, high_cut_hz, test_cnt.info['sfreq'],
                               filt_order=3,
                               axis=1), test_cnt)
    test_cnt = mne_apply(
        lambda a: exponential_running_standardize(a.T, factor_new=factor_new,
                                                  init_block_size=init_block_size,
                                                  eps=1e-4).T,
        test_cnt)

    marker_def = OrderedDict([('Left Hand', [1]), ('Right Hand', [2]),
                              ('Foot', [3]), ('Tongue', [4])])

    train_set = create_signal_target_from_raw_mne(train_cnt, marker_def, ival)
    test_set = create_signal_target_from_raw_mne(test_cnt, marker_def, ival)

    X_train_set.append(train_set.X)
    y_train_set.append(train_set.y)
    X_test_set.append(test_set.X)
    y_test_set.append(test_set.y)

subject = '1'  # the held-out subject id, as a string
subject = int(subject) - 1  # convert to a zero-based index
# shuffle the remaining subjects into training/validation pools
from random import shuffle
Lt = list(range(0, 9))
shuffle(Lt)
Lt.remove(subject)
Lv = Lt[0:3]  # 3 subjects for validation
for i in Lv:
    Lt.remove(i)  # the remaining 5 subjects for training
X_t = list()
X_v = list()
y_t = list()
y_v = list()
print(len(X_train_set))
for i in Lt:
    X_t.append(X_train_set[i])
    y_t.append(y_train_set[i])

for i in Lv:
    X_v.append(X_train_set[i])
    y_v.append(y_train_set[i])

import keras  # needed for keras.utils.to_categorical

X_train = np.concatenate(X_t, axis=0).astype('float32')
X_val = np.concatenate(X_v, axis=0).astype('float32')
X_test = X_test_set[subject].astype('float32')

y_train = np.concatenate(y_t, axis=0).astype('int16')
y_val = np.concatenate(y_v, axis=0).astype('int16')
y_test = y_test_set[subject].astype('int16')

print(Lt)  # training subject indices
print(Lv)  # validation subject indices
print(len(X_t))
print(len(X_v))
print(X_train.shape)
print(y_train.shape)

y_train = y_train.reshape((288 * 5))
y_val = y_val.reshape((288 * 3))
y_test = y_test.reshape((288))
y_train = keras.utils.to_categorical(y_train, 4)
y_val = keras.utils.to_categorical(y_val, 4)
y_test = keras.utils.to_categorical(y_test, 4)
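One step not shown in the post above is reshaping the arrays for the model. The current EEGModels.py in this repo expects input of shape (trials, channels, samples, 1); older releases used a channels-first layout instead, so check the version you have. A minimal sketch with zero-filled stand-in data of the same shape as the epoched arrays (5 training subjects × 288 trials each):

```python
import numpy as np

# Stand-in data matching the epoched BCI IV 2a training arrays:
# 1440 trials, 22 channels, 256 samples (2 s at 128 Hz).
X_train = np.zeros((288 * 5, 22, 256), dtype='float32')
X_train = X_train[..., np.newaxis]  # add the trailing "kernels" axis
print(X_train.shape)  # (1440, 22, 256, 1)
```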
vlawhern commented 5 years ago

So the first thing is to try EEGNet-8,2, with kernLength = 32. This should get you around 66-68% overall accuracy with 4-fold blockwise cross-validation as reported in the paper for within-subject classification. The performance of cross-subject classification should be about 40% overall accuracy. I also used braindecode's preprocessing for BCI IV 2A so the code above should be OK (although I haven't looked at it closely).
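For reference, the naming EEGNet-F1,D means F1 temporal filters and D spatial filters per temporal filter, with F2 = F1 × D pointwise filters. A hypothetical sketch of the suggested configuration (parameter names follow EEGModels.py in this repo; values not stated in this thread, such as the dropout rate, are my assumptions):

```python
# Hypothetical hyperparameters for EEGNet-8,2 on BCI IV 2a.
params = dict(
    nb_classes=4,      # left hand, right hand, foot, tongue
    Chans=22,          # EEG channels after dropping EOG/stim channels
    Samples=256,       # 2 s epoch at 128 Hz
    kernLength=32,     # as suggested above (half the sampling rate)
    F1=8, D=2, F2=16,  # EEGNet-8,2: F2 = F1 * D
    dropoutRate=0.5,   # assumption, not stated in this thread
)
# With EEGModels.py from this repo on the path, the model would be
# built roughly as:
# from EEGModels import EEGNet
# model = EEGNet(**params)
```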

mouadriyad commented 5 years ago

> So the first thing is to try EEGNet-8,2, with kernLength = 32. This should get you around 66-68% overall accuracy with 4-fold blockwise cross-validation as reported in the paper for within-subject classification. The performance of cross-subject classification should be about 40% overall accuracy. I also used braindecode's preprocessing for BCI IV 2A so the code above should be OK (although I haven't looked at it closely).

I don't understand your within-subject procedure: did you mix the training set with the test set (288×2 trials) and then split that into 4 blocks (288×2/4)?

vlawhern commented 5 years ago

I did 4-fold CV with the BCI IV 2A competition test set always being the test set. So take the training set, divide it into three equal contiguous partitions, select 2 of those 3 to be the training, 1 to be the validation, then set the test set to be the competition test set. Repeat this for all combinations of training/validation.
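The contiguous-partition scheme described above can be sketched as follows (a sketch only: `n_trials = 288` assumes one full BCI IV 2a training session, the competition test set stays fixed throughout, and the loop enumerates the train/validation combinations of the split as described):

```python
import numpy as np

# Divide the competition training set into 3 equal contiguous partitions;
# each fold uses 2 partitions for training and 1 for validation, while the
# test set is always the competition test set.
n_trials = 288  # trials in one BCI IV 2a training session
parts = np.array_split(np.arange(n_trials), 3)  # contiguous, unshuffled

folds = []
for v in range(3):
    val_idx = parts[v]
    train_idx = np.concatenate([parts[i] for i in range(3) if i != v])
    folds.append((train_idx, val_idx))

# sanity check: training and validation indices never overlap
for train_idx, val_idx in folds:
    assert np.intersect1d(train_idx, val_idx).size == 0
```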

vlawhern commented 5 years ago

Here's a Dropbox link with the data I used for the paper, as well as some quick code to run the EEGNet model. The data was pre-processed and filtered according to the code in the braindecode repository, and I've already epoched the data at [0.5, 2.5]s. I've also included one run of the within-subject results. Note that due to the small sample size of BCI Competition IV you should expect variation from run-to-run.

https://www.dropbox.com/s/cxwyigcx4j6h8t5/EEGNet_BCI_IV.zip?dl=0

Hope this helps.

mouadriyad commented 5 years ago

Hi,

Thanks for your help. I am now using your data and have merged part of your code with mine, still for the cross-subject case. However, I get an average of 53% over roughly the first third of the 90 folds. For within-subject, it works well and I get a similar result to yours.

Could this come from the TensorFlow version? (I use Google Colab, so the current version is 1.12.0, and I couldn't install 1.9.0 due to a library conflict.)

EDIT: I ran a simulation with 30 folds (3 per subject, as a check), and from the first 16 folds (on Colab) it seems that version 1.9.0 gives lower accuracy than 1.12.0. Many trials don't learn at all (25% accuracy, i.e. chance level).

I will recheck, and if everything is OK I'll close the issue.

mouadriyad commented 5 years ago

> Here's a Dropbox link with the data I used for the paper, as well as some quick code to run the EEGNet model. The data was pre-processed and filtered according to the code in the braindecode repository, and I've already epoched the data at [0.5, 2.5]s. I've also included one run of the within-subject results. Note that due to the small sample size of BCI Competition IV you should expect variation from run-to-run.
>
> https://www.dropbox.com/s/cxwyigcx4j6h8t5/EEGNet_BCI_IV.zip?dl=0
>
> Hope this helps.

Hi, which batch size did you use for cross-subject? It's the last parameter I need to test; I expect it to be higher than 64.

vlawhern commented 5 years ago

I used batchsize 64 for all experiments, both within- and cross- subject.

mouadriyad commented 5 years ago

I found a bug in my TensorFlow configuration. Your results are correct.

Thank you for your attention, and sorry for the inconvenience.

Can I use your data for my further work?

vlawhern commented 5 years ago

Glad to help. Feel free to use the data I sent you, with the appropriate citation for the data source:

Tangermann, Michael, et al. "Review of the BCI Competition IV." Frontiers in Neuroscience 6 (2012): 55.

Cosmopal commented 5 years ago

@vlawhern can the above code be added as part of the demo code/demo data, as per issue #6? Also, the link https://www.dropbox.com/s/cxwyigcx4j6h8t5/EEGNet_BCI_IV.zip?dl=0 isn't working.

vlawhern commented 5 years ago

My main concern with hosting that particular dataset was that I don't own it (property of the Berlin BCI group). I only temporarily put up my code to analyze that dataset to help another user (issue #6 as you've referenced).

I've instead written a sample script using MNE's auto-download tools to automatically download another dataset (in this case, a 4-class ERP EEG classification dataset) to illustrate how to use and train the EEGNet model. Hopefully this helps.