vlawhern / arl-eegmodels

This is the Army Research Laboratory (ARL) EEGModels Project: A Collection of Convolutional Neural Network (CNN) models for EEG signal classification, using Keras and Tensorflow

Problem using EEGNet on ERN dataset #47

Open ZhYGu opened 1 year ago

ZhYGu commented 1 year ago

I followed the example code for the ERP dataset and tried it on the ERN dataset. However, the results are at chance level (random classification). I applied a bandpass filter as well, but I cannot figure out what makes the classification random. Do you have any suggestions?

okbalefthanded commented 1 year ago

@ZhYGu can you provide your code?

ZhYGu commented 1 year ago

The following is my code

```python
from numpy import *
import numpy as np
import glob
import re
from pylab import *
from scipy.signal import *
import pandas as pd
from sklearn.preprocessing import *

def bandpass(sig, band, fs):
    B, A = butter(5, array(band)/(fs/2), btype='bandpass')
    return lfilter(B, A, sig, axis=0)
```

Training preprocessing:

```python
Root_folder = '/home/zhiyangg/projects/rpp-doesburg/zhiyangg/dataset/ERN/'
Training_folder = '/home/zhiyangg/projects/rpp-doesburg/zhiyangg/dataset/ERN/train/'
Training_files = glob.glob(Training_folder + 'Data*.csv')
Training_files.sort()
reg = re.compile(r'\d+')

freq = 200
epoc_window = int(1.25 * freq)

X = []
User = []
idFeedBack = []
Session = []
Feedback = []
Letter = []
Word = []
FeedbackTot = []
LetterTot = []
WordTot = []

# filter to 1-40 Hz and downsample to 128 Hz
for f in Training_files:
    user, session = reg.findall(f)
    sig = np.array(pd.io.parsers.read_csv(f))

    EEG = sig[:, 1:-2]
    # EOG = sig[:, -2]
    Trigger = sig[:, -1]

    sigF = bandpass(EEG, [1.0, 40.0], freq)
    idxFeedBack = np.where(Trigger == 1)[0]
    for fbkNum, idx in enumerate(idxFeedBack):
        resampled = resample(sigF[idx:idx + epoc_window, :], int(epoc_window * 128 / 200), axis=0)
        X.append(resampled)

Training_Labels = array(genfromtxt(Root_folder + 'TrainLabels.csv', delimiter=',', skip_header=1)[:, 1])
X = array(X).transpose((0, 2, 1))
```

Testing data:

```python
test_folder = '/home/zhiyangg/projects/rpp-doesburg/zhiyangg/dataset/ERN/test/'
test_files = glob.glob(test_folder + 'Data*.csv')
test_files.sort()
X_ = []
for f in test_files:
    user, session = reg.findall(f)
    sig = np.array(pd.io.parsers.read_csv(f))

    EEG = sig[:, 1:-2]
    # EOG = sig[:, -2]
    Trigger = sig[:, -1]

    sigF = bandpass(EEG, [1.0, 40.0], freq)
    idxFeedBack = np.where(Trigger == 1)[0]
    for fbkNum, idx in enumerate(idxFeedBack):
        resampled = resample(sigF[idx:idx + epoc_window, :], int(epoc_window * 128 / 200), axis=0)
        X_.append(resampled)

test_X = array(X_).transpose((0, 2, 1))
```

Normalization:

```python
for i in range(len(X[0])):
    scaler = StandardScaler()
    scaler.fit(X[:, i, :])
    X[:, i, :] = scaler.transform(X[:, i, :])
    test_X[:, i, :] = scaler.transform(test_X[:, i, :])
```

```python
from EEGModels import EEGNet
import tensorflow as tf
from tensorflow.keras import utils as np_utils
from tensorflow.keras.callbacks import ModelCheckpoint
from tensorflow.keras import backend as K

from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from matplotlib import pyplot as plt
```

Training/validation split:

```python
Zeros = []
Ones = []
Labels = []
for i in range(0, len(X)):
    if Training_Labels[i] == 0:
        Zeros.append(X[i])
        Labels.append(0)
    else:
        Ones.append(X[i])
        Labels.append(1)

Training = Zeros[0:int(0.8 * len(Zeros))]
Training = Training + Ones[0:int(0.8 * len(Ones))]
Validate = Zeros[int(0.8 * len(Zeros)):]
Validate = Validate + Ones[int(0.8 * len(Ones)):]
Training_Label = [0] * int(0.8 * len(Zeros))
Training_Label += [1] * int(0.8 * len(Ones))
Validate_Label = [0] * int(0.2 * len(Zeros))
Validate_Label += [1] * int(0.2 * len(Ones))

Training = array(Training)
Validate = array(Validate)
Training_Label = array(Training_Label)
Validate_Label = array(Validate_Label)

chans = Training.shape[1]
samples = Training.shape[2]
kernels = 1

Training_Label = np_utils.to_categorical(Training_Label)
Validate_Label = np_utils.to_categorical(Validate_Label)

Training_data = Training.reshape(Training.shape[0], chans, samples, kernels)
Validate_data = Validate.reshape(Validate.shape[0], chans, samples, kernels)

print('Training shape:', Training_data.shape)
print(Training_data.shape[0], 'train samples')

model = EEGNet(nb_classes=2, Chans=chans, Samples=samples,
               dropoutRate=0.5, kernLength=32, F1=8, D=2, F2=16,
               dropoutType='Dropout')
```

Training:

```python
opt = tf.keras.optimizers.SGD(0.0001)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
numParams = model.count_params()
checkpointer = ModelCheckpoint(filepath='/home/zhiyangg/projects/rpp-doesburg/zhiyangg/dataset/ERN/ERN_checkpoint1.h5',
                               verbose=1, save_best_only=True)
class_weights = {0: 1, 1: 2}
fittedModel = model.fit(Training_data, Training_Label, batch_size=16, epochs=100, verbose=2,
                        validation_data=(Validate_data, Validate_Label),
                        callbacks=[checkpointer], class_weight=class_weights)
model.load_weights('/home/zhiyangg/projects/rpp-doesburg/zhiyangg/dataset/ERN/ERN_checkpoint1.h5')

plt.plot(fittedModel.history['loss'])
plt.plot(fittedModel.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')
plt.show()

True_Labels = array(genfromtxt(Root_folder + 'true_labels.csv', delimiter=',', skip_header=0))
print(True_Labels)

Y_test = np_utils.to_categorical(True_Labels)

X_test = test_X.reshape(test_X.shape[0], chans, samples, kernels)

probs = model.predict(X_test)

preds = probs.argmax(axis=-1)

acc = np.mean(preds == Y_test.argmax(axis=-1))
print("Classification accuracy: %f " % (acc))

import sklearn.metrics as metrics

fpr, tpr, thresholds = metrics.roc_curve(True_Labels, preds, pos_label=1)
auc = metrics.auc(fpr, tpr)

print(auc)
```
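
As a side note, `sklearn.metrics.roc_curve` accepts continuous scores, so the AUC is usually computed from the positive-class probabilities rather than the thresholded predictions. A minimal sketch, reusing `probs` and `True_Labels` from the code above:

```python
# Sketch: score the ROC curve with the positive-class probability
# (assumes probs has shape (n_trials, 2), as returned by model.predict above).
fpr, tpr, thresholds = metrics.roc_curve(True_Labels, probs[:, 1], pos_label=1)
auc = metrics.auc(fpr, tpr)
print(auc)
```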

ZhYGu commented 1 year ago

Additionally, after training I plotted the training and validation loss, and the validation loss is lower than the training loss. If you can spot the cause of this in my code, please let me know as well. Thank you.

ZhYGu commented 1 year ago

@okbalefthanded [image: training and validation loss plot, plus results] This is the plot and the result.

okbalefthanded commented 1 year ago

@ZhYGu to replicate the paper's results, use the same training configuration as the paper: change the optimizer to Adam, the batch size to 32, and the number of epochs to 500. The learning curves are the typical behaviour observed when using dropout: during training some neuron outputs are set to 0, which makes the training loss higher than the validation loss.
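
In code, the suggested changes would look roughly like this (a minimal sketch, assuming the same variable names as in the code posted above; the rest of the pipeline is unchanged):

```python
# Sketch of the suggested training configuration: Adam optimizer,
# batch size 32, 500 epochs. Variable names (model, Training_data, etc.)
# are taken from the code posted above.
opt = tf.keras.optimizers.Adam()
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])

fittedModel = model.fit(Training_data, Training_Label,
                        batch_size=32, epochs=500, verbose=2,
                        validation_data=(Validate_data, Validate_Label),
                        callbacks=[checkpointer], class_weight=class_weights)
```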

ZhYGu commented 1 year ago

> @ZhYGu to replicate the paper's results, use the same training configuration as the paper: change the optimizer to Adam, the batch size to 32, and the number of epochs to 500. The learning curves are the typical behaviour observed when using dropout: during training some neuron outputs are set to 0, which makes the training loss higher than the validation loss.

Thanks for your advice. I will run the experiment following your suggestion.

ZhYGu commented 1 year ago

> @ZhYGu to replicate the paper's results, use the same training configuration as the paper: change the optimizer to Adam, the batch size to 32, and the number of epochs to 500. The learning curves are the typical behaviour observed when using dropout: during training some neuron outputs are set to 0, which makes the training loss higher than the validation loss.

[image: updated loss plot and results] This is the result from the new experiment using the settings mentioned above. It does not seem to have improved much.