mne-tools / mne-python

MNE: Magnetoencephalography (MEG) and Electroencephalography (EEG) in Python
https://mne.tools
BSD 3-Clause "New" or "Revised" License

CSP with rest state EEG data for predicting stage of disease #6716

Closed akatav closed 5 years ago

akatav commented 5 years ago

Hello. It would be really helpful if an expert here could shed some light on using CSP for resting-state EEG data. I am trying to predict early and final stages of brain disease. I thought I could use CSP for this task, as I have found some recent CSP papers that work with resting-state EEG data. Each instance (that is, each patient whose EEG is recorded) is a (19 × number of sampling time points) matrix, where 19 is the number of channels. I do a 70%-30% train-test split; eventually, I would like to learn from a 50%-50% split. For each of the train and test sets, a band-pass filter is applied for each frequency range and events are created with make_fixed_length_events(). We tried 30 s events with 15 s overlap, and smaller values. I also tried 5 s events with 2 s overlap, and so on. Now, I understand that there are no actual events in resting state, but I did it this way in order to use the CSP API. Of course, all epochs for a given instance share a single binary label (early/final).
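
The windowing described above can be sketched with plain NumPy, to make the resulting array shape concrete. This is only an illustration on synthetic data: `fixed_length_windows` is a hypothetical helper, not an MNE function, and the 100 Hz sampling rate and 120 s recording length are assumptions.

```python
import numpy as np


def fixed_length_windows(data, sfreq, duration, overlap=0.0):
    """Cut (n_channels, n_times) data into overlapping fixed-length windows.

    Returns an array of shape (n_windows, n_channels, n_samples_per_window).
    Hypothetical helper for illustration; not part of MNE.
    """
    win = int(round(duration * sfreq))               # samples per window
    step = int(round((duration - overlap) * sfreq))  # samples between onsets
    if win <= 0 or step <= 0:
        raise ValueError("need duration > 0 and 0 <= overlap < duration")
    n_times = data.shape[1]
    onsets = np.arange(0, n_times - win + 1, step)   # complete windows only
    return np.stack([data[:, o:o + win] for o in onsets])


# Synthetic stand-in for one recording: 19 channels, 120 s at 100 Hz (assumed).
rng = np.random.default_rng(0)
recording = rng.standard_normal((19, 120 * 100))

windows = fixed_length_windows(recording, sfreq=100.0, duration=30.0, overlap=15.0)
print(windows.shape)  # (number of windows, 19, samples per window)
```

When the windows are cut with mne.Epochs instead, the length of each epoch is set by its tmin and tmax arguments (which default to -0.2 s and 0.5 s), not by the duration given to make_fixed_length_events(), so it may be worth checking that the epochs really span the intended 30 s.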

A brief snapshot of my code is as follows:

from mne import Epochs, make_fixed_length_events, pick_types

idd = 1
# for each frequency band
for freq, (fmin, fmax) in enumerate(freq_ranges):
    # for each instance/patient
    for raw in trainrawarr:
        picks = pick_types(raw.info, meg=False, eeg=True, stim=False, eog=False, exclude='bads')
        raw_filter = raw.copy().filter(fmin, fmax, n_jobs=-1, fir_design="firwin")
        events = make_fixed_length_events(raw_filter, id=idd, duration=30., overlap=15.)
        epochs = Epochs(raw_filter, events, picks=picks)
        # epochs contains a 3d array, (number of epochs, 19, number of time points)
        # all epochs in this instance are 1 or 0 indicating stage of disease.
        idd = idd + 1  # not sure if using unique ids is helpful.

We repeat the above for the testrawarr also.

I use sklearn GridSearchCV with K-fold cross-validation (K = 2, 5, 10) and different classifiers with parameter tuning, such as LDA, QDA, decision tree, SVC and KNN. The training ROC AUC is either very poor or very good. The test ROC AUC is below 0.5. I use np.vstack() to concatenate all the 3-dimensional epoch arrays in the train or test set.
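
One thing that may be worth checking in a setup like this: once the epochs of all patients are stacked, a plain (stratified) K-fold can place epochs of the same patient in both the training and the validation fold, which tends to inflate the cross-validated score relative to held-out patients. The sketch below uses synthetic data and a hypothetical helper, `patient_wise_folds`, to show the stacking, the per-epoch labels and a patient-wise split. scikit-learn ships GroupKFold and StratifiedGroupKFold for the same purpose.

```python
import numpy as np

# Synthetic stand-in: 6 patients with different numbers of epochs each,
# every epoch shaped (19 channels, 50 time points). All values are made up.
rng = np.random.default_rng(0)
n_epochs_per_patient = [4, 3, 5, 2, 4, 3]
stage_per_patient = [0, 1, 0, 1, 0, 1]  # 0 = early, 1 = final
per_patient = [rng.standard_normal((n, 19, 50)) for n in n_epochs_per_patient]

# np.vstack concatenates 3D arrays along the first (epoch) axis.
X = np.vstack(per_patient)
# One label and one patient index per epoch, in the same order as X.
y = np.repeat(stage_per_patient, n_epochs_per_patient)
groups = np.repeat(np.arange(len(per_patient)), n_epochs_per_patient)


def patient_wise_folds(groups, y, n_splits):
    """Yield (train_idx, test_idx) so that no patient is split across folds.

    Patients are dealt round-robin within each class, so every fold holds
    both stages as long as each class has at least n_splits patients.
    Hypothetical helper for illustration; not a library function.
    """
    patients = np.unique(groups)
    label_of = {p: y[groups == p][0] for p in patients}
    fold_of = {}
    for label in np.unique(y):
        same_class = [p for p in patients if label_of[p] == label]
        for i, p in enumerate(same_class):
            fold_of[p] = i % n_splits
    fold_per_epoch = np.array([fold_of[g] for g in groups])
    for k in range(n_splits):
        test = np.flatnonzero(fold_per_epoch == k)
        train = np.flatnonzero(fold_per_epoch != k)
        yield train, test


folds = list(patient_wise_folds(groups, y, n_splits=3))
for train, test in folds:
    print("held-out patients:", np.unique(groups[test]).tolist())
```

A list of (train, test) index pairs like `folds` can be passed as the cv argument of GridSearchCV, since cv also accepts an iterable of splits.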

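import numpy as np
from mne.decoding import CSP
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

csp = CSP(reg=None, log=True, norm_trace=False, cov_est="epoch")
# lda is not defined in the snippet; assumed here to be sklearn's LinearDiscriminantAnalysis
lda = LinearDiscriminantAnalysis()
param_grid_lda = [{'lda__solver': ['lsqr'], 'lda__shrinkage': [0.0001, 0.001, 0.010, 0.1, 1],
                   'csp__n_components': [5, 10, 20, 30], 'csp__reg': [0.00001, 0.0001, 0.001, 0.01, 0.1, 1]
                   }]
for freq in range(8):  # there are 8 frequency bands
    for split in [5, 10]:
        print("Evaluating cv split: ", split, " in freq range: ", freq_ranges[freq])
        clf = Pipeline([('csp', csp), ('lda', lda)])
        search = GridSearchCV(clf, cv=StratifiedKFold(split), param_grid=param_grid_lda, scoring='roc_auc', n_jobs=-1)
        model = search.fit(np.vstack(epochs_all_freq_ranges[freq]), epoch_labels[freq])
        testpred = search.predict(np.vstack(test_epochs_all_freq_ranges[freq]))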
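
For reference, what the CSP step of such a pipeline computes can be sketched in NumPy. This is the textbook two-class formulation (whitening followed by an eigendecomposition, then log-variance features) on synthetic data. It is not MNE's implementation, and `fit_csp` and `csp_features` are hypothetical names. CSP yields at most as many spatial filters as there are channels, so with 19 channels the grid values of 20 and 30 components cannot add anything beyond 19.

```python
import numpy as np


def fit_csp(X, y, n_components=4):
    """Return CSP spatial filters, shape (n_components, n_channels).

    X: (n_epochs, n_channels, n_times); y: binary labels (0/1).
    Textbook two-class CSP; a sketch, not MNE's implementation.
    """
    def mean_cov(epochs):
        # Average of the per-epoch channel covariance matrices.
        return np.mean([np.cov(e) for e in epochs], axis=0)

    c0, c1 = mean_cov(X[y == 0]), mean_cov(X[y == 1])
    # Whiten the composite covariance c0 + c1.
    evals, evecs = np.linalg.eigh(c0 + c1)
    whitener = (evecs / np.sqrt(evals)).T  # rows are whitening vectors
    # Diagonalize the whitened class-0 covariance; eigenvalues lie in [0, 1].
    lam, vecs = np.linalg.eigh(whitener @ c0 @ whitener.T)
    filters = vecs.T @ whitener            # rows are spatial filters
    # Eigenvalues far from 0.5 mark the most discriminative filters.
    order = np.argsort(np.abs(lam - 0.5))[::-1]
    return filters[order[:n_components]]


def csp_features(filters, X):
    """Log-variance of each spatially filtered epoch: (n_epochs, n_components)."""
    projected = np.einsum("kc,ect->ekt", filters, X)
    return np.log(projected.var(axis=2))


# Synthetic data: each class has extra variance on a different channel.
rng = np.random.default_rng(0)
n_epochs, n_channels, n_times = 40, 5, 200
X = rng.standard_normal((n_epochs, n_channels, n_times))
y = np.repeat([0, 1], n_epochs // 2)
X[y == 0, 0] *= 3.0
X[y == 1, 1] *= 3.0

filters = fit_csp(X, y, n_components=2)
features = csp_features(filters, X)
print(features.shape)
```

The features separate the classes here only because the synthetic classes differ in channel variance by construction. Whether resting-state epochs from two disease stages differ in this way is an empirical question that the sketch does not answer.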

Can someone please explain what is wrong with this code or approach? It would be really helpful. Thanks!

agramfort commented 5 years ago

what makes you think it's an MNE issue and not a problem with the data?

akatav commented 5 years ago

@agramfort I do not think that it is an MNE issue; rather, it is perhaps an issue with how I am using CSP in MNE with resting-state EEG. I am not sure if or how to use MNE with resting-state data. The data may be at fault here, but I received this data after cleaning, artifact correction and other preprocessing steps. I am also scaling the data using the sklearn MinMaxScaler before the above analysis.

agramfort commented 5 years ago

please don't use the issue tracker for something that is not a bug in the software.

akatav commented 5 years ago

@agramfort OK, but is there a forum for raising questions about usage? I did not raise this under 'bug'; I raised it under 'blank issue' only.

agramfort commented 5 years ago

list: https://mail.nmr.mgh.harvard.edu/mailman/listinfo/mne_analysis
gitter: https://mail.nmr.mgh.harvard.edu/mailman/listinfo/mne_analysis