eeyhsong / EEG-Conformer

EEG Transformer 2.0. i. Convolutional Transformer for EEG Decoding. ii. Novel visualization - Class Activation Topography.
GNU General Public License v3.0
426 stars 59 forks

Question on conformer_seed_1s_5fold.py #41

Open edugm94 opened 1 week ago

edugm94 commented 1 week ago

Dear authors,

Thanks for sharing the code! I am trying to run it on the SEED dataset and to understand how the 5-fold cross-validation is done.

I ran seed.m and seed_process_slide_cv.py to prepare the SEED dataset. When I run the script conformer_seed_1s_5fold.py, I get an error in the method get_source_data(). What shape should the attribute self.all_data have after running this line of code: self.all_data = np.load(self.root + 'S%d_session1.npy' % self.nSub, allow_pickle=True)?

Maybe I made a mistake when preprocessing the dataset. Any help is more than welcome!

Thanks in advance :D

edugm94 commented 1 week ago

I modified the script seed_process_slide_cv.py in the following way:

data = np.array(one_session)
label = np.array(one_session_label)

To the following:

data = np.concatenate(one_session)
label = np.concatenate(one_session_label)

This way, the numpy array saved to disk has shape (3394, 62, 200). Is this shape not the expected one? Could this modification be causing the error in the get_source_data() method that I mentioned above?
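To illustrate the difference between the two versions (the per-trial window counts below are made up), np.array keeps one entry per trial, while np.concatenate merges all trials into a single array of windows and loses the trial boundaries:

```python
import numpy as np

# Hypothetical session: 3 trials with different numbers of (62, 200) windows
one_session = [np.zeros((230, 62, 200)),
               np.zeros((210, 62, 200)),
               np.zeros((250, 62, 200))]

# np.array keeps the trial dimension: a ragged object array, one entry per trial
per_trial = np.array(one_session, dtype=object)
print(per_trial.shape)   # (3,)

# np.concatenate merges all windows, dropping the trial boundaries
flat = np.concatenate(one_session)
print(flat.shape)        # (690, 62, 200)
```

So if a downstream script indexes the loaded array per trial (e.g. all_data[tri]), the concatenated version would no longer have the structure it expects.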

Thanks!

rokanfeermecer486 commented 1 week ago

Hello, I have a question about why this project uses the validation-set accuracy as the final result. I haven't seen this project split the original "T" set into separate training and validation parts while using "E" for testing. Why can we directly use "T" for training and "E" for validation? Or is this how the recognized methods all do it?

eeyhsong commented 1 week ago

Hello @edugm94, Thanks for your interest! The input of the network has shape (N, 1, channels, time samples). Do you need to use np.expand_dims? What are the error details?
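For example, assuming your loaded array is already shaped (3394, 62, 200), a quick sketch:

```python
import numpy as np

# Hypothetical batch of 1-s windows: 3394 windows, 62 channels, 200 samples
windows = np.zeros((3394, 62, 200))

# The network expects (N, 1, channels, time samples): add a singleton axis
batch = np.expand_dims(windows, axis=1)
print(batch.shape)  # (3394, 1, 62, 200)
```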

eeyhsong commented 1 week ago

Hello @rokanfeermecer486, It's a tough question. The Conformer paper only validated the 'ideal' performance and compared it with other methods, such as ConvNet and EEGNet, under the same conditions. You may find some explanations of the validation strategy in this paper: https://doi.org/10.1016/j.neuroimage.2023.120209.

rokanfeermecer486 commented 1 week ago

@eeyhsong Yes, I am currently struggling with this issue in my graduate studies. In fact, I should focus more on the innovation and practicality of the model, as well as the effectiveness of the experiments. I have run many EEG experiments, but I still do not have a good idea of how to improve the results; I am just trying things constantly, and the literature offers a variety of methods. I hope that in the future I can develop my own approach to, and understanding of, applying deep learning models to EEG processing. Very good project; thank you very much for your reply, and I wish you smooth research work.

edugm94 commented 6 days ago

@eeyhsong, Thanks for your response. I think I fixed it. Could you please confirm my understanding of the code?

What I understood is the following: a model is trained for each participant and session (S1_session1.npy, for example). The evaluation is 5-fold, and the train and test sets are built as follows: each trial (a session is composed of 15 trials) is chunked into 5 folds; one fold is held out for testing and the remaining four are used for training. This is done for all 15 trials.

The code associated with the previous explanation is attached below (it happens in the method get_source_data(), lines 299-326, of the Python script conformer_seed_1s_5fold.py):

        # Per-trial windows for this subject's first session (object array, one entry per trial)
        self.all_data = np.load(self.root + 'S%d_session1.npy' % self.nSub, allow_pickle=True)
        self.all_label = np.load(self.root + 'S%d_session1_label.npy' % self.nSub, allow_pickle=True)
        self.train_data = []
        self.train_label = []
        self.test_data = []
        self.test_label = []

        for tri in range(np.shape(self.all_data)[0]):  # loop over the 15 trials
            tmp_tri = np.array(self.all_data[tri])
            tmp_tri_label = np.array(self.all_label[tri])

            # split this trial's windows into 5 equal folds (leftover windows are dropped)
            one_fold_num = np.shape(tmp_tri)[0] // 5
            tri_num = one_fold_num * 5
            tmp_tri_idx = np.arange(tri_num)
            test_idx = np.arange(one_fold_num * fold, one_fold_num * (fold + 1))
            train_idx = np.delete(tmp_tri_idx, test_idx)

            self.train_data.append(tmp_tri[train_idx])
            self.train_label.append(tmp_tri_label[train_idx])
            self.test_data.append(tmp_tri[test_idx])
            self.test_label.append(tmp_tri_label[test_idx])

        # merge all trials and add the singleton axis the model expects: (N, 1, 62, 200)
        self.train_data = np.concatenate(self.train_data)
        self.train_data = np.expand_dims(self.train_data, axis=1)
        self.train_label = np.concatenate(self.train_label)
        self.test_data = np.concatenate(self.test_data)
        self.test_data = np.expand_dims(self.test_data, axis=1)
        self.test_label = np.concatenate(self.test_label)
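As a quick sanity check of the per-trial split logic on toy numbers (the window count here is made up):

```python
import numpy as np

n_windows, fold = 23, 2          # pretend one trial has 23 windows; hold out fold 2
one_fold_num = n_windows // 5    # 4 windows per fold; the 3 leftover windows are dropped
all_idx = np.arange(one_fold_num * 5)
test_idx = np.arange(one_fold_num * fold, one_fold_num * (fold + 1))
train_idx = np.delete(all_idx, test_idx)

print(test_idx)                  # windows 8..11 go to the test fold
# train and test indices are disjoint and together cover 20 windows
assert len(train_idx) == 16 and not set(train_idx) & set(test_idx)
```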

Can you please confirm that I understood it correctly?

Thanks so much for your time and response!