ELEMKEP / bsc_lcs

Implementation of Brain Signal Classification via Learning Connectivity Structure
MIT License

Question: Train Test Split #3

Closed msseibel closed 3 years ago

msseibel commented 3 years ago

Why I am writing this: I like the idea of learning connectivity and am considering using it in my research, but I am skeptical about your results and about whether the model would generalize to unseen subjects.

In particular, I don't understand your train/test split.

  1. Is the train/test split defined in the lmdb_generate_rev.py file?
  2. What is the meaning of the indices_list in lmdb_generate_rev? Since the largest index is 58, I assume the indices represent different (overlapping) time frames cropped from the 1-minute recordings. That would mean every 1-minute recording is part of the train set as well as the test set. For example, the indices [0, 12, 0, 10] would mean that segments 12 to 58 are used for training and segments 0 to 10 are used for testing.
  3. Following on from question 2: I assume the set of training subjects is not disjoint from the set of subjects evaluated during testing. AFAIK the BCI community also performs such experiments for motor imagery and calls it within-subject accuracy. However, we are usually also interested in how well the neural network performs across subjects. So what about five experiments in which, e.g., 26 subjects are used for training and 32 - 26 = 6 subjects are used for testing? (Maybe I have missed that experiment.)
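
To make the across-subjects experiment in point 3 concrete, here is a minimal sketch of a subject-wise split. It is not code from the repository: the function name `subject_folds`, the seed, and the round-robin fold assignment are assumptions for illustration. Only the 32 subjects and five experiments come from the question. Because 32 is not divisible by 5, the held-out groups have 6 or 7 subjects rather than exactly 6.

```python
import random


def subject_folds(n_subjects=32, n_folds=5, seed=0):
    """Yield (train_subjects, test_subjects); no subject is in both lists.

    Hypothetical helper: subject IDs are assumed to be 0..n_subjects-1.
    """
    subjects = list(range(n_subjects))
    random.Random(seed).shuffle(subjects)  # fixed seed for reproducibility
    for k in range(n_folds):
        # Round-robin assignment: every n_folds-th shuffled subject, offset k.
        test = sorted(subjects[k::n_folds])
        train = sorted(set(subjects) - set(test))
        yield train, test


for train, test in subject_folds():
    print(f"train on {len(train)} subjects, test on held-out subjects {test}")
```

All recordings and segments of a held-out subject would go to the test set, so the reported accuracy reflects generalization to unseen subjects.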

ELEMKEP commented 3 years ago

Thanks for your interest!

  1. Actually, lmdb_generate_rev.py is still being revised, so it is not complete yet. I will push the code to the master branch once the work is finished.
  2. What I'm working on is implementing k-fold validation in the time domain.
  3. You're right. The data-splitting scheme in this work is called "subject-dependent": it does not explicitly separate training subjects from test subjects but mixes them. This setting is widely used in the field of affective computing, alongside the "subject-independent" setting (the same as the "across subjects" setting you mentioned), so I adopted it.
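
For readers following along, k-fold validation in the time domain (point 2) could look roughly like the sketch below. This is not the revised lmdb_generate_rev.py, which is unfinished. The name `time_folds`, the count of 59 segments (indices 0 to 58, taken from the question), the five folds, and the `gap` parameter are all assumptions. The gap drops training segments next to the test block, which may matter if neighbouring segments overlap in time.

```python
def time_folds(n_segments=59, n_folds=5, gap=1):
    """Yield (train_idx, test_idx) over the segment indices of one recording.

    Hypothetical helper: each fold holds out one contiguous block of
    segments; `gap` segments on either side of the block are excluded from
    training so overlapping windows do not leak across the boundary.
    """
    base, extra = divmod(n_segments, n_folds)
    start = 0
    for k in range(n_folds):
        # The first `extra` folds get one additional segment.
        stop = start + base + (1 if k < extra else 0)
        test = list(range(start, stop))
        train = [i for i in range(n_segments)
                 if i < start - gap or i >= stop + gap]
        yield train, test
        start = stop


for train, test in time_folds():
    print(f"test segments {test[0]}-{test[-1]}, {len(train)} training segments")
```

The same index split would be applied to every recording of every subject, so each subject contributes to both the training and the test set. That is the "subject-dependent" setting described in point 3.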