NeuroTechX / moabb

Mother of All BCI Benchmarks
https://neurotechx.github.io/moabb/
BSD 3-Clause "New" or "Revised" License

Sample chronologically by default #187

Open jsosulski opened 3 years ago

jsosulski commented 3 years ago

Currently, moabb stratifies the data and picks random samples from X for the training (T) and validation (V) sets. A simple assignment vector could look like this:

[T,T,T,V,T,V,V,T,T,V,T,V,T,T]

Especially in ERP datasets, where successive epochs overlap, this way of sampling leads to statistical information (e.g., signal mean and signal covariance) leaking between train and validation epochs. There are two ways to mitigate this issue:

  1. Do the train/validation assignment not on individual labels but at the run level, e.g., on a sequence of N flashes that were done in the underlying P300 speller experiment.
  2. If run information is not available, we could either infer it from marker timings or simply use contiguous folds, e.g.:
[T,T,T,T,T,T,T,T,T,V,V,V,V,V]

Then a leak from train to validation epochs would only be possible at the single boundary between the train and validation assignments.

Doing this will probably reduce the average performance achieved with all classifier types, but less so for classifiers with better generalization properties.
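To make the contrast concrete, here is a minimal sketch using plain scikit-learn on made-up labels (this is not MOABB's actual evaluation code); it prints the T/V assignment of the first fold for a shuffled stratified split versus an unshuffled contiguous split:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, KFold

# Toy binary labels standing in for 14 chronologically ordered ERP epochs.
y = np.array([0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0])
X = np.zeros((len(y), 1))  # placeholder features, only the indices matter here


def assignment(cv):
    """Return a T/V string for the first fold produced by a CV splitter."""
    _, val_idx = next(cv.split(X, y))
    marks = np.array(["T"] * len(y))
    marks[val_idx] = "V"
    return "[" + ",".join(marks) + "]"


# Current behaviour: shuffled stratified folds scatter validation epochs among
# training epochs, so overlapping neighbours share signal statistics.
print(assignment(StratifiedKFold(n_splits=5, shuffle=True, random_state=42)))

# Alternative 2 above: unshuffled contiguous folds confine the possible leak
# to the single boundary between the training and validation blocks.
print(assignment(KFold(n_splits=5, shuffle=False)))
```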

sylvchev commented 3 years ago

Yes, this is a step toward evaluations that are closer to real BCI use in an online setting. It could be an issue for within-session evaluation, but there is no problem for cross-session/cross-subject evaluation. Within-session evaluation relies on 5-fold cross-validation; are you suggesting switching to a single train-test split or to something like a group k-fold?

jsosulski commented 3 years ago

GroupKFold would be ideal when we have a meaningful segmentation of the experiment, e.g., one stimulation sequence to detect a new letter. However, in the absence of this information for each dataset - and if I understand StratifiedKFold correctly - we could also just set shuffle=False in the cross-validation call?
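For illustration, a rough sketch of both options with plain scikit-learn, assuming hypothetical run labels and toy epoch data (this is not how MOABB currently builds its splits):

```python
import numpy as np
from sklearn.model_selection import GroupKFold, StratifiedKFold

# Invented toy data: 6 runs of 10 epochs each (not real MOABB structures).
rng = np.random.default_rng(0)
n_runs, epochs_per_run = 6, 10
X = rng.standard_normal((n_runs * epochs_per_run, 8))   # toy epoch features
y = rng.integers(0, 2, size=n_runs * epochs_per_run)    # toy target/non-target labels
runs = np.repeat(np.arange(n_runs), epochs_per_run)     # run index of each epoch

# Option 1: GroupKFold keeps whole runs together, so no run is split across
# training and validation.
for train_idx, val_idx in GroupKFold(n_splits=3).split(X, y, groups=runs):
    print("validation runs:", np.unique(runs[val_idx]))

# Option 2 (no run information): StratifiedKFold with shuffle=False takes the
# samples of each class in their original order, so with chronologically ordered
# epochs each validation fold covers a roughly contiguous stretch of the session.
cv = StratifiedKFold(n_splits=5, shuffle=False)
train_idx, val_idx = next(cv.split(X, y))
print("first validation indices:", val_idx)
```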

jsosulski commented 3 years ago

I just found out about TimeSeriesSplit, which sounds interesting, as it both allows plotting a kind of learning curve and reflects an online BCI setup. See this plot from the sklearn docs:

[Image: TimeSeriesSplit cross-validation indices plot from the scikit-learn documentation]
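As a quick illustration on toy data (not MOABB code), each successive TimeSeriesSplit fold trains on all earlier epochs and validates on the next chronological block, which is what gives the learning-curve-like structure in the plot:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Stand-in for 20 chronologically ordered epochs; only the indices matter here.
X = np.arange(20).reshape(-1, 1)

# Each fold trains on everything up to a point in time and validates on the
# following block, mimicking an online BCI session of growing training size.
for fold, (train_idx, val_idx) in enumerate(TimeSeriesSplit(n_splits=4).split(X)):
    print(f"fold {fold}: train={train_idx.min()}..{train_idx.max()}, "
          f"validate={val_idx.min()}..{val_idx.max()}")
```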