NDCLab / pepper-pipeline

tool | Python Easy Pre-Processing EEG Reproducible Pipeline
GNU Affero General Public License v3.0

final-reject ValueError #219

Open F-said opened 3 years ago

F-said commented 3 years ago

Describe the bug: For the sub-NDARAB793GL3_ses-01_task-Video1_run-01_eeg file, the following error occurs during final_reject:

ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.
Computing thresholds ...: 0%| | 0/129 [00:00<?, ?it/s]

Expected behavior: The pipeline should run to completion without crashing.

Screenshots: The full error message is as follows:

File "C:\Users\saidmf\AppData\Roaming\Python\Python38\site-packages\joblib\parallel.py", line 827, in dispatch_one_batch
    tasks = self._ready_batches.get(block=False)
  File "C:\Users\saidmf\anaconda3\lib\queue.py", line 167, in get
    raise Empty
_queue.Empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "c:/Users/saidmf/Desktop/NDC/baseEEG/run.py", line 32, in <module>
    eeg_obj, outputs[idx] = getattr(preprocess, func)(eeg_obj, **params)
  File "c:\Users\saidmf\Desktop\NDC\baseEEG\scripts\preprocess\preprocess.py", line 292, in final_reject_epoch
    autoRej.fit(epochs)
  File "C:\Users\saidmf\anaconda3\lib\site-packages\autoreject\autoreject.py", line 985, in fit
    _run_local_reject_cv(epochs, thresh_func, this_picks,
  File "C:\Users\saidmf\anaconda3\lib\site-packages\autoreject\autoreject.py", line 747, in _run_local_reject_cv       
    local_reject.fit(epochs)
  File "C:\Users\saidmf\anaconda3\lib\site-packages\autoreject\autoreject.py", line 661, in fit
    self.threshes_ = self.thresh_func(
  File "C:\Users\saidmf\anaconda3\lib\site-packages\autoreject\autoreject.py", line 441, in _compute_thresholds        
    threshes = parallel(
  File "C:\Users\saidmf\AppData\Roaming\Python\Python38\site-packages\joblib\parallel.py", line 1048, in __call__      
    if self.dispatch_one_batch(iterator):
  File "C:\Users\saidmf\AppData\Roaming\Python\Python38\site-packages\joblib\parallel.py", line 866, in dispatch_one_batch
    self._dispatch(tasks)
  File "C:\Users\saidmf\AppData\Roaming\Python\Python38\site-packages\joblib\parallel.py", line 784, in _dispatch      
    job = self._backend.apply_async(batch, callback=cb)
  File "C:\Users\saidmf\AppData\Roaming\Python\Python38\site-packages\joblib\_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "C:\Users\saidmf\AppData\Roaming\Python\Python38\site-packages\joblib\_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "C:\Users\saidmf\AppData\Roaming\Python\Python38\site-packages\joblib\parallel.py", line 262, in __call__       
    return [func(*args, **kwargs)
  File "C:\Users\saidmf\AppData\Roaming\Python\Python38\site-packages\joblib\parallel.py", line 262, in <listcomp>     
    return [func(*args, **kwargs)
  File "C:\Users\saidmf\anaconda3\lib\site-packages\autoreject\autoreject.py", line 353, in _compute_thresh
    best_thresh, _ = bayes_opt(func, initial_x,
  File "C:\Users\saidmf\anaconda3\lib\site-packages\autoreject\bayesopt.py", line 45, in bayes_opt
    if not np.isinf(f(x)):
  File "C:\Users\saidmf\anaconda3\lib\site-packages\autoreject\autoreject.py", line 343, in func
    obj = -np.mean(cross_val_score(est, this_data, y=y, cv=cv))
  File "C:\Users\saidmf\anaconda3\lib\site-packages\sklearn\utils\validation.py", line 72, in inner_f
    return f(**kwargs)
  File "C:\Users\saidmf\anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 401, in cross_val_score
    cv_results = cross_validate(estimator=estimator, X=X, y=y, groups=groups,
  File "C:\Users\saidmf\anaconda3\lib\site-packages\sklearn\utils\validation.py", line 72, in inner_f
    return f(**kwargs)
  File "C:\Users\saidmf\anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 242, in cross_validate
    scores = parallel(
  File "C:\Users\saidmf\AppData\Roaming\Python\Python38\site-packages\joblib\parallel.py", line 1048, in __call__      
    if self.dispatch_one_batch(iterator):
  File "C:\Users\saidmf\AppData\Roaming\Python\Python38\site-packages\joblib\parallel.py", line 838, in dispatch_one_batch
    islice = list(itertools.islice(iterator, big_batch_size))
  File "C:\Users\saidmf\anaconda3\lib\site-packages\sklearn\model_selection\_validation.py", line 242, in <genexpr>    
    scores = parallel(
  File "C:\Users\saidmf\anaconda3\lib\site-packages\sklearn\model_selection\_split.py", line 1341, in split
    for train, test in self._iter_indices(X, y, groups):
  File "C:\Users\saidmf\anaconda3\lib\site-packages\sklearn\model_selection\_split.py", line 1668, in _iter_indices    
    raise ValueError("The least populated class in y has only 1"
ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.
stevenwtolbert commented 3 years ago

So it appears that autoreject splits the data into K folds (10 by default), and within each fold you have a train set and a test set.

WITHIN THE TRAIN SET OF EACH FOLD:
"apply threshold τi to reject trials in the train set"
"calculate the mean of the signal (for each sensor and timepoint) over the GOOD (=not rejected) trials in the train set"

WITHIN THE TEST SET OF EACH FOLD:
"calculate the median of the signal (for each sensor and timepoint) over ALL trials in the test set"

WITHIN EACH FOLD: "compare both of these signals and calculate the error ek (i.e., take the Frobenius norm of their difference)"
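
To make the quoted steps concrete, here is a minimal NumPy sketch of the per-fold error for one candidate threshold. It is not autoreject's actual code: the function name `fold_error`, the peak-to-peak rejection rule, and returning infinity when no trial survives are all assumptions made for illustration.

```python
import numpy as np


def fold_error(train, test, thresh):
    """Error e_k for one fold and one candidate threshold (illustrative only).

    train, test: arrays of shape (n_trials, n_sensors, n_times).
    """
    # Peak-to-peak amplitude per trial and sensor (assumed rejection criterion).
    ptp = train.max(axis=-1) - train.min(axis=-1)
    # A trial is GOOD only if every sensor stays at or below the threshold.
    good = train[(ptp <= thresh).all(axis=1)]
    if len(good) == 0:
        # Nothing left to average; this sketch reports an infinite error.
        return np.inf
    mean_good = good.mean(axis=0)          # mean over GOOD train trials
    median_test = np.median(test, axis=0)  # median over ALL test trials
    # For a 2-D input, np.linalg.norm defaults to the Frobenius norm.
    return np.linalg.norm(mean_good - median_test)


rng = np.random.default_rng(0)
train = rng.normal(size=(8, 2, 50))  # synthetic data, not from the issue
test = rng.normal(size=(2, 2, 50))

for thresh in (0.0, 4.0, 100.0):
    print(thresh, fold_error(train, test, thresh))
```

With a threshold of 0.0 every training trial is rejected, which is the degenerate case the hypothesis below is concerned with.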

What I THINK is happening:

If applying threshold τi to the train set ends up rejecting all channels, we aren't able to compute the mean of the non-rejected channels and therefore can't find an error for that fold. I assume autoreject handles this by marking rejected vs. non-rejected channels with the label variable y. If that is correct, a similar error should occur when all channels are accepted.
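
One way to test this idea is to reproduce the scikit-learn side in isolation. The traceback ends in `_iter_indices` in sklearn's `_split.py`, and the message matches what `StratifiedShuffleSplit` raises when some class in `y` has a single member. The sketch below uses made-up labels; what autoreject actually passes as `y` is not confirmed in this thread, and `singleton_classes` is a hypothetical helper.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit


def singleton_classes(y):
    """Return the labels in y that occur fewer than two times (hypothetical helper)."""
    labels, counts = np.unique(y, return_counts=True)
    return labels[counts < 2].tolist()


# Made-up labels: ten samples of class 1 and a single sample of class 2.
X = np.zeros((11, 1))
y = np.array([1] * 10 + [2])

cv = StratifiedShuffleSplit(n_splits=10, test_size=0.2, random_state=0)
try:
    next(cv.split(X, y))  # the check runs when the first split is requested
    message = None
except ValueError as err:
    message = str(err)

print("classes with fewer than 2 members:", singleton_classes(y))
print(message)
```

Running `singleton_classes` on whatever `y` reaches `cross_val_score` for the failing file would show which label is underpopulated.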

I'll keep looking into this.