skorch-dev / skorch

A scikit-learn compatible neural network library that wraps PyTorch
BSD 3-Clause "New" or "Revised" License

RandomizedSearchCV not working #794

Closed bokey007 closed 2 years ago

bokey007 commented 3 years ago

Hi, I am trying to tune the hyperparameters of my model with RandomizedSearchCV.

But I am getting the following error: TypeError: __init__() got an unexpected keyword argument 'on_train'

Looking forward to your help.

Thanks and regards, Bokey

BenjaminBossan commented 3 years ago

Could you please share your code (especially the net initialization and the hyper-params)?

bokey007 commented 3 years ago
import torch
from braindecode.util import set_random_seeds
from braindecode.models import ShallowFBCSPNet

from braindecode.models import EEGNetv4

cuda = torch.cuda.is_available()  # check if GPU is available, if True chooses to use it
device = 'cuda' if cuda else 'cpu'
if cuda:
    torch.backends.cudnn.benchmark = True
seed = 20200220  # random seed to make results reproducible
# Set random seed to be able to reproduce results
set_random_seeds(seed=seed, cuda=cuda)

n_classes = 4
# Extract number of chans and time steps from dataset
n_chans = train_set[0][0].shape[0]
input_window_samples = train_set[0][0].shape[1]

model = EEGNetv4(
    n_chans,
    n_classes,
    input_window_samples=input_window_samples,
    final_conv_length='auto',
)

# Send model to GPU
if cuda:
    model.cuda()

from skorch.callbacks import LRScheduler, EarlyStopping, Checkpoint
from skorch.helper import predefined_split

from braindecode import EEGClassifier

lr = 0.0625 * 0.01
weight_decay = 0

batch_size = 64
n_epochs = 300

net = EEGClassifier(
    model,
    criterion=torch.nn.NLLLoss,
    optimizer=torch.optim.AdamW,
    #train_split=predefined_split(valid_set),  # using valid_set for validation
    optimizer__lr=lr,
    optimizer__weight_decay=weight_decay,
    batch_size=batch_size,
    callbacks=[
        "accuracy", ("lr_scheduler", LRScheduler('CosineAnnealingLR', T_max=n_epochs - 1)),
        ("early_stopping", EarlyStopping(patience=100)),
        ("chpt save best", Checkpoint(dirname='CKP0'))

    ],
    device=device,
)

from sklearn.pipeline import Pipeline
from sklearn.model_selection import RandomizedSearchCV
from scipy import stats

pipe = Pipeline([('net', net)])

NUM_CV_STEPS = 10

################################## tuning
pipe.set_params(net__verbose=0, net__train_split=None)

params = {

    'net__lr': [10**(-stats.uniform(1, 5).rvs()) for _ in range(NUM_CV_STEPS)],
    #'clf__max_epochs': [5, 10],
}

search = RandomizedSearchCV(
    net, params, n_iter=NUM_CV_STEPS, verbose=2, refit=False, scoring='accuracy', cv=3)

search.fit(train_set, y=None)

Error message:

File "/home/bokey/anaconda3/envs/BCI_torch_mnnet_tf2/lib/python3.8/site-packages/sklearn/base.py", line 77, in clone
    new_object = klass(**new_object_params)

TypeError: __init__() got an unexpected keyword argument 'on_train'

BenjaminBossan commented 3 years ago

This line:

"accuracy", ("lr_scheduler", LRScheduler('CosineAnnealingLR', T_max=n_epochs - 1)),

is broken. You cannot have the string "accuracy" as a callback. Could you check whether removing it solves the issue?

If you want to track accuracy, consider using the EpochScoring callback.

bokey007 commented 3 years ago

Error after removing it:

File "/home/bokey/anaconda3/envs/BCI_torch_mnnet_tf2/lib/python3.8/site-packages/skorch/net.py", line 1576, in _check_kwargs
    raise TypeError(full_msg)

TypeError: __init__() got unexpected argument(s) _last_window_inds. Either you made a typo, or you added new arguments in a subclass; if that is the case, the subclass should deal with the new arguments explicitly.

BenjaminBossan commented 3 years ago

This _last_window_inds seems to come from braindecode. I have no idea what it does or if it's necessary. If you need to keep that, honestly, the easiest thing would probably be to override _check_kwargs to ignore this variable:

class MyNet(EEGClassifier):
    def _check_kwargs(self, kwargs):
        if '_last_window_inds' in kwargs:
            kwargs = kwargs.copy()
            del kwargs['_last_window_inds']
        return super()._check_kwargs(kwargs)

Also, could you check if '_last_window_inds' in net.get_params(deep=False)? If so, there might be another solution.
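
The check suggested above can be illustrated with a small standard-library sketch. It assumes, for illustration only, an estimator whose get_params reports every instance attribute, including one that the constructor does not accept. ToyEstimator, toy_clone and find_unexpected_params are hypothetical names, not skorch, braindecode or scikit-learn code; toy_clone only mimics the klass(**new_object_params) step visible in the earlier traceback.

```python
import inspect


class ToyEstimator:
    """Hypothetical stand-in for an estimator (not skorch or braindecode)."""

    def __init__(self, lr=0.01):
        self.lr = lr
        # Attribute that is not a constructor argument.
        self._last_window_inds = None

    def get_params(self, deep=True):
        # Assumption: all instance attributes are reported as parameters.
        return dict(vars(self))


def toy_clone(estimator):
    """Mimic the cloning step from the traceback: rebuild from get_params."""
    params = estimator.get_params(deep=False)
    return type(estimator)(**params)


def find_unexpected_params(estimator):
    """Return reported parameter names that __init__ does not accept."""
    init_signature = inspect.signature(type(estimator).__init__)
    accepted = set(init_signature.parameters) - {"self"}
    reported = set(estimator.get_params(deep=False))
    return sorted(reported - accepted)


est = ToyEstimator()
unexpected = find_unexpected_params(est)
print(unexpected)

try:
    toy_clone(est)
    clone_failed = False
except TypeError as exc:
    clone_failed = True
    print(f"clone failed: {exc}")
```

In this toy setup the extra name appears among the reported parameters and the rebuild fails, which is the situation the question above is probing. The same find_unexpected_params helper could be pointed at a real estimator, since it relies only on get_params(deep=False) and the constructor signature.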

BenjaminBossan commented 2 years ago

Any updates @bokey007?

bruAristimunha commented 2 years ago

Hello @BenjaminBossan,

I am having a similar problem. I was trying to adapt a braindecode tutorial to include a GridSearchCV and got the error:

TypeError: get_params() got an unexpected keyword argument 'deep'

I tried your solution above and subclassed EEGClassifier, but calling get_params(deep=False) directly on the braindecode estimator works. It seems that something is lost when EEGClassifier inherits from skorch.

https://colab.research.google.com/drive/1tPgJJqOp8HA5BFek9NhslKYKdH4ak2XM?usp=sharing

BenjaminBossan commented 2 years ago

Just so I understand correctly: When you run clf.get_params(deep=True), everything works correctly, but in the grid search, you get TypeError: get_params() got an unexpected keyword argument 'deep'? That looks very strange to me.

Could you please try:

from sklearn.base import clone
clone(clf)

If it fails, can you enter debug mode and inspect what the variable estimator is?
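
One way such an error could arise, offered here only as an assumption since the thread does not show the failing object: cloning also calls get_params(deep=False) on nested parameter values that expose a get_params method, so a nested object whose get_params takes no deep argument would raise this TypeError. The sketch below uses only the standard library to look for such values; Inner, Outer, accepts_deep and find_incompatible_params are hypothetical names.

```python
import inspect


def accepts_deep(obj):
    """Check whether obj.get_params can be called with a `deep` keyword."""
    parameters = inspect.signature(obj.get_params).parameters
    takes_kwargs = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in parameters.values()
    )
    return "deep" in parameters or takes_kwargs


def find_incompatible_params(estimator):
    """Names of parameter values whose get_params() lacks a `deep` argument."""
    incompatible = []
    for name, value in estimator.get_params(deep=False).items():
        if isinstance(value, type):
            continue  # classes are skipped, only instances are inspected
        if hasattr(value, "get_params") and not accepts_deep(value):
            incompatible.append(name)
    return sorted(incompatible)


class Inner:
    """Hypothetical nested object with a non-standard get_params."""

    def get_params(self):  # no `deep` argument
        return {}


class Outer:
    """Hypothetical estimator holding the nested object as a parameter."""

    def __init__(self, module=None, lr=0.1):
        self.module = module
        self.lr = lr

    def get_params(self, deep=True):
        return {"module": self.module, "lr": self.lr}


suspects = find_incompatible_params(Outer(module=Inner()))
print(suspects)
```

The helper could be called on a real estimator as find_incompatible_params(clf), since it relies only on get_params(deep=False); an empty list would suggest the cause lies elsewhere.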

bruAristimunha commented 2 years ago

Exactly! This is what was returned when I tried to clone clf:

image

When I inspected the value in the debug mode, I got this return:

image

After debugging for a while, I discovered that a submodule inside braindecode caused it, so it is not related to skorch.

I appreciate your help!

BenjaminBossan commented 2 years ago

Thanks @bruAristimunha for following up. I'll close the issue for now, since it seems to be unrelated to skorch.

Just one note: in the screenshot there is a typo, get_parars instead of get_params, but that's probably unrelated to the initial problem.