YangWangsky / tf_EEGLearn

A TensorFlow implementation of EEGLearn
MIT License

Question about the selection of hyperparameters #2

Closed luohongyu closed 5 years ago

luohongyu commented 5 years ago

Hi:

Thanks for the GitHub repository with the code for this paper. It's very well organized and helps me a lot!

But I have one more practical question about the selection of hyperparameters. In your code, all the parameters are already fixed, but the original paper only describes the specific architecture and does not give concrete parameter values.

I tried to run your model directly on my data, and it gives me only chance-level accuracy. I'm quite confused about where I should start tuning the hyperparameters and what ranges I should use for a grid search. Based on your experience, do you have any suggestions about which parameters might be most useful for improving the model?

Thanks and best,

Luo

YangWangsky commented 5 years ago

Hi:

The hyperparameters used in train.py are basically derived from the description in the paper and from the Theano-based code provided by the original author.

In general, the hyperparameters of deep learning can be divided into two parts: one controls the training process (such as the learning rate and batch size), and the other concerns the structure and size of the model.

For the learning rate, for example, you can try 1e-2, 1e-3, 5e-4, 1e-4, etc.; it determines the speed of the gradient updates.
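For instance, a grid search just loops over those candidates and keeps the one with the best score. A toy sketch in TF1 style, to match this repo; the tiny logistic-regression model and random data are stand-ins for the real EEGLearn graph and your data:

```python
import numpy as np
import tensorflow as tf

# Stand-in data; replace with your own arrays and graph-building code.
X = np.random.randn(200, 10).astype(np.float32)
y = (X[:, 0] > 0).astype(np.int64)

for lr in [1e-2, 1e-3, 5e-4, 1e-4]:
    tf.reset_default_graph()  # rebuild the graph fresh for each candidate
    x_ph = tf.placeholder(tf.float32, [None, 10])
    y_ph = tf.placeholder(tf.int64, [None])
    logits = tf.layers.dense(x_ph, 2)
    loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=y_ph, logits=logits))
    train_op = tf.train.AdamOptimizer(learning_rate=lr).minimize(loss)
    acc = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(logits, axis=1), y_ph),
                                 tf.float32))
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(100):
            sess.run(train_op, {x_ph: X, y_ph: y})
        print('lr=%g  train_acc=%.3f' % (lr, sess.run(acc, {x_ph: X, y_ph: y})))
```

In practice you would compare validation accuracy rather than training accuracy, since the point is to pick the rate that generalizes best.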

As for the model parameters, the structure of the model determines which tasks it applies to, while the depth of the network and the number of filters in the conv layers determine its fitting capacity. Generally speaking, the deeper the network and the more filters it has, the stronger the model's ability to fit; on the other hand, that also makes it easier to overfit on small datasets.

So you need to balance the size of the model based on the size of your dataset.
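One way to do that is to expose depth and width as explicit knobs and shrink them until validation accuracy stops improving. A sketch under those assumptions; `n_layers` and `n_filters` are illustrative names, not variables from train.py:

```python
import tensorflow as tf

# TF1-style conv stack with capacity as explicit arguments. Halving
# n_filters or dropping a layer is a quick way to shrink the model
# when the dataset is small.
def small_conv_stack(x, n_layers=2, n_filters=16):
    for i in range(n_layers):
        x = tf.layers.conv2d(x, filters=n_filters * (2 ** i), kernel_size=3,
                             padding='same', activation=tf.nn.relu,
                             name='conv%d' % i)
        x = tf.layers.max_pooling2d(x, pool_size=2, strides=2,
                                    name='pool%d' % i)
    return x

# EEGLearn-style input: 32x32 images with 3 frequency-band channels.
images = tf.placeholder(tf.float32, [None, 32, 32, 3])
features = small_conv_stack(images, n_layers=2, n_filters=16)
```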

More importantly, since EEGLearn uses frequency-domain features, you need to make sure that frequency-domain features are actually informative for your task.
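A quick way to check that is to see whether a simple linear classifier beats chance on plain band-power features (the theta/alpha/beta bands EEGLearn builds its images from). The sketch below is generic numpy/scipy/sklearn code, not part of this repo; the sampling rate, band edges, and array shapes are assumptions to adapt to your recordings:

```python
import numpy as np
from scipy.signal import welch
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

fs = 160  # sampling rate in Hz -- an assumption; use your recording's rate
bands = {'theta': (4, 7), 'alpha': (8, 13), 'beta': (13, 30)}

def band_power_features(trials):
    # trials: (n_trials, n_channels, n_samples)
    f, psd = welch(trials, fs=fs, nperseg=min(256, trials.shape[-1]), axis=-1)
    feats = [psd[..., (f >= lo) & (f <= hi)].mean(axis=-1)
             for lo, hi in bands.values()]
    return np.concatenate(feats, axis=1)  # (n_trials, 3 * n_channels)

# Fake data just so the sketch runs; replace with your trials and labels.
trials = np.random.randn(100, 64, 640)
labels = np.random.randint(0, 2, 100)
scores = cross_val_score(LogisticRegression(max_iter=1000),
                         band_power_features(trials), labels, cv=5)
print('CV accuracy: %.3f' % scores.mean())
```

If this stays at chance, no amount of hyperparameter tuning on EEGLearn is likely to help, because the frequency-domain information simply isn't there.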

Best wishes.

luohongyu commented 5 years ago

Hi:

thanks for the reply!

I tried to simply plug my data into your model (just replacing the train, validation, and test inputs with data of the same dimensions you mentioned), but the training results look really strange to me. The training accuracy keeps getting lower and lower, while the validation and test accuracy stay at exactly 50%. At the same time, you can see that the loss keeps decreasing.

I have no idea why it looks like this. I have used PyTorch but never TensorFlow before, so I don't know if there could be a bug in my code? Basically I didn't touch anything except the input.

Do you have any idea why it behaves like this?

I really appreciate your help.

Best


```
Epoch 1 of 60 took 6.685s
Train Epoch [1/60] train_Loss: 16.0912 train_Acc: 88.97
Val   Epoch [1/60] val_Loss: 37.3079 val_Acc: 50.00
Test  Epoch [1/60] test_Loss: 37.3095 test_Acc: 50.00

Epoch 2 of 60 took 2.708s
Train Epoch [2/60] train_Loss: 5.9445 train_Acc: 77.63
Val   Epoch [2/60] val_Loss: 17.9065 val_Acc: 50.00
Test  Epoch [2/60] test_Loss: 17.9028 test_Acc: 50.00

Epoch 3 of 60 took 2.741s
Train Epoch [3/60] train_Loss: 3.3270 train_Acc: 66.67
Val   Epoch [3/60] val_Loss: 8.7729 val_Acc: 50.00
Test  Epoch [3/60] test_Loss: 8.7700 test_Acc: 50.00

Epoch 4 of 60 took 2.770s
Train Epoch [4/60] train_Loss: 1.6384 train_Acc: 56.31
Val   Epoch [4/60] val_Loss: 1.4552 val_Acc: 50.00
Test  Epoch [4/60] test_Loss: 1.4548 test_Acc: 50.00

Epoch 5 of 60 took 2.768s
Train Epoch [5/60] train_Loss: 1.0130 train_Acc: 46.75
Val   Epoch [5/60] val_Loss: 1.1150 val_Acc: 50.00
Test  Epoch [5/60] test_Loss: 1.1149 test_Acc: 50.00

Epoch 6 of 60 took 2.781s
Train Epoch [6/60] train_Loss: 1.0584 train_Acc: 41.79
Val   Epoch [6/60] val_Loss: 0.8289 val_Acc: 50.00
Test  Epoch [6/60] test_Loss: 0.8289 test_Acc: 50.00
```

YangWangsky commented 5 years ago

What is your dataset? Make sure your task has something in common with the task in the original paper, and do similar preprocessing.
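Before tuning anything, it is also worth ruling out data problems: chance-level accuracy with a falling loss often traces back to imbalanced labels, unscaled inputs, or NaNs. A generic sketch of such checks; `X_train` and `y_train` are stand-ins for however you load your own arrays:

```python
import numpy as np

# Stand-in arrays just so the sketch runs; replace with your own data.
X_train = np.random.randn(100, 32, 32, 3).astype(np.float32)
y_train = np.random.randint(0, 2, 100)

print('shape:', X_train.shape, 'dtype:', X_train.dtype)
print('label counts:', np.bincount(y_train))  # heavy imbalance pins acc near chance
print('mean %.3f / std %.3f' % (X_train.mean(), X_train.std()))  # unscaled inputs?
print('any NaNs:', bool(np.isnan(X_train).any()))
```

Beyond that, a few suggestions: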

  1. You can try to reduce the learning rate; this looks a bit like oscillation caused by a learning rate that is too high.

  2. Change mode_type to 'cnn' and reduce the size of the model.

  3. You can run `tensorboard --logdir runs` under /EEGLearn to view the monitoring curves of the loss and accuracy.

MohammadJavadD commented 5 years ago

Hi everyone,

I have this problem too. I am using the "EEG Motor Movement/Imagery Dataset" (https://www.physionet.org/content/eegmmidb/1.0.0/), which is a popular dataset; other people have reached ~80% accuracy, and I got that accuracy with braindecode (CNN). So I am trying EEGLearn (RNN-CNN), but I'm still at chance level. Do you have any suggestions?