Closed luohongyu closed 5 years ago
Hi:
The hyperparameters used in train.py are basically taken from the description in the paper and from the Theano-based code provided by the original author.
In general, deep-learning hyperparameters fall into two groups: those that control the training process, such as the learning rate and batch size, and those that define the structure and size of the model.
For the learning rate, you can try 1e-2, 1e-3, 5e-4, 1e-4, etc.; it determines how fast the gradient updates the weights.
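As a minimal sketch of such a sweep (using a toy quadratic objective as a stand-in for the real training loop — the actual train.py loop is more involved), you can loop over the candidate rates and keep the one with the lowest final loss:

```python
# Toy learning-rate sweep: gradient descent on f(w) = w^2 (gradient 2*w).
# A real sweep would call the EEGLearn training loop here instead.
def final_loss(lr, steps=50, w0=1.0):
    w = w0
    for _ in range(steps):
        w -= lr * 2 * w   # one gradient step
    return w * w          # loss after training

candidates = [1e-2, 1e-3, 5e-4, 1e-4]
results = {lr: final_loss(lr) for lr in candidates}
best_lr = min(results, key=results.get)
print(best_lr)  # on this toy problem the largest stable rate wins
```

On a real model the ordering is rarely this clean, which is why you compare validation accuracy across the whole grid rather than assuming bigger is better.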
As for the model parameters, the structure of the model determines which tasks it suits, while the network depth and the number of filters in the convolutional layers determine its fitting capacity. Generally speaking, the deeper the network and the more filters it has, the stronger the model's ability to fit; on the other hand, that also makes it easier to overfit on small datasets.
So you need to balance the size of the model against the size of your dataset.
More importantly, since EEGLearn uses frequency-domain features, you need to make sure that such features are actually informative for your task.
Best wishes.
Hi:
thanks for the reply!
I tried to simply plug my data into your model (just by replacing the train, validation, and test inputs with data of the same dimensions you mentioned), but the training results look really strange to me: the training accuracy keeps getting lower, and train and test accuracy always stay at exactly 50%, yet at the same time the loss keeps decreasing.
I have no idea why it looks like this. I have used PyTorch but never TensorFlow before, so I don't know whether there could be a bug in my code; basically I didn't touch anything except the input.
Do you have any idea why this could happen?
i really appreciate your help
best
What is your dataset? Make sure your task has something in common with the task in the original paper, and do similar preprocessing.
You can try to reduce the learning rate; this looks a bit like the oscillation caused by an excessive learning rate.
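That oscillation is easy to reproduce on a toy objective: with f(w) = w², a learning rate above 1.0 makes every gradient step overshoot the minimum, so the loss grows instead of shrinking. This is only an illustration of the effect, not the EEGLearn training loop:

```python
def run(lr, steps=10, w0=1.0):
    """Gradient descent on f(w) = w^2; returns the loss after each step."""
    w = w0
    history = []
    for _ in range(steps):
        w -= lr * 2 * w        # gradient of w^2 is 2*w
        history.append(w * w)  # loss after the step
    return history

small = run(0.1)  # stable: loss shrinks every step
large = run(1.1)  # unstable: w flips sign each step and |w| grows
print(small[-1] < small[0], large[-1] > large[0])  # → True True
```

If your accuracy curve jumps around while the loss barely improves, dropping the learning rate by a factor of 2–10 is usually the first thing to try.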
Change `model_type` to `'cnn'` and reduce the size of the model.
You can run `tensorboard --logdir runs` under `/EEGLearn` to view the monitoring curves for loss and accuracy.
Hi everyone,
I have this problem too. I am using the "EEG Motor Movement/Imagery Dataset" (https://www.physionet.org/content/eegmmidb/1.0.0/), which is a popular dataset; other people have reached ~80% accuracy, and I got that accuracy with braindecode (CNN). So I am trying EEGLearn (RNN-CNN), but I'm still at chance level. Do you have any suggestions?
Hi:
Thanks for the GitHub repository with the code for this paper. It's very well organized and has helped me a lot!
But I have one more practical question about the selection of hyperparameters. In your code, all the parameters are already fixed, while the original paper only describes the specific architecture without giving concrete parameter values.
I tried to run your model directly on my data, and it gives me only chance accuracy. I'm now quite confused about where I should start tuning the hyperparameters and what ranges I should use for a grid search. Based on your experience, do you have any suggestions about which parameters are likely to matter most for improving the model?
thanks and best
Luo