Closed shabu19 closed 2 years ago
https://jmlr.org/papers/v15/srivastava14a.html
Based on the original paper proposing dropout, keep the dropout rate between 0.2 and 0.5. As you can see, when you used 0.9 as your dropout rate, overfitting occurred, which is not good and is the opposite of the intent of using dropout. With 0.5, your accuracy oscillates, and that might be due to two reasons: either your learning rate is too high and needs to be lowered, or your batch size is too small and needs to be increased.
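One detail worth keeping in mind when reading the paper alongside Keras code: Keras's `Dropout(rate)` argument is the fraction of units to *drop*, while the paper's p is the probability of *keeping* a unit, so `Dropout(0.9)` silences 90% of the activations. A minimal NumPy sketch of inverted dropout (the training-time variant Keras uses; the function and values here are illustrative, not the issue's code):

```python
# Sketch: inverted dropout, where `rate` is the probability of DROPPING
# a unit (the paper's p is the KEEP probability).
import numpy as np

def dropout(x, rate, rng):
    """Zero out a `rate` fraction of activations and rescale the rest."""
    keep = rng.random(x.shape) >= rate          # True where the unit survives
    return np.where(keep, x / (1.0 - rate), 0.0)

rng = np.random.default_rng(0)
x = np.ones(10_000)
for rate in (0.2, 0.5, 0.9):
    y = dropout(x, rate, rng)
    # roughly (1 - rate) of the units survive; rescaling keeps the mean near 1.0
    print(rate, (y != 0).mean().round(2), y.mean().round(2))
```

This is why a "dropout of 0.9" in Keras is far more aggressive than the 0.2-0.5 range the comment above recommends.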
@saichatla I tried different dropout rates; here are the results. I used a batch size of 64 and a learning rate of 0.01.
using dropout 0.4
using dropout 0.5
using dropout 0.6
using dropout 0.7
In all these cases the val_accuracy stagnates at 50%. I read somewhere that this might occur if the validation set does not have enough data, but we are talking about 3k+ different class folders with 200+ samples in each folder (UCF-101 dataset).
Another reason I read for this is that there might not be enough randomness in the validation set, which is also not the case.
What can be the potential reason for that? Thank you.
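One quick way to rule out an unlucky or insufficiently random split is to build the validation set with an explicit shuffled, per-class (stratified) split. A minimal sketch with toy stand-in labels (the helper name, sizes, and fraction are illustrative, not from the issue's code):

```python
# Sketch: shuffled stratified split, so every class is guaranteed to
# appear in the validation set with a fixed fraction of its samples.
import numpy as np

def stratified_split(labels, val_frac=0.2, seed=0):
    rng = np.random.default_rng(seed)
    train_idx, val_idx = [], []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)   # all samples of class c
        rng.shuffle(idx)                    # randomize within the class
        n_val = max(1, int(len(idx) * val_frac))
        val_idx.extend(idx[:n_val])
        train_idx.extend(idx[n_val:])
    return np.array(train_idx), np.array(val_idx)

labels = np.repeat(np.arange(5), 40)        # toy stand-in: 5 classes x 40 samples
train_idx, val_idx = stratified_split(labels)
print(np.unique(labels[val_idx]))           # every class shows up in validation
```

If a plain random split already covers every class with similar proportions, the validation set itself is probably not the culprit.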
Could you rerun these tests by lowering the learning rate to something small like .00001 and increasing the batch size. Thank you!
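For reference, the suggested change would look roughly like this in Keras; the stand-in model is a placeholder, not the issue's actual two-stream network, and the fit call is shown commented because the data generators live elsewhere:

```python
# Sketch of the suggested rerun: much smaller learning rate, larger batch.
from tensorflow.keras import Sequential, layers
from tensorflow.keras.optimizers import SGD

model = Sequential([layers.Input(shape=(4096,)),              # stand-in model
                    layers.Dense(101, activation="softmax")])
model.compile(optimizer=SGD(learning_rate=1e-5),              # was 0.01 / 0.1
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_gen, validation_data=val_gen, batch_size=128, epochs=50)
```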
Yes, I have run my model with 0.01 now; sorry, it was 0.1 previously. Moreover, I'm using a batch size of 128 for 50 classes only, because I can't use a batch size above 64 for 101 classes due to memory limitations. I will get back to you when I get the results. Thank you!
@saichatla This time I trained my model on 25 classes only, to get an idea of how it will work for 101 classes. I used a learning rate of 0.0001, a batch size of 128, and a dropout of 0.5 after both of the dense layers. The validation accuracy again stagnates at 60%.
@shabu19 Is this still an issue?
Could you please try with batch_size=32 and test with the latest TF version?
Thanks!
This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.
Closing as stale. Please reopen if you'd like to work on this further.
Hi, I'm trying to reproduce the results of a Keras Sequential model trained for Human Activity Recognition (a two-stream network, comprising spatial and temporal streams) on the UCF-101 dataset.
A generator is used to load the data.
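The issue doesn't include the generator itself, but a Keras-style infinite batch generator typically looks like the sketch below; `load_sample`, the shapes, and all names are stand-ins, not the actual loader:

```python
# Sketch: infinite, reshuffling batch generator of the kind Keras' fit()
# consumes; the per-sample loader is a placeholder.
import numpy as np

def load_sample(path):
    # Placeholder: real code would read and preprocess video frames here.
    return np.zeros((224, 224, 3), dtype=np.float32)

def batch_generator(paths, labels, batch_size, num_classes, rng=None):
    rng = rng or np.random.default_rng()
    order = np.arange(len(paths))
    while True:                          # Keras-style infinite generator
        rng.shuffle(order)               # reshuffle every epoch
        for start in range(0, len(order) - batch_size + 1, batch_size):
            idx = order[start:start + batch_size]
            x = np.stack([load_sample(paths[i]) for i in idx])
            y = np.eye(num_classes)[labels[idx]]            # one-hot labels
            yield x, y

paths = np.array([f"clip_{i}" for i in range(100)])          # toy data
labels = np.array([i % 5 for i in range(100)])
gen = batch_generator(paths, labels, batch_size=32, num_classes=5,
                      rng=np.random.default_rng(0))
x, y = next(gen)
print(x.shape, y.shape)   # (32, 224, 224, 3) (32, 5)
```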
The spatial stream is working fine and giving the reported accuracies.
While working on the temporal stream, training took too much time with the default model, and the val_acc fluctuates too much.
So I tried changing the dropout from 0.9 to 0.5 in the dense layers, and the results are:
By changing the dropout from 0.9 to 0.5, the model converged quite quickly, but val_acc stagnates at 50% while training accuracy keeps increasing, which I guess is a case of overfitting.
accuracy mentioned by the author is 80%.
What should the dropout value be, and how can I avoid overfitting and increase val_accuracy?
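Beyond tuning dropout, the usual Keras countermeasures for a growing train/validation gap are early stopping and learning-rate reduction; a sketch using real Keras callback classes, with illustrative monitor and patience values:

```python
# Sketch: standard overfitting countermeasures via Keras callbacks.
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

callbacks = [
    EarlyStopping(monitor="val_accuracy", patience=10,
                  restore_best_weights=True),     # stop when val_acc stalls
    ReduceLROnPlateau(monitor="val_loss", factor=0.1,
                      patience=5, min_lr=1e-6),   # shrink LR on a plateau
]
# model.fit(train_gen, validation_data=val_gen, callbacks=callbacks, ...)
```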
Default Model