vlfeat / matconvnet

MatConvNet: CNNs for MATLAB

Problem with validation error #335

Open Linda72 opened 8 years ago

Linda72 commented 8 years ago

Hi, I'm new to deep learning and CNNs. I intend to use MatConvNet for my class project on facial age estimation. I have a training set of 8000 face images and a validation set of 1000. For the first 15 epochs, both the training and the validation error decrease, but after that only the training error keeps decreasing, while the validation error alternately increases and decreases. I observed this up to the 40th epoch. Here are my layers: Conv, ReLU, Pool, Conv, ReLU, Pool, Conv, ReLU, Pool, Conv, ReLU, Dropout, Softmaxloss. I have set the learning rate to 0.001. I really do not know what I am doing wrong. I would appreciate any help. Thanks, Linda
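The pattern described here (training error still falling while validation error stalls and oscillates) can be checked numerically from the per-epoch error curves. The Python sketch below is only an illustration: the error values are made up, not taken from this thread, and the helper names `best_epoch`, `epochs_since_improvement` and `should_stop` are hypothetical. The `should_stop` rule is plain early stopping with a patience window, which nobody in the thread proposed; it is shown only as one common way to act on such a curve.

```python
def best_epoch(val_errors):
    """Return the 1-based epoch with the lowest validation error."""
    return min(range(len(val_errors)), key=lambda i: val_errors[i]) + 1


def epochs_since_improvement(val_errors):
    """Number of epochs that have passed since the best validation error."""
    return len(val_errors) - best_epoch(val_errors)


def should_stop(val_errors, patience):
    """Early-stopping rule: stop once `patience` epochs pass with no new best."""
    return epochs_since_improvement(val_errors) >= patience


# Made-up curves for illustration only (not measurements from this thread).
train_err = [0.90, 0.70, 0.55, 0.45, 0.38, 0.30, 0.24, 0.18]
val_err = [0.92, 0.75, 0.62, 0.58, 0.60, 0.57, 0.61, 0.59]

if __name__ == "__main__":
    print("best validation epoch:", best_epoch(val_err))
    print("epochs without improvement:", epochs_since_improvement(val_err))
    print("stop with patience 2?", should_stop(val_err, patience=2))
```

If the training error keeps dropping while the best validation epoch stays far in the past, the gap between the two curves is the symptom to look at, rather than the validation curve's small ups and downs.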

lenck commented 8 years ago

Hi, you are apparently over-fitting. This generally happens when you have too many parameters and too little training data, which is probably your case and is a general curse of deep learning. Finding a new architecture is, in general, a rather difficult task. What I would suggest is to start by fine-tuning a pre-trained architecture (e.g. in your case VGG deep face). If you want to learn your own architecture though, I found these two articles really useful: "Practical recommendations for gradient-based training of deep architectures" and "Stochastic Gradient Descent Tricks".
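To get a rough feel for "too many parameters and too little training data", one can count the learnable weights of a network and compare that number with the number of training images. The sketch below is in Python and is only an illustration: the thread gives the layer order but not the filter sizes or channel counts, so every shape in `layers` is an assumption, and `conv_params` is a hypothetical helper, not a MatConvNet function.

```python
def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    """Learnable parameters of one conv layer: filter weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * out_channels + out_channels


# ASSUMED shapes for a 4-conv-layer network; the original post does not give them.
# Each tuple is (kernel_h, kernel_w, in_channels, out_channels).
layers = [
    (5, 5, 3, 32),
    (5, 5, 32, 64),
    (3, 3, 64, 128),
    (4, 4, 128, 10),  # final conv acting as the classifier
]

num_training_images = 8000  # training-set size stated in the question

total_params = sum(conv_params(*shape) for shape in layers)
params_per_image = total_params / num_training_images

if __name__ == "__main__":
    print("total parameters:", total_params)
    print("parameters per training image:", round(params_per_image, 1))
```

ReLU, pooling and dropout layers add no learnable parameters, so only the convolutional layers are counted. Even with these modest assumed shapes, there are many more parameters than training images, which is the situation in which fine-tuning a pre-trained network is usually easier than training from scratch.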

vedaldi commented 8 years ago

Hi, since this is probably not an “issue” with the library, could this be moved to the new discussion forum?

Thanks!

On 13 Dec 2015, at 16:49, Karel Lenc notifications@github.com wrote:

Hi, you are apparently over-fitting... This generally happens when you have too many parameters and too little training data, which is probably your case and is generally curse of deep learning. And to find a new architecture is in general rather difficult task. In general, what I would suggest is to start with fine-tuning pre-trained architecture (eg. in your case VGG deep face). If you want to learn your own architecture though, I found these two articles really useful: Practical recommendations for gradient-based training of deep architectures http://arxiv.org/abs/1206.5533 Stochastic Gradient Descent Tricks http://research.microsoft.com/pubs/192769/tricks-2012.pdf