vlfeat / matconvnet

MatConvNet: CNNs for MATLAB

Validation error lower than training error on CIFAR100? #249

Closed okvol closed 8 years ago

okvol commented 9 years ago

Very occasionally, when testing some of my new architectures (without changing cnn_train.m or vl_simplenn.m), I get a validation error that is significantly lower than the training error: something like 3-5 percent lower on CIFAR-100, consistently throughout training, as if the training and validation curves were "swapped". The validation objective value is also lower than the training one. Is there a bug in cnn_train.m that could make this happen? I mean, is it possible for it to somehow swap the validation and training errors?

In general, I can't imagine how the validation error could be lower than the training error on such a large dataset. Or am I missing something?

This phenomenon is architecture dependent, which makes it more mysterious: it cannot be reproduced with simple architectures (like the CIFAR example CNN) using the same imdb.

Has anybody else seen this phenomenon?

lenck commented 9 years ago

Hmm, first I would check that the validation data are not leaking into the training data, which is always a scary thing to happen... But in general, the validation error can be lower when the validation set is unbalanced and contains simpler examples than the training set. Tbh I do not have much experience with CIFAR-100...

okvol commented 9 years ago

@lenck I haven't modified cnn_train.m or vl_simplenn.m. I don't think the validation data could leak into the training data, unless there is already a bug in these two files...

The purpose of this post is to remind people to keep an eye on the evaluation/plotting code in cnn_train.m; perhaps there is a subtle bug that is triggered by some inputs but not others...

peiyunh commented 9 years ago

This can happen in the early stages of training, especially if you are using the provided training code. If we compare both errors at epoch N, the validation error reflects the performance of the model at the end of epoch N, while the training error is the average over all iterations from epoch N-1 to epoch N. Let me know if this answers your question.
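
For concreteness, here is a minimal sketch of that bookkeeping (the helper functions processBatch and evalBatch, and all variable names, are hypothetical illustrations, not the actual cnn_train.m code):

```matlab
% Sketch of a cnn_train.m-style epoch loop (hypothetical helpers).
for epoch = 1:numEpochs
  trainErrSum = 0 ;
  for b = 1:numel(trainBatches)
    % the model is updated on every mini-batch, ...
    [net, errB] = processBatch(net, trainBatches{b}) ;
    % ... so errB is the error of an *intermediate* model
    trainErrSum = trainErrSum + errB ;
  end
  % the reported training error averages over the whole epoch
  trainErr(epoch) = trainErrSum / numel(trainBatches) ;

  % the validation error instead uses the frozen end-of-epoch model
  valErrSum = 0 ;
  for b = 1:numel(valBatches)
    valErrSum = valErrSum + evalBatch(net, valBatches{b}) ;
  end
  valErr(epoch) = valErrSum / numel(valBatches) ;
end
```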

vedaldi commented 9 years ago

Hi, yes, this is the most likely explanation. Unfortunately computing the “true” training error would be way too expensive as it would require scanning the training data again.

It may be possible to make this a little better by implementing a moving average to estimate the training error. At present, consider the computed training error to lie somewhere in between epochs N-1 and N.
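
For illustration, such a moving average might look like this (a sketch only; alpha and the processBatch helper are assumptions, not part of MatConvNet):

```matlab
% Exponentially weighted moving average of the per-batch training error.
alpha = 0.9 ;    % smoothing factor (invented value)
emaErr = NaN ;
for b = 1:numel(trainBatches)
  [net, errB] = processBatch(net, trainBatches{b}) ;
  if isnan(emaErr)
    emaErr = errB ;                                % initialise on the first batch
  else
    emaErr = alpha * emaErr + (1 - alpha) * errB ; % weight recent batches more
  end
end
% emaErr tracks the error of models close to the end-of-epoch model,
% rather than averaging uniformly over the whole epoch.
```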


iiwindii commented 8 years ago

@eric-phu Hi, I have checked the cnn_train code. However, comparing both errors at epoch N, I find that both the training error and the validation error reflect the performance of epoch N... I have not seen the difference in the code. Can you point out where the difference is in the cnn_train function?

vedaldi commented 8 years ago

Hi, my comment simply meant that the training error is computed as the model is updated during an epoch, whereas the validation error is computed on a “frozen” model at the end of an epoch. This is implicit in how the code works.

Let M_{N-1} be the model at epoch N-1 and M_N the model at epoch N. Let M_{N-1} -> M' -> M'' -> M''' -> … -> M_N be the sequence of intermediate models computed for each mini-batch processed during training. Then:

Training error at epoch N = average of the training errors of M', M'', M''', … on the training-set mini-batches
Validation error at epoch N = validation error of model M_N on the validation set

Hence the estimated training error you get is somewhere in between the training error of model M_{N-1} and that of model M_N, whereas the validation error is that of model M_N. Since M_N should be better than M_{N-1}, the validation error might be smaller than this estimate (but it won't be when N is large, as the models change little from epoch to epoch and overfitting dominates).

Note that you could compute the “proper” training error of M_N after each epoch, but that would be expensive (it would require freezing M_N and passing over all the training data again) and is not worth it in practice.
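
A small worked example makes the gap concrete (all numbers are invented for illustration):

```matlab
% Suppose the per-batch training error decays linearly from 40% to 30%
% within one epoch.
batchErr = linspace(0.40, 0.30, 100) ;
reportedTrainErr = mean(batchErr)   % 0.35: what the training curve shows
finalModelErr    = batchErr(end)    % 0.30: roughly the error of M_N itself
% A validation error of, say, 0.33 for M_N would then sit below the
% reported 0.35 training error, even though M_N fits the training set
% better than the reported average suggests.
```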


iiwindii commented 8 years ago

I understand this better now, thanks!

debvratV commented 5 years ago


Hi Vedaldi,

Thanks for the explanation. If the training error for epoch N is the average of all the intermediate training errors, then shouldn't the averaging make the training error lower than the validation error? Also, by the validation error for model M_N, do you mean that the error is computed over the entire validation set, and not in batches?

Thanks,