GilLevi / AgeGenderDeepLearning


Unbalanced folds #37

Closed rauldiaz closed 5 years ago

rauldiaz commented 5 years ago

Hi,

I know that #24 is a closed issue, but I am still curious about how these folds are set up. My results are as follows:

Fold 0 as test set: 62%
Fold 1 as test set: 49%
Fold 2 as test set: 58%
Fold 3 as test set: 48%
Fold 4 as test set: 56%

My biggest concern is not just the variation across the test sets (especially folds 1 and 3), but also the gap with the validation accuracy of each fold. In my experiments, each fold easily exceeds 80% accuracy on its validation set. It feels as if the unseen data in the test sets is wildly different from the validation sets, which makes it hard to tune the model.

Do you have any intuition on why this happens?

Thanks, Raul

GilLevi commented 5 years ago

Hi Raul,

Thank you for your interest in our work.

I can offer one intuition as to why this might happen: at the core, the problem is that the distribution of the test data is different from the distribution of the train and validation data. This happens because the distribution of each fold differs from that of the other folds (which is also confirmed by your test results).

I believe this is because each fold contains different subjects, which introduces far more variation between the distributions of the folds than you would get if different images of the same subject were spread across the folds. Take, for example, an "ImageNet-like" problem of classifying dog breeds: if you have many data samples, you can easily split them into folds with roughly the same distribution, since each image in one fold has a similar image in the other folds. In our case, to prevent overfitting, we do not allow a subject to appear in more than one fold, so by definition (and since the dataset is small) it is less likely that an image in one fold has a corresponding image in another fold.
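As an aside, the subject-exclusive split described above can be sketched with scikit-learn's `GroupKFold`, using each subject ID as the group key. The file names and subject IDs below are made up for illustration; the actual Adience fold files in this repo are pre-generated, not produced by this snippet.

```python
# Sketch of subject-exclusive cross-validation folds: no subject's images
# ever appear in both the train and test split of the same fold.
from sklearn.model_selection import GroupKFold

# Hypothetical data: 8 images belonging to 4 subjects (2 images each).
images = [f"img_{i}.jpg" for i in range(8)]
subjects = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"]

gkf = GroupKFold(n_splits=4)
for fold, (train_idx, test_idx) in enumerate(gkf.split(images, groups=subjects)):
    train_subjects = {subjects[i] for i in train_idx}
    test_subjects = {subjects[i] for i in test_idx}
    # GroupKFold guarantees the subject sets are disjoint per fold.
    assert train_subjects.isdisjoint(test_subjects)
    print(f"fold {fold}: test subjects = {sorted(test_subjects)}")
```

Because every image of a held-out subject leaves the training set at once, each fold's distribution can drift from the others, which is exactly the effect discussed in this issue.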

This might be a bit annoying when tuning the model, but it allows for better generalization.

rauldiaz commented 5 years ago

Hi @GilLevi ,

Thanks for explaining!