
Comparison of fully supervised models with MixMatch. #32

Closed. Shubhammawa closed this issue 4 years ago.

Shubhammawa commented 4 years ago

In the experiments reported in the paper, MixMatch is trained with a varying number of labelled examples (250 to 4,000), and we see that the error rate is very close to that of a fully supervised model trained on the complete dataset (50,000 labelled examples).

However, there is no mention of the error rates of fully supervised models trained with fewer labelled examples, i.e., a comparison between, for example, MixMatch (trained with 250 labelled examples plus the rest as unlabelled examples) and a fully supervised model trained on only those 250 labelled examples. Such a comparison would help determine whether the unlabelled data is actually adding any information.

From a practical standpoint, if a fully supervised model trained on only 250 labelled examples achieves an error rate almost equal to MixMatch's, we could simply use the fully supervised model.
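For concreteness, here is a minimal sketch of such a baseline (illustrative, not code from this repo): train a plain supervised classifier on a class-balanced 250-example subset of CIFAR-10 and measure its test error. The small CNN, optimizer, and training schedule are assumptions made only to keep the sketch short; a fair comparison would reuse the same architecture and training setup as MixMatch.

```python
# Sketch: fully supervised baseline on only 250 labelled CIFAR-10 examples.
# Model size and schedule are illustrative assumptions, not the paper's setup.
import numpy as np
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Draw a class-balanced labelled subset: 25 examples per class = 250 total.
rng = np.random.default_rng(0)
idx = np.concatenate([
    rng.choice(np.where(y_train.ravel() == c)[0], 25, replace=False)
    for c in range(10)
])
x_small, y_small = x_train[idx] / 255.0, y_train[idx]

# Small CNN stand-in; the real comparison would use MixMatch's architecture.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(32, 32, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10),
])
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])
model.fit(x_small, y_small, epochs=50, batch_size=32, verbose=0)

# Test error of the labels-only baseline, for comparison against MixMatch.
_, test_acc = model.evaluate(x_test / 255.0, y_test, verbose=0)
print(f'Fully supervised on 250 labels: {100 * (1 - test_acc):.2f}% test error')
```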

I would highly appreciate it if such a comparison, if it has been done, could be made available. Thanks!

carlini commented 4 years ago

You're definitely right that comparing to the fully-supervised baseline is important! This is one of the points of Oliver et al. ("Realistic Evaluation of Deep Semi-Supervised Learning Algorithms"), and the reason we don't report this result in our paper is that they already tried to optimize fully-supervised accuracy with few labeled examples. The best they were able to achieve with 4,000 labeled examples (using a large Shake-Shake model, a much better model than our ResNet) was 13.4% error, compared to MixMatch's 6.24% error with 4,000 labels and 11.08% error with just 250 labeled examples.
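For reference, the mechanism by which MixMatch draws information from unlabeled data is label guessing: average the classifier's predictions over K augmentations of each unlabeled image, then sharpen the average with a temperature T (the paper's defaults are K=2, T=0.5), before the guesses enter MixUp and the unlabeled loss term. A minimal NumPy sketch of the sharpening step, with hypothetical prediction values:

```python
# Minimal sketch of MixMatch's label guessing + sharpening (Eq. 7 in the paper).
import numpy as np

def sharpen(p, T=0.5):
    """Lower the entropy of a categorical distribution p via temperature T."""
    p = p ** (1.0 / T)
    return p / p.sum(axis=-1, keepdims=True)

# Hypothetical classifier outputs for K=2 augmentations of one unlabeled image.
preds = np.array([[0.2, 0.5, 0.3],
                  [0.3, 0.4, 0.3]])

# Average over augmentations, then sharpen to get the guessed label.
guess = sharpen(preds.mean(axis=0))
print(guess)  # peaked toward class 1: the training signal from unlabeled data
```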

Shubhammawa commented 4 years ago

Understood. Thanks a lot!