CuriousAI / mean-teacher

A state-of-the-art semi-supervised method for image recognition
https://arxiv.org/abs/1703.01780

Applying the code to cifar-100 #12


omerk1 commented 6 years ago

Hi, I ran the code on CIFAR-10 and it worked great! As this method is the state-of-the-art semi-supervised method for image recognition, I was curious to check its performance on CIFAR-100. I changed the parts of the code that load CIFAR-10 to load CIFAR-100 instead (TensorFlow version). The code runs and prints training progress, but I get 98–100% error, so I guess I am doing something wrong. An example output line:

INFO:main:step 60: train/error/1: 100.0%, train/class_cost/1: nan, train/cons_cost/mt: nan

Do you have an idea what else I should do to apply it to CIFAR-100?

tarvaina commented 6 years ago

Hi!

You will probably need to adjust the consistency cost scaling hyperparameter. The CIFAR-10 example uses MSE of the softmax outputs, which scales differently from cross-entropy as the number of classes grows (see the last appendix of the paper for the math). My guess is that 3000.0 would work instead of 100.0.
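To see why the weight needs rescaling, here is a quick numerical sketch (not from the repo): the average summed MSE between two softmax outputs over random logits shrinks as the number of classes grows, so the consistency weight has to grow to keep the cost at a comparable magnitude.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mean_mse(n_classes, trials=10000):
    """Average summed squared error between two independent softmax outputs."""
    a = softmax(rng.normal(size=(trials, n_classes)))
    b = softmax(rng.normal(size=(trials, n_classes)))
    return np.mean(np.sum((a - b) ** 2, axis=-1))

# The 100-class MSE is substantially smaller than the 10-class MSE,
# which is why the consistency weight used for CIFAR-10 is too small
# for CIFAR-100.
print(mean_mse(10), mean_mse(100))
```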

Another option is to use KL-divergence instead of MSE. Then the number of classes should not matter, and a cost scale between 1.0 and 10.0 should work.
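A minimal numpy sketch of that alternative (the repo's actual implementation may differ): the consistency cost as the mean KL divergence from the teacher's predicted distribution to the student's. Since each distribution is normalized per example, the cost does not shrink with the class count the way summed MSE does.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def kl_consistency(student_logits, teacher_logits, eps=1e-8):
    """Mean KL(teacher || student) over a batch, used as the consistency cost."""
    p = softmax(teacher_logits)  # teacher predictions act as the target
    q = softmax(student_logits)
    return np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1))
```

As a sanity check, identical logits give a (near-)zero cost, and any disagreement gives a positive one, regardless of whether there are 10 or 100 classes.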

omerk1 commented 6 years ago

Thanks for your quick response! I ran the code with python train_cifar100.py (my modified version of train_cifar100.py), with your suggested change max_consistency_cost = 3000.0 (instead of 100.0).

The two last output lines were:

INFO:main:step   40000:   train/error/1: 0.0%,  train/class_cost/1:        0.004393,  train/cons_cost/mt:        0.055429
INFO:main:step   40000:   eval/error/ema: 53.1%,  eval/error/1: 57.3% eval/class_cost/1: 3.933695

Do you think it looks reasonable? Thank you again!

tarvaina commented 6 years ago

Well, at least it does not explode anymore... You may get better results by trying different hyperparam values. The 3000.0 was just an educated guess of where the right value may be. Maybe try also 1000.0 and 10000.0 and possibly intermediate values, and go from there.
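One way to pick those intermediate values is a log-spaced sweep, since the right scale is only known to within an order of magnitude. A quick sketch:

```python
import numpy as np

# Candidate consistency weights, log-spaced between the two suggested endpoints.
candidates = np.geomspace(1000.0, 10000.0, num=5)
print(candidates.round(1))  # roughly 1000, 1778, 3162, 5623, 10000
```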

I think the usual setting for semi-supervised CIFAR-100 is 10000 labels (100 from each of the 100 classes). I don’t know what’s state of the art at the moment, but the Pi model paper quotes an error rate of 44.6% when using no unlabeled images and 38.7% when using 10000 labels and the rest of the images without labels. See https://arxiv.org/pdf/1610.02242.pdf