Open omerk1 opened 6 years ago
Hi!
You will probably need to adjust the consistency cost scaling hyperparameter. The CIFAR-10 example uses the MSE of the softmax outputs, which scales differently from cross-entropy with the number of classes (see the last appendix of the paper for some math). My guess is that 3000.0 would work instead of 100.0.
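To make the scaling point concrete, here is a toy sketch (an illustration only, not code from the repository). It assumes the MSE is averaged over classes, that the teacher predicts a uniform distribution, and that the student's logits are perturbed by alternating +eps and -eps. The helper names `softmax_mse` and `toy_mse` are made up for this example.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_mse(logits_a, logits_b):
    """Mean over classes of the squared difference between softmax outputs."""
    pa, pb = softmax(logits_a), softmax(logits_b)
    return sum((a - b) ** 2 for a, b in zip(pa, pb)) / len(pa)

def toy_mse(num_classes, eps=0.5):
    # Hypothetical setup: uniform teacher, student logits alternate +eps / -eps.
    teacher = [0.0] * num_classes
    student = [eps if i % 2 == 0 else -eps for i in range(num_classes)]
    return softmax_mse(teacher, student)

mse_10 = toy_mse(10)
mse_100 = toy_mse(100)
print(mse_10, mse_100, mse_10 / mse_100)
```

In this near-uniform toy case the same logit perturbation gives an MSE that shrinks as 1/K² in the number of classes K, so going from 10 to 100 classes makes the raw cost about 100 times smaller. Real predictions are more peaked, so the actual factor will differ, and the right coefficient still has to be found empirically.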
Another option is to use KL-divergence instead of MSE. Then the number of classes should not matter, and a cost scale between 1.0 and 10.0 should work.
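A minimal sketch of what a KL-based consistency cost could look like, in plain Python rather than TensorFlow. The function name `kl_consistency`, the direction KL(teacher || student), and the `tiny` constant are assumptions made for this example, not the repository's implementation.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_consistency(teacher_logits, student_logits, tiny=1e-12):
    """KL(teacher || student) between the two softmax distributions."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    # `tiny` guards against log(0) when a probability underflows.
    return sum(pi * (math.log(pi + tiny) - math.log(qi + tiny))
               for pi, qi in zip(p, q))

def toy_kl(num_classes, eps=0.5):
    # Hypothetical setup: uniform teacher, student logits alternate +eps / -eps.
    teacher = [0.0] * num_classes
    student = [eps if i % 2 == 0 else -eps for i in range(num_classes)]
    return kl_consistency(teacher, student)

kl_10 = toy_kl(10)
kl_100 = toy_kl(100)
print(kl_10, kl_100)
```

With a uniform teacher and this alternating perturbation, the KL value works out to log(cosh(eps)) regardless of the number of classes, which is the property that makes a single cost scale plausible across datasets.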
Thanks for your quick response!
I ran the code with python train_cifar100.py (my modified version of train_cifar100.py), with your suggested change max_consistency_cost = 3000.0 (instead of 100.0).
The last two output lines were:
INFO:main:step 40000: train/error/1: 0.0%, train/class_cost/1: 0.004393, train/cons_cost/mt: 0.055429
INFO:main:step 40000: eval/error/ema: 53.1%, eval/error/1: 57.3% eval/class_cost/1: 3.933695
Do you think it looks reasonable? Thank you again!
Well, at least it does not explode anymore... You may get better results by trying different hyperparameter values. The 3000.0 was just an educated guess of where the right value may be. Maybe also try 1000.0 and 10000.0, and possibly intermediate values, and go from there.
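One way to pick the intermediate values is a log-spaced grid between the two endpoints. This is only a sketch: the helper `log_spaced` is made up here, and each value would still have to be set as max_consistency_cost by hand and trained separately.

```python
def log_spaced(low, high, count):
    """Return `count` values from low to high, evenly spaced on a log scale."""
    if count < 2:
        raise ValueError("count must be at least 2")
    ratio = (high / low) ** (1.0 / (count - 1))
    return [low * ratio ** i for i in range(count)]

# Candidate values for max_consistency_cost between 1000.0 and 10000.0.
candidates = [round(v, 1) for v in log_spaced(1000.0, 10000.0, 5)]
for value in candidates:
    print("max_consistency_cost =", value)
```

A log scale is used because the plausible range spans an order of magnitude, so equal ratios between neighbouring values cover it more evenly than equal steps would.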
I think the usual setting for semi-supervised CIFAR-100 is 10000 labels (100 from each of the 100 classes). I don’t know what the state of the art is at the moment, but the Pi model paper quotes an error rate of 44.6% when using no unlabeled images and 38.7% when using 10000 labels and the rest of the images without labels. See https://arxiv.org/pdf/1610.02242.pdf
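For reference, a class-balanced labeled subset like that could be drawn roughly as follows. This is a sketch with a made-up helper (`balanced_label_indices`) and a synthetic label list standing in for the real CIFAR-100 training labels (50000 images, 500 per class). It is not how the repository's data loader selects labels.

```python
import random
from collections import defaultdict

def balanced_label_indices(labels, per_class, seed=0):
    """Pick `per_class` example indices from every class, without replacement."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        if len(indices) < per_class:
            raise ValueError("class %r has only %d examples" % (label, len(indices)))
        chosen.extend(rng.sample(indices, per_class))
    return sorted(chosen)

# Stand-in for the CIFAR-100 training labels: 100 classes x 500 images each.
fake_labels = [i % 100 for i in range(50000)]
labeled = balanced_label_indices(fake_labels, per_class=100)
print(len(labeled))
```

The remaining indices would then be treated as unlabeled examples during training.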
Hi, I ran the code on CIFAR-10 and it worked great! As this method is the state-of-the-art semi-supervised method for image recognition, I was curious to check its performance on CIFAR-100. I changed the parts of the code that load CIFAR-10 to load CIFAR-100 instead (TensorFlow version). The code runs and prints training progress, but I got 98-100% error, so I guess I am doing something wrong. An example of one of the output lines:
Do you have any idea what else I should do to apply it to CIFAR-100?