Open EvgeniaAR opened 3 years ago
Sorry for the problems. Have you tried using stl instead of stl10 (as you discovered) for target?
Yu
thanks for the fast response :).
using "stl" instead of "stl10" doesn't work because the dataloader expects the argument to be "stl10" ;). I just changed the line that sets the number of classes to 9 so that it checks for "stl10" instead of "stl", and then it worked.
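For concreteness, here is a minimal sketch of the kind of change I mean. The function name and argument are illustrative, not the repo's actual identifiers:

```python
# Hypothetical sketch: the dataloader identifies the target set by the
# string "stl10", so the num_classes logic should key on that string too.
def get_num_classes(target: str) -> int:
    # CIFAR-10 and STL-10 share only 9 classes (STL-10 has "monkey"
    # where CIFAR-10 has "frog"), so CIFAR->STL adaptation should use
    # a 9-way head.
    if target == "stl10":
        return 9
    return 10
```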
Could you please comment on the mismatching values if running the code?
My question was meant to ask: what are the results after you have fixed the target argument? Or if those are the post-fix results, then what are the original ones?
oh sorry! the results I got were with the original code, i.e. with 10 classes. I tried fixing the bug and using 9 classes, but did not get better results this way :(.
I dug up a plot from more than two years ago when I did this paper. Does your plot look similar? The red and magenta lines are rotation only and quadrant only. Sorry I never spent enough time on the codebase since the paper was never published.
thanks for the effort :). I think you are using other hyperparameters here. When one just runs the code, one gets the following curves: cifar-stl-r.pdf cifar-stl-rqf.pdf
It seems that in my case, the learning rate decay doesn't have the same effect as in your case. It's a bit weird, and I definitely did not disable the scheduler :(.
Do you happen to know which hyperparameters the curve you posted above was obtained with?
It seems like the only difference is the width hyperparameter, which is 8 in the script that I think was run at the time. That was likely because I didn't have enough time to run it at full width, but maybe 8 is the magic number...
I ran the script with width=8 and both 9 and 10 classes and here are the results:
| dataset | num_classes | error |
|---|---|---|
| cifar-stl-r | 9 | 32.1% |
| cifar-stl-r | 10 | 30.3% |
| cifar-stl-rqf | 9 | 28.0% |
| cifar-stl-rqf | 10 | 27.4% |
The results are for choosing the epoch with the MMD+source error heuristic. I think the results with width=8 are actually worse than with width=16 :(.
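The epoch selection above can be sketched roughly as follows. This is my reading of the "MMD + source error" heuristic (pick the checkpoint minimizing the sum of the two quantities); the function and argument names are illustrative, not the repo's:

```python
# Hypothetical sketch of the MMD + source-error epoch-selection
# heuristic: score each epoch by (MMD between source/target features)
# + (source validation error), and pick the epoch with the lowest score.
def select_epoch(mmd_per_epoch, src_err_per_epoch):
    scores = [m + e for m, e in zip(mmd_per_epoch, src_err_per_epoch)]
    return min(range(len(scores)), key=scores.__getitem__)
```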
First, setting the number of classes to 9 is never triggered here, because the "target" set in the bash script is "stl10", not "stl". Therefore the number of classes in the model is still 10. This likely hurts performance and is not intended.
Further, I ran the provided scripts aiming to reproduce the numbers in the paper's table for CIFAR10->STL. However, they do not match:
I ran the scripts here as given.
Output from show_table.py:

cifar_stl source only accuracy: 68.28
output/cifar_stl_r/loss.pth best accuracy: 74.67
output/cifar_stl_r/loss.pth mmd select accuracy: 69.96
output/cifar_stl_rq/loss.pth best accuracy: 75.96
output/cifar_stl_rq/loss.pth mmd select accuracy: 72.17
output/cifar_stl_rqf/loss.pth best accuracy: 76.81
output/cifar_stl_rqf/loss.pth mmd select accuracy: 73.97
Expected performance from the paper's table: R: 81.2, RLF: 82.1.
This is a very large mismatch. We would like to use your model+method in our ICLR submission. However, we can only do so if we can reproduce the numbers from your paper :(. Am I doing something wrong? :(