Hello,
batch_fs_size doesn't change the test performance. Its only purpose is to fit the data on the GPU: there is no gradient update during the few-shot evaluation, and the runs are evaluated entirely in parallel. The larger its value, the faster the testing will be, as long as your GPU can fit the data. The same applies when you set dataset to 'tieredimagenet' or 'cubfs'.
I hope this answers your concerns. Feel free to ask if something isn't clear.
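For illustration, here is a minimal sketch (my own, not the repo's code; evaluate_run is a hypothetical placeholder) of why chunking the runs only trades memory for speed: the runs are independent, so the chunk size cannot change the averaged result.

```python
# Sketch: few-shot test runs are independent, so evaluating them in chunks
# of batch_fs_size only controls GPU memory usage, never the accuracy.
def evaluate_all_runs(runs, evaluate_run, batch_fs_size=20):
    accuracies = []
    for start in range(0, len(runs), batch_fs_size):
        chunk = runs[start:start + batch_fs_size]  # sized to fit on the GPU
        accuracies.extend(evaluate_run(run) for run in chunk)  # no gradients, no interaction
    return sum(accuracies) / len(accuracies)
```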
Best,
Thanks for your reply. I have some other questions.
I tried to replace the SGD optimizer with the Adam optimizer, but this caused almost no change in performance. Are my experimental results consistent with yours?
In your paper, you reported the results of S2M2R with a WRN-28-10 backbone. If I want to use WRN-28-10 as the backbone, should I use "wideresnet" or "s2m2r"? Do I need to modify other arguments, such as changing "feature_maps" to 16? Also, what is the difference between "wideresnet" and "s2m2r"?
Best,
Hello,
"s2m2r"
uses the same classifier used in the s2m2r paper which is a modified version of the standard logit classifier. So if you want to just a WRN-28-10backbone instead of a ResNet12 with the usual logit classifier, you should use "wideresnet"
as an option.
If your purpose is to reproduce s2m2r paper's results, you need to use the same hyperparmeters as theirs which I don't remember exactly, use adam with 400 epochs without mixup then use mixup after that for >400 epochs.
Lastly, you should change "features_maps"
to 16. Let me know if you manage to make it work or if you have any questions.
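For concreteness, a rough sketch of that two-phase schedule (my illustration only: train_step is a hypothetical helper, and the learning rate is a guess, not the S2M2R value):

```python
import torch

# Assumed two-phase schedule (sketch, not the repo's code): Adam without
# mixup for the first 400 epochs, then mixup enabled for the remaining
# (>400) epochs. `train_step` is a hypothetical helper.
def train_s2m2r_style(model, train_loader, train_step, total_epochs=800):
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr is an assumption
    for epoch in range(total_epochs):
        use_mixup = epoch >= 400  # switch mixup on after epoch 400
        for inputs, targets in train_loader:
            optimizer.zero_grad()
            loss = train_step(model, inputs, targets, mixup=use_mixup)
            loss.backward()
            optimizer.step()
```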
Best,
Hello,
I made a mistake in my previous statement about the Adam optimizer.
At that time the training was not yet finished, and based on the model's early performance I misjudged that there was not much difference between the two optimizers. Today, after training finished, I found that directly replacing SGD with the Adam optimizer leads to serious performance degradation.
According to my re-implementation, the model with SGD reaches 67% on 1-shot and 83.7% on 5-shot, which is similar to the results in your paper, but the model with Adam ("--lr -0.1") reaches 52% on 1-shot and 69% on 5-shot. Maybe replacing SGD directly with Adam is not an appropriate choice.
I don't have any more questions at the moment, thank you very much for your support.
Best,
When running the experiment, how do I enter the 3 feature file paths (--test-features "[/minifeatures1.pt11, /minifeatures2.pt11, /minifeatures3.pt11]")? When I input --test-features "[~/lishuai/hhh/minifeatures1.pt11, ~/lishuai/hhh/minifeatures2.pt11, /lishuai/hhh/minifeatures3.pt11]" it always outputs FileNotFoundError: [Errno 2] No such file or directory: '[/lishuai/hhh/minifeatures1.pt11, ~/lishuai/hhh/minifeatures2.pt11, ~/lishuai/hhh/minifeatures3.pt11]'. What's wrong? How do I resolve this?
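(An aside on that error, for what it's worth: it shows the whole bracketed string reaching the code as one literal path, and the shell never expands ~ inside quotes. A hypothetical helper, not from the repo, that would split and expand such an argument:)

```python
import os

# Hypothetical parsing helper (not the repo's code): split the bracketed
# argument into individual paths and expand '~' manually, since the shell
# does not expand it inside a quoted string.
def parse_feature_paths(arg):
    paths = arg.strip("[]").split(",")
    return [os.path.expanduser(p.strip()) for p in paths]

print(parse_feature_paths("[~/a/f1.pt, ~/a/f2.pt, ~/a/f3.pt]"))
```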
Hello, thank you very much for your work. I have some questions.
I noticed that "batch_fs_size" was added to args.py with its default value set to 20. Does this argument affect the test performance? From my understanding, it seems to affect only the GPU memory usage and speed during testing.
I also tried modifying the code around "if mixup and args.mm: ..." (https://github.com/ybendou/easy/blob/85937e0d2d67a801dba7a96974a79c2d6cad86b7/main.py#L84-L96), but this resulted in a slight performance degradation. Is this because there is a potential conflict between manifold mixup and rotation?
Sorry for my poor English; I hope I expressed my question clearly.
Best,
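(For reference on the mixup half of that question, a generic mixup-style step looks roughly like the following; this is a textbook sketch with placeholder names, not the code behind the link above:)

```python
import torch
import torch.nn.functional as F

# Generic (input-space) mixup step for illustration: mix two examples and
# take the same convex combination of their losses. Manifold mixup does
# the mixing at a hidden layer instead of the input.
def mixup_step(model, x, y, alpha=2.0):
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    index = torch.randperm(x.size(0))
    mixed_x = lam * x + (1 - lam) * x[index]
    logits = model(mixed_x)
    return lam * F.cross_entropy(logits, y) + (1 - lam) * F.cross_entropy(logits, y[index])
```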