one23sunnyQQ opened this issue 3 years ago
Hi, I am sorry for the inconvenience. I just realized that while debugging the code before the release, I set the number of test examples to 5 (instead of 600) and forgot to set it back to 600 before pushing the code. This caused the fluctuation in the final performance.
This is now fixed; please update the code and run it again.
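For context on why the number of test episodes matters so much: the reported score is a mean over test episodes, and the width of its confidence interval shrinks roughly with the square root of the episode count. A minimal sketch (with made-up per-episode accuracies, not numbers from this repo):

```python
import random
import statistics

random.seed(0)

def mean_ci95(accs):
    """Mean accuracy and 95% confidence half-width over test episodes."""
    m = statistics.mean(accs)
    # standard error of the mean; 95% CI half-width is ~1.96 * SEM
    sem = statistics.stdev(accs) / len(accs) ** 0.5
    return m, 1.96 * sem

# simulate per-episode accuracies with a large (~10-point) spread
episodes = [random.gauss(85.0, 10.0) for _ in range(600)]

m5, ci5 = mean_ci95(episodes[:5])    # only 5 episodes: wide interval
m600, ci600 = mean_ci95(episodes)    # 600 episodes: much tighter

print(f"  5 episodes: {m5:.2f} +- {ci5:.2f}")
print(f"600 episodes: {m600:.2f} +- {ci600:.2f}")
```

With only 5 episodes the interval spans several points, which is consistent with run-to-run swings of up to 10 points; at 600 episodes it drops to well under 1 point.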
Hi, thank you for sharing your code. Why is it that, using the pre-trained model you provided and without any changes to the code, the test results vary so greatly, with fluctuations of up to 10 points? How can the test results reported in your paper be taken as final when performance fluctuates this much? Looking forward to your reply.
Hi, I have the same problem as you, but I ran test.py with the author's latest code. The gap was still much larger than the officially reported numbers. Is there any update on your question? Did you solve it? Thanks in advance! ^^
Yeah I seem to have gotten similar results
```
100%|█████████████████████████████████████████████████████████████████████████████████████████| 600/600 [10:13<00:00, 1.02s/it]
```

| model \ data | SUR |
|---|---|
| ilsvrc_2012 | 47.50 +- 0.00 |
| omniglot | 88.79 +- 0.00 |
| aircraft | 88.75 +- 0.00 |
| cu_birds | 85.56 +- 0.00 |
| dtd | 76.67 +- 0.00 |
| quickdraw | 87.27 +- 0.00 |
| fungi | 87.50 +- 0.00 |
| vgg_flower | 88.00 +- 0.00 |
| traffic_sign | 50.87 +- 0.00 |
| mscoco | 59.23 +- 0.00 |
Any help debugging this would be greatly appreciated, thanks!