Slight performance difference when training from scratch

shwangtangjun commented 3 years ago

I cannot reproduce the paper's results when I remove the --evaluate option and train from scratch. Below are my results for the ResNet18 backbone on miniImageNet; they are about 2% lower than reported.

Meta Test: LAST
feature       UN                L2N               CL2N
GVP_1Shot     0.6461(0.0020)    0.6531(0.0020)    0.7033(0.0019)
GVP_5Shot     0.7381(0.0020)    0.7313(0.0020)    0.8029(0.0015)

Meta Test: BEST
feature       UN                L2N               CL2N
GVP_1Shot     0.6469(0.0020)    0.6547(0.0020)    0.7036(0.0019)
GVP_5Shot     0.7367(0.0020)    0.7305(0.0020)    0.8018(0.0015)

Full log file in https://gist.github.com/shwangtangjun/7651009fac72290102732e5650e587c4

If I download your pretrained network and evaluate it, the results are identical to those in your paper, so the problem must lie in the training process on the base classes. I have not changed any parameter except removing --evaluate. Could you recheck your .config file, or provide the log file from your training run?
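For readers unfamiliar with the column names in the results above: UN, L2N and CL2N are the feature transforms from the SimpleShot paper (unnormalized, L2-normalized, and centered then L2-normalized). The following is a minimal sketch of the three transforms in plain Python, not the repository's implementation; the function names and the `base_mean` values are hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a feature vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def transform(vec, mode, base_mean=None):
    """Apply one of the three feature transforms.

    UN   : leave the feature unnormalized
    L2N  : L2-normalize the feature
    CL2N : subtract the mean base-class feature, then L2-normalize
    """
    if mode == "UN":
        return list(vec)
    if mode == "L2N":
        return l2_normalize(vec)
    if mode == "CL2N":
        centered = [x - m for x, m in zip(vec, base_mean)]
        return l2_normalize(centered)
    raise ValueError(f"unknown mode: {mode}")

feature = [3.0, 4.0]
base_mean = [1.0, 1.0]  # hypothetical mean of the base-class features
for mode in ("UN", "L2N", "CL2N"):
    print(mode, transform(feature, mode, base_mean))
```

In every result reported in this thread, CL2N gives the highest 1-shot accuracy of the three.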

imtiazziko commented 3 years ago

Hi @shwangtangjun, I am not sure what causes the difference. However, checking my logs, I see that during training the seed was not fixed to 1; it was seed=None. So I did not actually fix the seed during training. I am not sure whether that is the only issue.
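To illustrate why seed=None matters for reproducibility, here is a small sketch using only Python's standard `random` module. It is a stand-in for data shuffling, not the repository's training code; `sample_order` is a hypothetical helper. A real PyTorch run would typically also seed NumPy and torch, which is not shown here.

```python
import random

def sample_order(seed, n=5):
    """Return a shuffled list of indices, as a stand-in for data shuffling.

    seed=None lets Python seed the generator from OS entropy, so each
    call may give a different order; an integer seed makes it repeatable.
    """
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    return order

# Fixed seed: the two orders are identical.
print(sample_order(1) == sample_order(1))
# Unseeded: the two orders usually differ from run to run.
print(sample_order(None), sample_order(None))
```

With an unfixed seed, two training runs with identical options can still end at slightly different accuracies, so a small gap is expected on its own.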

shwangtangjun commented 3 years ago

Thanks for your reply. I'll try a different seed.

imtiazziko commented 3 years ago

Hi @shwangtangjun, I just checked my training logs and I see that jitter was always set to False in my training, and that makes the difference. I am really sorry for this mistake in the implementation details of the paper: I did not actually use jitter in my training. I have updated the 'jitter' option to False, so I hope you can now reproduce the results by training from scratch.

I just updated it here. Thanks a lot for reporting it.

shwangtangjun commented 3 years ago

Thanks for your reply. Unfortunately, after changing seed to None and jitter to False, the result is still about 2% lower.

LAST
feature       UN                L2N               CL2N
GVP_1Shot     0.6433(0.0020)    0.6531(0.0020)    0.7008(0.0020)
GVP_5Shot     0.7397(0.0020)    0.7305(0.0020)    0.8100(0.0015)

BEST
feature       UN                L2N               CL2N
GVP_1Shot     0.6443(0.0019)    0.6598(0.0019)    0.6886(0.0019)
GVP_5Shot     0.7421(0.0018)    0.7490(0.0017)    0.7957(0.0015)

log file: https://gist.github.com/shwangtangjun/b31e0e01afbc378045d2274f65762c1f

However, I notice that increasing the number of training epochs can improve the performance. If I set epoch to 150 or 180, the final performance can be similar to that in the paper.
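A note on reading the numbers above: each entry has the form accuracy(value). The sketch below assumes, without confirmation from this thread, that the parenthesized value is the half-width of a 95% confidence interval over test tasks, computed with a normal approximation (factor 1.96) and the population standard deviation. The per-task accuracies are made up for illustration and `mean_and_ci95` is a hypothetical helper.

```python
import math
import statistics

def mean_and_ci95(accuracies):
    """Return the mean accuracy and the half-width of a 95% interval.

    Assumption: normal approximation, half-width = 1.96 * std / sqrt(n),
    with the population standard deviation.
    """
    mean = statistics.fmean(accuracies)
    std = statistics.pstdev(accuracies)
    half_width = 1.96 * std / math.sqrt(len(accuracies))
    return mean, half_width

# Made-up per-task accuracies, for illustration only.
task_acc = [0.60, 0.72, 0.68, 0.80, 0.70]
mean, half_width = mean_and_ci95(task_acc)
print(f"{mean:.4f}({half_width:.4f})")
```

Under that assumption, a gap of about 2% between two runs is far larger than the reported 0.0015 to 0.0020 half-widths, so it would not be explained by the sampling of test tasks alone.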

imtiazziko commented 3 years ago

Hi @shwangtangjun,

OK, here is what I have found. Apparently, for ResNet18 on miniImageNet I used the model trained according to the SimpleShot paper (no label smoothing and no jitter), while for the others I added label smoothing, and that makes the difference. Here are the logs for ResNet18 on miniImageNet:

Without label smoothing: https://gist.github.com/imtiazziko/da86dc1796780786032504853dedfd29
With jitter and label smoothing: https://gist.github.com/imtiazziko/fb60e4242174bcf5a0961b739c88bda0

I checked, and the result reported in the paper for ResNet18 on miniImageNet is actually from the model trained the SimpleShot way. For the others, however, I did use label smoothing.
The boost from LaplacianShot is not affected by how the model is trained, as it is applied only during inference. So all good, and sorry for the inconvenience regarding ResNet18 on miniImageNet.
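For context on what the label smoothing option changes, here is a minimal sketch of cross-entropy against a smoothed target in plain Python. It assumes the common variant that puts (1 - smoothing) on the true class and spreads `smoothing` uniformly over all classes; the repository may use a different variant, and `smoothed_cross_entropy` is a hypothetical name.

```python
import math

def smoothed_cross_entropy(logits, target, smoothing=0.0):
    """Cross-entropy against a label-smoothed target distribution.

    Assumption: the target puts (1 - smoothing) on the true class plus
    smoothing / K on every class. smoothing=0.0 recovers ordinary
    cross-entropy.
    """
    # Numerically stable log-softmax.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]

    k = len(logits)
    loss = 0.0
    for i, lp in enumerate(log_probs):
        q = smoothing / k + ((1.0 - smoothing) if i == target else 0.0)
        loss -= q * lp
    return loss

logits = [2.0, 0.5, -1.0]
print(smoothed_cross_entropy(logits, target=0, smoothing=0.0))
print(smoothed_cross_entropy(logits, target=0, smoothing=0.1))
```

Smoothing only changes the loss used to train the backbone on the base classes, which is consistent with the point above that the inference-time method is independent of it.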

shwangtangjun commented 3 years ago

Hi @imtiazziko

Yep, I ran with label_smooth=0, jitter=False and got the expected results. https://gist.github.com/shwangtangjun/2718637a326be36651cc8fa2d2e12e89

However, if adding label smoothing actually hurts performance on [miniImageNet, ResNet18], why did you choose to add smoothing for the other datasets and backbones?

imtiazziko commented 3 years ago

Hi @shwangtangjun,

This is because, although not for ResNet18/miniImageNet, for the others I found that adding jitter and label smoothing actually improves performance in most cases. If you check our paper, we reported SimpleShot results with our own trained models, which are better than the results in the original SimpleShot paper. That is the reason, as far as I remember. But yes, inconsistencies can happen, as in this case with ResNet18/miniImageNet, since the effect of label smoothing is sometimes unpredictable. I could have reported both results.

Again, irrespective of how the model is trained, we see that LaplacianShot inference helps over the baselines.

But anyway, thanks a lot for reporting this.

shwangtangjun commented 3 years ago

I understand. I'll try to do an ablation study if I have time.

Thanks for your effort.

imtiazziko / LaplacianShot