kjunelee / MetaOptNet

Meta-Learning with Differentiable Convex Optimization (CVPR 2019 Oral)
Apache License 2.0
529 stars, 97 forks

Couldn't repeat your results [Instance Norm/Batch Norm] #38

Closed: lucastononrodrigues closed this issue 4 years ago

lucastononrodrigues commented 4 years ago

Hello, I've been trying to reproduce your results for a bachelor project in which I will apply your algorithm to different meta-datasets.

My accuracies were about 5 to 10% lower than yours on every meta-dataset you used, except FC100, where my result was relatively close. I made some changes to your code to fit my GPU memory (I lowered the number of episodes per batch to 4). My first guess was that this mainly affects Batch Normalization in the ResNet, so I switched to Instance Normalization, but that gave even lower results.

Could you help me understand why? Also, do you have any suggestions for other meta-datasets we could use?

Thank you in advance. Lucas Tonon Rodrigues.

kjunelee commented 4 years ago

To reduce memory consumption, I recommend reducing --train-shot (for example, from 15 to 10) rather than --episodes-per-batch.

From my experience, batch size matters a lot in SGD.
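As a rough illustration of this trade-off, the sketch below counts how many images pass through the backbone in one SGD step under each option. The helper `images_per_batch` is hypothetical and not part of the repository. The baseline values (8 episodes per batch, 5-way, 15 shots, 6 queries per class) are assumptions chosen for illustration, so check them against your own training configuration.

```python
def images_per_batch(episodes_per_batch, n_way, train_shot, train_query):
    """Images forwarded through the backbone in one SGD step.

    Hypothetical helper: each episode holds n_way classes, each with
    train_shot support images and train_query query images.
    """
    return episodes_per_batch * n_way * (train_shot + train_query)


# Assumed baseline settings, for illustration only.
baseline = images_per_batch(episodes_per_batch=8, n_way=5, train_shot=15, train_query=6)

# Option A: halve the episodes per batch (what was tried in this issue).
fewer_episodes = images_per_batch(episodes_per_batch=4, n_way=5, train_shot=15, train_query=6)

# Option B: keep 8 episodes per batch but reduce the shots from 15 to 10.
fewer_shots = images_per_batch(episodes_per_batch=8, n_way=5, train_shot=10, train_query=6)

print(baseline, fewer_episodes, fewer_shots)
```

Under these assumed numbers, option B saves less memory per step than option A, but each gradient is still averaged over 8 episodes, so the effective batch size seen by SGD is unchanged.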

Also, the Group Normalization paper (https://arxiv.org/abs/1803.08494) shows empirically that instance normalization is suboptimal for image classification.
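One commonly cited intuition for this (not a claim taken from the paper itself) is that instance normalization computes statistics per image, so it discards per-image intensity information that batch statistics preserve. The following is a minimal pure-Python sketch on a toy single-channel "image" of four pixels. The functions `instance_norm` and `batch_norm` are simplified stand-ins with no learnable scale or shift, not the layers used in the repository.

```python
from math import sqrt
from statistics import fmean, pvariance

EPS = 1e-5  # small constant for numerical stability


def normalize(values, mean, var):
    """Standardize a list of pixel values with the given statistics."""
    return [(v - mean) / sqrt(var + EPS) for v in values]


def instance_norm(batch):
    """Normalize each image with its own mean and variance."""
    return [normalize(img, fmean(img), pvariance(img)) for img in batch]


def batch_norm(batch):
    """Normalize every image with statistics pooled over the whole batch."""
    flat = [v for img in batch for v in img]
    mean, var = fmean(flat), pvariance(flat)
    return [normalize(img, mean, var) for img in batch]


# Two toy images that differ only by a constant brightness offset.
dark = [0.1, 0.2, 0.3, 0.4]
bright = [v + 0.5 for v in dark]
batch = [dark, bright]

in_out = instance_norm(batch)
bn_out = batch_norm(batch)

# Largest per-pixel gap between the two normalized images.
in_gap = max(abs(a - b) for a, b in zip(in_out[0], in_out[1]))
bn_gap = max(abs(a - b) for a, b in zip(bn_out[0], bn_out[1]))
print(in_gap, bn_gap)
```

In this toy case the two images become numerically indistinguishable after instance normalization, while batch normalization keeps them clearly separated. Whether that effect explains the accuracy drop reported in this issue is not established here.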

lucastononrodrigues commented 4 years ago

Thank you very much for the explanation. I will do that and keep you posted. We are currently applying your work to other datasets and will share the results if possible.