Question about the model architecture (the batchnorm)

jihoontack commented 2 years ago

Hi, again :)

Thank you for the wonderful work.

While reading your paper and the implementation, I noticed that this work seems to use normal batchnorm (instead of the transductive batchnorm that is usually used in other MAML-based papers).

Have you tried transductive batchnorm in your case? I have been trying to use the pre-trained checkpoints when training my MAML, but I cannot reach the performance reported in your paper. I was wondering whether transductive batchnorm could be the cause.

Thank you very much for your time! Best, Jihoon

Han-Jia commented 2 years ago

Hi, Jihoon,

We do not use transductive batchnorm in the paper; we reset the batchnorm for each task during meta-test.

I think the learning rate and step size matter when using the pre-trained weights. A relatively large learning rate and step size may help.

best, Han-Jia
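For readers following the thread, here is a minimal sketch of the distinction discussed above. It is not code from the UNICORN-MAML repository: the `BatchNorm1D` class and the `normalize_query` helper are hypothetical, pure-Python stand-ins, and "reset per task" is read here as one possible interpretation (clear the running statistics, refresh them on the support set, then normalize the query set with those statistics). In the transductive variant, the query batch is normalized with its own batch statistics instead.

```python
from statistics import fmean, pvariance


class BatchNorm1D:
    """Toy single-feature batch norm with running statistics (no learnable scale/shift)."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.reset_running_stats()

    def reset_running_stats(self):
        # Back to the usual initial values: zero mean, unit variance.
        self.running_mean = 0.0
        self.running_var = 1.0

    def forward(self, xs, use_batch_stats):
        if use_batch_stats:
            # Normalize with statistics of the current batch and update the running ones.
            mean, var = fmean(xs), pvariance(xs)
            m = self.momentum
            self.running_mean = (1 - m) * self.running_mean + m * mean
            self.running_var = (1 - m) * self.running_var + m * var
        else:
            # Normalize with the stored running statistics only.
            mean, var = self.running_mean, self.running_var
        return [(x - mean) / (var + self.eps) ** 0.5 for x in xs]


def normalize_query(bn, support, query, transductive):
    """Hypothetical per-task routine (an assumption, not the authors' code)."""
    bn.reset_running_stats()                    # per-task reset
    bn.forward(support, use_batch_stats=True)   # support pass refreshes the running stats
    # Transductive: the query batch's own statistics are used.
    # Non-transductive: each query point sees only the stored statistics.
    return bn.forward(query, use_batch_stats=transductive)


bn = BatchNorm1D()
support = [1.0, 2.0, 3.0]
plain_a = normalize_query(bn, support, [2.0, 10.0], transductive=False)
plain_b = normalize_query(bn, support, [2.0, -4.0], transductive=False)
trans_a = normalize_query(bn, support, [2.0, 10.0], transductive=True)
trans_b = normalize_query(bn, support, [2.0, -4.0], transductive=True)
```

In this sketch the first query point gets the same normalized value in `plain_a` and `plain_b`, because it does not depend on the other query points, while `trans_a` and `trans_b` differ. That dependence on the rest of the query batch is what makes the second setting transductive.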

jihoontack commented 2 years ago

Thank you very much for your reply!

Really appreciate it :)

Best, Jihoon

Han-Jia / UNICORN-MAML