Thanks for checking out our work @Haoqing-Wang!
I found an interesting phenomenon: the accuracy of ProtoNet's linear-layer form (w_k = 2c_k, b_k = -||c_k||^2) is significantly lower than that of ProtoNet itself, even though the two should theoretically be identical. Have you observed this as well? What could be the reason?
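For reference, here is a minimal sketch (not taken from the repository) checking that a linear layer initialized with W_k = 2c_k and b_k = -||c_k||^2 ranks classes exactly like ProtoNet's negative squared Euclidean distance; the variable names are illustrative assumptions. The expansion -||z - c||^2 = 2 c^T z - ||c||^2 - ||z||^2 only differs from the linear form by a per-query constant -||z||^2, which does not affect the argmax.

```python
import torch

torch.manual_seed(0)
n_query, n_way, dim = 8, 5, 64
z = torch.randn(n_query, dim)          # query embeddings
prototypes = torch.randn(n_way, dim)   # class prototypes c_k

# ProtoNet logits: -||z - c_k||^2
proto_logits = -torch.cdist(z, prototypes).pow(2)

# Linear-layer form: w_k = 2 c_k, b_k = -||c_k||^2
W = 2 * prototypes                       # (n_way, dim)
b = -prototypes.pow(2).sum(dim=1)        # (n_way,)
linear_logits = z @ W.t() + b

# The predicted classes should coincide (logits differ only by -||z||^2 per query).
print("Predictions match:", torch.equal(proto_logits.argmax(1), linear_logits.argmax(1)))
```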
Yes, when not finetuned, the original ProtoNet and its linear-layer form make identical predictions. However, in our approach (ProtoTransfer) we not only initialize the final linear layer with the prototypical form (= ProtoNet), but also finetune it and, optionally, the deep backbone (= ProtoTune).
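To make the distinction concrete, here is a minimal sketch of that idea, assuming a frozen backbone and already-embedded support examples; the function name, training loop, and hyperparameters are illustrative assumptions, not the repository's exact implementation.

```python
import torch
import torch.nn as nn

def prototune_head(support_emb, support_lbl, n_way, steps=20, lr=1e-3):
    """Finetune a linear classifier initialized from class prototypes."""
    dim = support_emb.size(1)
    # Prototypes: per-class mean of the support embeddings.
    prototypes = torch.stack(
        [support_emb[support_lbl == k].mean(0) for k in range(n_way)]
    )

    head = nn.Linear(dim, n_way)
    with torch.no_grad():
        head.weight.copy_(2 * prototypes)               # w_k = 2 c_k
        head.bias.copy_(-prototypes.pow(2).sum(dim=1))  # b_k = -||c_k||^2

    # Finetune the prototypically initialized head on the support set.
    opt = torch.optim.Adam(head.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(head(support_emb), support_lbl)
        loss.backward()
        opt.step()
    return head
```

Without the finetuning loop this reduces to plain ProtoNet; with it, the decision boundaries can move away from the pure prototypical solution, which is where the accuracy difference comes from.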
In our paper (arXiv, PDF), Table 2 (UMTRA-ProtoNet vs UMTRA-ProtoTune) and Table 3 (ProtoCLR + ProtoNet vs ProtoTune) show that this strategy is particularly beneficial when a relatively high number of shots per class (>5) is available. The same tables also show that in the very-low-shot regime, finetuning can slightly degrade performance. This is a trade-off we accepted when designing our approach.
Hope it helps!
Closing this soon if there are no further comments.
Good job!