Closed kumam92 closed 3 years ago
I am using your FEAT model on my own dataset, which is quite noisy. I found that with FEAT, ResNet-12 is better than ConvNet, but with the same prototypical network as in your code, ConvNet outperforms ResNet-12 by 3%. What could be the reason that FEAT reaches its best accuracy with the deeper backbone while the prototypical network works better with the shallow one?

Hi,
One reason could be that the Transformer compensates for the noise across tasks. So when using the ProtoNet, the deeper network may memorize the noise in the base-class set and fail to generalize to the unseen data.
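For context, the ProtoNet baseline discussed here reduces to nearest-prototype classification over the backbone's embeddings, so any noise the backbone memorizes goes straight into the prototypes. A minimal sketch of that classification rule (numpy, with hypothetical function names; the actual repo implements this in PyTorch):

```python
import numpy as np

def prototypes(support, labels, n_classes):
    # Prototype for class c = mean of the support embeddings labeled c.
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_classes)])

def classify(queries, protos):
    # Assign each query to the nearest prototype (squared Euclidean distance).
    dists = ((queries[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)

# Toy 2-way, 2-shot episode with 2-D embeddings.
support = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]])
labels = np.array([0, 0, 1, 1])
protos = prototypes(support, labels, n_classes=2)
preds = classify(np.array([[0.0, 0.1], [1.0, 0.9]]), protos)  # -> [0, 1]
```

Because the prototypes are plain means of the support embeddings, a higher-capacity backbone that has overfit noisy base classes shifts the prototypes directly, whereas FEAT's Transformer adapts the embeddings per task before averaging.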