macaodha opened this issue 5 years ago
Hi there,
This is a bit of a meta-question.
I noticed that your code uses the original AlexNet filter counts, i.e. convolution channel widths of 96, 256, 384, 384, 256, versus the 64, 192, 384, 256, 256 of the "one weird trick" paper, which is the standard in the official PyTorch implementation. A sketch of the two stacks follows below.
In comparison, Feng et al. (CVPR 2019) use the smaller version of AlexNet in their code.
I was wondering whether there is a standard for which version of AlexNet should be used in the self-supervised literature, and whether it even makes a difference?
Thanks
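For reference, here is a minimal sketch of the two five-conv stacks I mean. The kernel sizes, strides, and paddings follow torchvision's AlexNet and are assumptions on my part; they may not match your repo's exact settings, but only the channel widths matter for this comparison:

```python
import torch.nn as nn

def alexnet_features(widths):
    """Five-conv AlexNet feature stack for a given tuple of channel widths.

    Kernel sizes, strides, and paddings follow torchvision's AlexNet;
    CaffeNet differs in other details (e.g. LRN layers), which are
    ignored here since only the channel widths are being compared.
    """
    c1, c2, c3, c4, c5 = widths
    return nn.Sequential(
        nn.Conv2d(3, c1, kernel_size=11, stride=4, padding=2), nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2),
        nn.Conv2d(c1, c2, kernel_size=5, padding=2), nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2),
        nn.Conv2d(c2, c3, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c3, c4, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c4, c5, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2),
    )

caffenet = alexnet_features((96, 256, 384, 384, 256))  # original / CaffeNet widths
owt = alexnet_features((64, 192, 384, 256, 256))       # "one weird trick" / torchvision widths

for name, model in [("CaffeNet-style", caffenet), ("OWT-style", owt)]:
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{name}: {n_params / 1e6:.2f}M conv parameters")
```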
Hi @macaodha,
Good question! I think most of the previous literature used the so-called CaffeNet from the original Caffe team, which should have filter counts of 96, 256, 384, 384, 256.
My model here is also not the standard CaffeNet, in the sense that it is split into two parts; this halves the number of parameters as well.
Have you tried both of the models you mentioned to measure the performance gap? I suspect it would not be very big, but I haven't had a chance to check yet.
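To illustrate what the two-part split does to the parameter count, here is a hypothetical sketch using grouped convolutions (`groups=2`), which is the usual way to emulate the original two-GPU AlexNet split in a single network. This is an illustration of the idea, not our actual layer definitions:

```python
import torch.nn as nn

# A standard conv vs. the same conv split into two independent halves.
# With groups=2 each filter sees only half the input channels, so the
# weight tensor shrinks from (256, 96, 5, 5) to (256, 48, 5, 5),
# halving that convolution's weight parameters.
full = nn.Conv2d(96, 256, kernel_size=5, padding=2)
split = nn.Conv2d(96, 256, kernel_size=5, padding=2, groups=2)

print(sum(p.numel() for p in full.parameters()))   # 96*256*25 + 256 = 614656
print(sum(p.numel() for p in split.parameters()))  # 48*256*25 + 256 = 307456
```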
Thanks for the reply!
I'll let you know if I look into this.