HobbitLong / CMC

[arXiv 2019] "Contrastive Multiview Coding", also contains implementations for MoCo and InstDis
BSD 2-Clause "Simplified" License

Convention for number of convolutions in AlexNet #11

Open macaodha opened 5 years ago

macaodha commented 5 years ago

Hi there,

This is a bit of a meta-question.

I noticed that your code uses the original AlexNet channel configuration, i.e. convolutions with 96, 256, 384, 384, 256 filters, as opposed to the 64, 192, 384, 256, 256 configuration from the "One Weird Trick" paper, which is the standard in the official PyTorch implementation.
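For concreteness, the two channel configurations can be sketched side by side. This is an illustrative PyTorch sketch (the helper name and exact layer hyperparameters are assumptions based on the original AlexNet paper, not either repo's code):

```python
import torch
import torch.nn as nn

# "CaffeNet"-style AlexNet vs. the "One Weird Trick" variant used by torchvision.
CAFFENET_CHANNELS = [96, 256, 384, 384, 256]
OWT_CHANNELS = [64, 192, 384, 256, 256]

def alexnet_conv_features(channels):
    """Build the five-conv feature extractor of AlexNet for a given channel
    configuration. Kernel sizes and strides follow the original paper; this
    is a sketch for comparing parameter counts, not either repo's exact code."""
    c1, c2, c3, c4, c5 = channels
    return nn.Sequential(
        nn.Conv2d(3, c1, kernel_size=11, stride=4, padding=2), nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2),
        nn.Conv2d(c1, c2, kernel_size=5, padding=2), nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2),
        nn.Conv2d(c2, c3, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c3, c4, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c4, c5, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.MaxPool2d(kernel_size=3, stride=2),
    )

def num_params(module):
    return sum(p.numel() for p in module.parameters())

big = alexnet_conv_features(CAFFENET_CHANNELS)
small = alexnet_conv_features(OWT_CHANNELS)
print(num_params(big), num_params(small))  # the CaffeNet-style stack has noticeably more parameters
```

Since conv parameters scale with `c_in * c_out`, the larger channel counts in the first two layers account for most of the difference between the two variants.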

In comparison, Feng et al. at CVPR 2019 use the smaller version of AlexNet in their code.

I was wondering if there was a standard for which version of AlexNet should be used in the self-supervised literature, and if it even makes a difference?

Thanks

HobbitLong commented 5 years ago

Hi, @macaodha ,

Good question! I think most of the previous literature used the so-called CaffeNet from the original Caffe team, which should be 96, 256, 384, 384, 256.

My model here is also not the standard CaffeNet, in the sense that we split it into two parts, which also roughly halves the number of parameters.
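A minimal sketch of that split design, under the assumption that each view gets its own half-width stream (the class name, halved channel counts, and the L/ab view split are illustrative, not the repo's exact code):

```python
import torch
import torch.nn as nn

class HalfAlexNetStream(nn.Module):
    """One half-width AlexNet stream with channels halved to 48, 128, 192, 192, 128.
    Two such streams, one per view (e.g. the L and ab channels of a Lab image),
    together have roughly half the parameters of one full-width network, since
    per-layer conv parameters scale with c_in * c_out."""

    def __init__(self, in_channels=1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 48, kernel_size=11, stride=4, padding=2), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(48, 128, kernel_size=5, padding=2), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(128, 192, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(192, 192, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(192, 128, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )

    def forward(self, x):
        return self.features(x)

# One stream per view: e.g. L channel (1 input channel) and ab channels (2).
l_stream = HalfAlexNetStream(in_channels=1)
ab_stream = HalfAlexNetStream(in_channels=2)
```

Halving both `c_in` and `c_out` cuts each conv layer's parameters to about a quarter, so two half-width streams land near half the parameters of one full-width network.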

Have you tried both of the models you mentioned to see the performance gap? My feeling is that it would not be very big, but I haven't had a chance to check yet.

macaodha commented 5 years ago

Thanks for the reply!

I'll let you know if I look into this.