Closed: SCIKings closed this issue 2 months ago
@SCIKings As our paper's experiments show, supervised pretraining (ImageNet-pretrained weights) generally provides a better initialization. In addition, all pretrained weights in the paper were obtained via timm (https://github.com/huggingface/pytorch-image-models), though our method should work with any pretrained weights.
So, to answer your question: timm has publicly available ResNet and VGG weights at various sizes. If you have your own pretrained ResNet or VGG, those should work as well.
Thank you very much for your reply! Even after reading this innovative article carefully, there are still a few things I don't understand. For instance, is the main idea of the article to take the pretrained weights of a large model and then obtain weights that match the smaller network by sampling them uniformly? If not, how should the size of the uniformly sampled weights be determined?
Hi @SCIKings ,
Yes, your understanding is correct. We use a subset of a larger model's pretrained weights to initialize a smaller model. The configurations (sizes) of both the larger model and the smaller model are pre-determined.
This is very valuable work. If I want to apply this initialization method to ResNet and VGG networks, which pretrained weights would you suggest as initial weights?