OscarXZQ / weight-selection


Pre-training weight for model compression #2

Closed SCIKings closed 2 months ago

SCIKings commented 8 months ago

This is very valuable work. If I want to apply this initialization method to ResNet and VGG networks, which pretrained weights would you suggest using as the initial weights?

OscarXZQ commented 8 months ago

@SCIKings As shown in our paper's experiments, supervised pretraining (ImageNet-pretrained) generally provides a better initialization. In addition, all pretrained weights in the paper are obtained via timm (https://github.com/huggingface/pytorch-image-models), but our method should work with any pretrained weights.

So to answer your question, timm has ResNet and VGG weights of different sizes publicly available. If you have your own pretrained ResNet or VGG, they should also work.

SCIKings commented 8 months ago

Thank you very much for your reply! Even after reading this innovative article carefully, there are still some things I don't understand. For instance, is the main idea to take the pretrained weights of a large model and then uniformly sample from them to obtain weights that match the smaller network? If not, how should we determine the size of the uniformly sampled weights?

OscarXZQ commented 8 months ago

Hi @SCIKings ,

Yeah, I think you understand it correctly. We use a subset of a larger model's pretrained weights to initialize a smaller model. Both the larger model's and the smaller model's configurations (sizes) are pre-determined.
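To make the idea concrete, here is a minimal numpy sketch of the kind of uniform selection described above: for each weight tensor, pick evenly spaced indices along every axis of the larger model's weights so the result matches the smaller model's layer shape. The helper name `uniform_select` is hypothetical, and the exact selection scheme in the paper may differ in detail; this only illustrates the subsetting idea.

```python
import numpy as np

def uniform_select(large_w, small_shape):
    """Take an evenly spaced subset of a larger weight tensor along
    every axis so its shape matches the smaller model's layer.
    Hypothetical helper illustrating the idea, not the authors' code."""
    idx = tuple(
        np.round(np.linspace(0, big - 1, small)).astype(int)
        for big, small in zip(large_w.shape, small_shape)
    )
    # np.ix_ builds an open mesh so each axis is subsampled independently
    return large_w[np.ix_(*idx)]

# Toy example: shrink an 8x6 "large" weight matrix to a 4x3 one
large = np.arange(48, dtype=float).reshape(8, 6)
small = uniform_select(large, (4, 3))
print(small.shape)  # (4, 3)
```

In practice you would apply a selection like this to every matching layer pair (and keep the first or evenly spaced channels consistent across layers), then fine-tune the smaller model as usual.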