yiwenguo / Dynamic-Network-Surgery

Caffe implementation for dynamic network surgery.

Lifecycle of using Dynamic-Network-Surgery #12

Closed JosephKJ closed 7 years ago

JosephKJ commented 7 years ago

Hi @yiwenguo,

Thank you so much for sharing your work with the community!

I have a caffemodel trained on my own dataset. Its architecture is similar to ResNet-101, but with some extra layers. I am about to use DNS to see how much I can compress it.

This is how I plan to go about it. Can you please confirm whether my understanding of how to use DNS is correct:

INPUT: My trained model (caffemodel and prototxt)
Step 1: Modify the convolution layers in my prototxt to CConvolution and the fc layers to CInnerProduct, and pass the appropriate messages to the modified layers.
Step 2: Finetune the network for some iterations:
caffe train -solver my_modified.prototxt -weights my_trained_model.caffemodel
(where my_modified.prototxt is the solver whose net points to the prototxt modified in Step 1)
Step 3: Do the post-processing that you mentioned here on the caffemodel that gets generated in Step 2.
OUTPUT: Smaller caffemodel, from Step 3
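For concreteness, here is roughly what I imagine the Step 1 edit looking like for one of my conv layers. The layer itself is a made-up example, and the pruning parameter block and its field names are only my guesses, so please correct me if they differ from what is in caffe.proto:

```
# Original layer in my prototxt (hypothetical layer, just for illustration)
layer {
  name: "conv_extra1"
  type: "Convolution"
  bottom: "pool1"
  top: "conv_extra1"
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
  }
}

# The same layer after the Step 1 edit: the type changes to the DNS layer,
# and a pruning message is added (the message and field names below are
# assumptions to be checked against caffe.proto in this repo)
layer {
  name: "conv_extra1"
  type: "CConvolution"
  bottom: "pool1"
  top: "conv_extra1"
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
  }
  cconvolution_param {                 # assumed message name
    gamma: 0.0001                      # assumed: schedule of the pruning probability
    power: 1
    c_rate: 3                          # assumed: controls how aggressively weights are pruned
    iter_stop: 400000                  # assumed: iteration after which the masks stop changing
    weight_mask_filler { type: "constant" value: 1 }
    bias_mask_filler { type: "constant" value: 1 }
  }
}
```

The fc layers would get the analogous change from InnerProduct to CInnerProduct.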

Thanks, Joseph

yiwenguo commented 7 years ago

Hi @JosephKJ, I think you've got the correct pipeline for using DNS (and also some other related pruning methods). Just be careful that, as you are trying to compress an extremely deep neural network, it would be better to divide the layers into groups and run DNS several times, compressing the groups one by one.
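For example, a first round could convert only the earlier layers to the DNS types and leave the rest as ordinary Caffe layers, then a second round converts the next group and finetunes again starting from the caffemodel produced by the first round (via -weights). A rough sketch, with placeholder layer names and the pruning parameters omitted:

```
# Round 1 prototxt: only group-1 layers use the DNS types; the remaining
# groups stay as ordinary Convolution/InnerProduct and are just finetuned.
layer {
  name: "group1_conv"                  # placeholder name
  type: "CConvolution"                 # pruned in this round
  bottom: "data"
  top: "group1_conv"
  convolution_param { num_output: 64 kernel_size: 3 pad: 1 }
  # plus the DNS pruning message for this layer, as sketched above
}
layer {
  name: "group2_conv"                  # placeholder name
  type: "Convolution"                  # left unpruned for now
  bottom: "group1_conv"
  top: "group2_conv"
  convolution_param { num_output: 128 kernel_size: 3 pad: 1 }
}
# Round 2 prototxt: convert the group-2 layers to the DNS types as well and
# finetune again, passing the caffemodel produced by round 1 to -weights.
```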