Tencent / PocketFlow

An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.
https://pocketflow.github.io

Question about weight_sparsification pruning #240

Closed fanghuaqi closed 5 years ago

fanghuaqi commented 5 years ago

Hi, I have a question about pruning: does the pruning implemented in PocketFlow use the pre-trained weights as the starting point for training, or does it train from scratch?

Or is the following correct?

The pre-trained model is only used to identify the pruning ratios, but the training part does not take the weight values from the pre-trained model (i.e., it trains from scratch).

I looked at the documentation but didn't find any information about this.

Thanks Huaqi

yuanyuanli85 commented 5 years ago

The pre-trained model is only used in the "optimal" mode, where a DDPG agent searches for the best per-layer pruning ratios. The training itself does not use the pre-trained model. If you are using "uniform" or other modes, the pre-trained model is never used at all.
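For readers unfamiliar with the two modes, here is a rough conceptual sketch (not PocketFlow's actual code) of what a per-layer pruning ratio means for magnitude-based weight sparsification; the layer names and ratio values are made up for illustration:

```python
import numpy as np

def prune_mask(weights, prune_ratio):
    """Return a 0/1 mask that keeps the largest-magnitude weights."""
    threshold = np.percentile(np.abs(weights), prune_ratio * 100.0)
    return (np.abs(weights) >= threshold).astype(weights.dtype)

# "uniform" mode: the same pruning ratio for every layer.
layers = {"conv1": np.random.randn(3, 3, 16, 32),
          "fc1": np.random.randn(256, 10)}
masks_uniform = {name: prune_mask(w, 0.5) for name, w in layers.items()}

# "optimal" mode: a separate ratio per layer; in PocketFlow these would come
# from the DDPG agent's search over the pre-trained model, here they are
# hard-coded for illustration.
per_layer_ratios = {"conv1": 0.3, "fc1": 0.7}
masks_optimal = {name: prune_mask(w, per_layer_ratios[name])
                 for name, w in layers.items()}
```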

fanghuaqi commented 5 years ago

@yuanyuanli85 Thanks for your answer, but I am wondering why training doesn't use the pre-trained model; it should be easier to fine-tune from a pre-trained model than to train from scratch.

Will this project consider using the pre-trained model as the starting point for pruning in the future?

jiaxiang-wu commented 5 years ago

@fanghuaqi It is certainly possible to use a pre-trained model for warm-start instead of training from scratch. We may add support for this in the near future. You are also welcome to submit a PR implementing this feature (it would be highly appreciated!).
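For reference, a minimal sketch of what such a warm-start could look like in a TensorFlow 1.x graph (the style PocketFlow's learners use); the helper name and the flag mentioned in the comment are hypothetical, not existing PocketFlow options:

```python
import tensorflow as tf

def warm_start_from_checkpoint(sess, ckpt_path):
    """Restore pre-trained weights into the current graph before pruning starts."""
    # Restore only the trainable variables; pruning masks and other auxiliary
    # state keep their freshly initialized values.
    saver = tf.train.Saver(tf.trainable_variables())
    saver.restore(sess, ckpt_path)

# Inside a learner's train() routine, right after
# sess.run(tf.global_variables_initializer()), one could call:
#   warm_start_from_checkpoint(sess, FLAGS.pretrained_ckpt_path)  # hypothetical flag
```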

fanghuaqi commented 5 years ago

@jiaxiang-wu If we want to implement this feature, what process should we follow? Is there already warm-start code in PocketFlow, e.g. stub code that a developer can fill in, or do we need to build it from scratch? I think this feature would be useful for all the learners, since it should reduce training time compared to training from scratch.

Thanks Huaqi

jiaxiang-wu commented 5 years ago

@fanghuaqi You could first try to implement the warm-start feature for the weight sparsification learner only; that may be the easiest place to start.
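If it helps, here is one possible shape such a change could take, sketched under the assumption of a simple magnitude-based prune-and-finetune loop; the function signature, checkpoint-path argument, and `build_model` callback are illustrative only and not part of PocketFlow's code:

```python
import numpy as np
import tensorflow as tf

def train_with_warm_start(build_model, ckpt_path, prune_ratio, nb_iters):
    """Prune-and-finetune loop that optionally warm-starts from a checkpoint."""
    train_op = build_model()  # user-supplied graph construction, returns the training op
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        if ckpt_path:
            # Warm-start: load pre-trained weights instead of training from scratch.
            tf.train.Saver(tf.trainable_variables()).restore(sess, ckpt_path)
        # Build magnitude-based masks from the (possibly warm-started) weights.
        masks = {}
        for var in tf.trainable_variables():
            w = sess.run(var)
            thr = np.percentile(np.abs(w), prune_ratio * 100.0)
            masks[var.name] = (np.abs(w) >= thr).astype(w.dtype)
        # Fine-tune; re-applying the masks after each step keeps pruned weights at zero.
        for _ in range(nb_iters):
            sess.run(train_op)
            for var in tf.trainable_variables():
                var.load(sess.run(var) * masks[var.name], sess)
```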