Tencent / PocketFlow

An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.
https://pocketflow.github.io

Some questions about Channel Pruning Auto #107

Closed buryang closed 5 years ago

buryang commented 5 years ago

Hi, I'm trying to compress my U-Net, which is used for regression. I use 1 - lambda * l2_loss as the Acc metric, and after 40 rollouts of training so far, the Acc reaches only 0.22, while the Acc of the original model is 0.96. Is that normal? In addition, the hidden layer size in the AMC paper is 300; why does your code change it to 64? Also, the scikit-learn training is slow. Could you save intermediate checkpoints of the actor-critic models?
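For reference, the regression "accuracy" the user describes reduces to a single scalar. A minimal sketch (the function name, `lam`, and the use of mean-squared error are assumptions for illustration, not PocketFlow's actual reward plumbing):

```python
import numpy as np

def regression_reward(preds, labels, lam=1.0):
    """Pseudo-accuracy for a regression net: 1 - lambda * L2 loss.

    Hypothetical helper; the L2 loss is taken here as the mean
    squared error between predictions and targets.
    """
    l2_loss = float(np.mean((preds - labels) ** 2))
    return 1.0 - lam * l2_loss
```

With a perfect fit the reward is 1.0, and it degrades linearly with the L2 loss, so a value of 0.22 simply means the pruned model's reconstruction error is still large.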

jiaxiang-wu commented 5 years ago

@psyyz10 Could you please take a look at this issue?

psyyz10 commented 5 years ago

@buryang

  1. In auto mode, the RL agent only starts learning from the 50th rollout; before that, it chooses split ratios at random. You can set --cp_nb_rlouts_min to a different value.
  2. If you choose the auto mode of channel pruning, please set --ddpg_noise_std_init=0.5.
  3. After 200 rollouts, the network will be retrained, so the Acc will improve again.
  4. What does the "hidden layer size" in AMC mean? Our approach may differ from AMC's.
  5. You can save intermediate results by setting --cp_nb_rlouts to a smaller value.
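The warm-up behaviour described in points 1 and 2 can be sketched as follows. This is illustrative only; the function and argument names are not PocketFlow's API, and the clipping range is an assumption:

```python
import random

def choose_prune_ratio(rollout_idx, agent_act, state,
                       nb_rlouts_min=50, noise_std=0.5):
    """Pick a per-layer pruning ratio for one RL rollout (sketch).

    Before nb_rlouts_min rollouts, ratios are sampled uniformly at
    random (the warm-up phase); afterwards, the DDPG actor proposes a
    ratio that is perturbed by Gaussian exploration noise and clipped
    to the valid range [0, 1].
    """
    if rollout_idx < nb_rlouts_min:
        # warm-up: random exploration, no learning signal used yet
        return random.random()
    ratio = agent_act(state) + random.gauss(0.0, noise_std)
    return min(max(ratio, 0.0), 1.0)
```

This is why the Acc looks poor for the first 40 rollouts: the agent has not started exploiting its learned policy yet.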
buryang commented 5 years ago
  1. AMC: AutoML for Model Compression and Acceleration on Mobile Devices.
psyyz10 commented 5 years ago

@buryang I know that paper, but what does the hidden layer size mean? We may not use that variable.

JinyangGuo commented 5 years ago

@psyyz10

Hi, thanks for your open-source code.

  1. In the ChannelPruning Auto mode, which channel pruning method are you using? The original AMC paper prunes the channels in the convolutional kernel with the lowest magnitude, but it seems that you use LASSO to select the channels. Please correct me if I have misunderstood the code.

  2. In the channel-pruning-gpu repo, you implement discrimination-aware channel pruning. Do you have a GPU implementation of LASSO-based channel pruning?

Thanks.

jiaxiang-wu commented 5 years ago

@JinyangGuo

  1. Yes, we use LASSO to select channels. The reference paper is: Yihui He, Xiangyu Zhang, Jian Sun. Channel Pruning for Accelerating Very Deep Neural Networks. ICCV 2017.
  2. For now, no: we do not have a GPU implementation of LASSO-based channel pruning. The most relevant one is probably ChannelPrunedGpuLearner, which uses proximal gradient descent to select channels under L1 regularization and is implemented on the GPU.
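The L1-regularized selection in the second point can be sketched with plain proximal gradient descent (ISTA, which solves the same LASSO objective) in NumPy. The function names and shapes below are illustrative assumptions, not PocketFlow's actual GPU code:

```python
import numpy as np

def soft_threshold(x, thr):
    """Proximal operator of the L1 norm (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - thr, 0.0)

def select_channels_ista(X, y, lam=0.1, lr=None, n_iters=500):
    """Sketch of LASSO-style channel selection via proximal gradient.

    X: (n_samples, n_channels) per-channel responses; y: (n_samples,)
    layer output to reconstruct. Minimizes 0.5 * ||X @ beta - y||^2
    + lam * ||beta||_1; channels whose coefficient shrinks to exactly
    zero are candidates for pruning.
    """
    n, c = X.shape
    if lr is None:
        # step size from the Lipschitz constant of the quadratic term
        lr = 1.0 / np.linalg.norm(X, ord=2) ** 2
    beta = np.zeros(c)
    for _ in range(n_iters):
        grad = X.T @ (X @ beta - y)
        beta = soft_threshold(beta - lr * grad, lr * lam)
    return beta != 0.0  # boolean keep-mask over channels
```

The soft-thresholding step is what drives unimportant channels' coefficients to exactly zero, which is why an L1-regularized proximal update on GPU serves the same purpose as an explicit LASSO solver.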