Tencent / PocketFlow

An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.
https://pocketflow.github.io

Document issue: Channel pruning(GPU) & self-defined model #111

Open zheLim opened 5 years ago

zheLim commented 5 years ago

After taking a glance at the GPU version of channel pruning, I found that it may not strictly implement lasso regression: neither the coordinate descent method nor the LARS optimization algorithm is used. It would be great if you could add some description of the GPU version's algorithm.
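(For context, in lasso-based channel pruning, input channels whose lasso coefficients are driven to zero are the ones removed. Below is a minimal numpy sketch of solving lasso via coordinate descent with per-coordinate soft-thresholding; the data setup is purely illustrative and is not PocketFlow's actual code.)

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, d = X.shape
    b = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0)  # per-column squared norms
    for _ in range(n_iter):
        for j in range(d):
            # residual with feature j's current contribution added back
            r = y - X @ b + X[:, j] * b[j]
            rho = X[:, j] @ r
            # per-coordinate soft-thresholding (the lasso update)
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return b

# Toy channel-selection example: "channel" 1 contributes nothing to y,
# so lasso drives its coefficient to (near) zero and it can be pruned.
np.random.seed(0)
X = np.random.randn(50, 3)
y = X @ np.array([2.0, 0.0, -1.5])
b = lasso_cd(X, y, lam=1.0)
```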

BTW, the documentation on self-defined models is not clear enough.

All in all, PocketFlow is a great piece of work; I have learned a lot from it and am still learning :).

zheLim commented 5 years ago

BTW, the CPU version of channel pruning only handles regular 2D convolution; it cannot process dilated convolution. It would be better if a clarification were added to the documentation.

jiaxiang-wu commented 5 years ago

@zheLim Thanks for your suggestions.

  1. The GPU version does not implement lasso regression. Instead, it solves an L2,1-norm regularized optimization problem with proximal gradient descent. The regularization strength is gradually increased to slowly raise the pruning ratio to the target value. We will provide separate documentation describing the algorithm in detail.
  2. We will clarify these details in the "self-defined models" documentation.
  3. This will be clarified in the next PR.
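(To make point 1 above concrete, here is a minimal numpy sketch of this style of algorithm, purely my own illustration rather than PocketFlow's implementation: each iteration takes a plain gradient step followed by the proximal operator of the L2,1 norm, which is group soft-thresholding and zeroes out entire rows (channels); ramping up the strength `tau` prunes more channels over time.)

```python
import numpy as np

def prox_group_l21(W, tau):
    """Prox operator of tau * sum_over_rows ||W[i, :]||_2:
    group soft-thresholding; rows with norm <= tau collapse to
    exactly zero, i.e. whole channels are pruned."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return W * scale

def train_l21(W, grad_fn, lr, tau_schedule):
    """Proximal gradient descent with a gradually increasing
    regularization strength, so the pruning ratio rises slowly."""
    for tau in tau_schedule:
        W = prox_group_l21(W - lr * grad_fn(W), tau)
    return W

# Toy quadratic loss 0.5 * ||W - T||^2; the second target row is weak,
# so that channel gets pruned away as tau ramps up.
T = np.array([[3.0, 4.0], [0.2, 0.1]])
grad_fn = lambda W: W - T
W = train_l21(np.zeros_like(T), grad_fn, lr=0.5,
              tau_schedule=np.linspace(0.0, 1.0, 50))
```

The gradual `tau_schedule` matters: jumping straight to the final strength would prune many channels at once before the remaining weights have adapted.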
zheLim commented 5 years ago

Thanks for the reply. Does solving an L2,1-norm regularized optimization problem give better results than lasso regression?

jiaxiang-wu commented 5 years ago

It runs faster in the multi-GPU setting and achieves higher accuracy on some models. We will provide more detailed results in the documentation.

zheLim commented 5 years ago

Thanks a lot :)

jiaxiang-wu commented 5 years ago

Documentation to-do:

  1. add documentation for ChannelPrunedGpuLearner;
  2. fix minor issues in "self-defined models" and ChannelPrunedLearner.
GoldenSpark commented 5 years ago

Hey @jiaxiang-wu
How about the accuracy of ChannelPrunedGpuLearner on MobileNet? Is this a structured pruning algorithm that leads to a regular sparsity pattern?

jiaxiang-wu commented 5 years ago
  1. For MobileNet-v1, the top-1 accuracies are: 68.5% (50% FLOPs) | 67.8% (40% FLOPs) | 66.3% (30% FLOPs)
  2. ChannelPrunedGpuLearner is a structured-pruning algorithm. The compressed model has regular sparsity patterns.
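(A quick toy illustration of why the regular sparsity pattern matters, my own example rather than PocketFlow code: because whole output channels are zeroed, they can be physically removed, leaving a smaller dense kernel that needs no sparse-matrix support at inference time.)

```python
import numpy as np

# Conv kernel in (out_channels, in_channels, kh, kw) layout.
W = np.random.randn(8, 4, 3, 3)
W[[1, 5]] = 0.0  # output channels zeroed out by group-sparse training

# Drop the all-zero output channels: the result is a smaller *dense* tensor.
keep = np.abs(W).reshape(W.shape[0], -1).sum(axis=1) > 0
W_pruned = W[keep]
print(W_pruned.shape)  # (6, 4, 3, 3)
```

The next layer's kernel would shrink along its in_channels axis to match, which is how structured pruning turns into an actual speed-up on stock hardware.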
GoldenSpark commented 5 years ago

@jiaxiang-wu Thanks much!