hzhyhx1117 opened this issue 5 years ago
Could you please be more specific? Which model and learner combination are you using?
The uniform-quantization and nonuniform-quantization learners, on ResNet-50 with ImageNet-2012.
@haolibai Any suggestions?
Make sure that you first pretrain a full-precision model, based on which you perform quantization and fine-tuning.
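For reference, a minimal two-step workflow on ResNet-50 / ILSVRC-12 might look like the sketch below (flag names are taken from the logs later in this thread; paths assume the default values):
# Step 1: train a full-precision baseline (checkpoints go to ./models via --save_path)
$ ./scripts/run_local.sh nets/resnet_at_ilsvrc12_run.py --learner full-prec --resnet_size 50
# Step 2: quantize and fine-tune, starting from the full-precision checkpoint
$ ./scripts/run_local.sh nets/resnet_at_ilsvrc12_run.py --learner uniform --resnet_size 50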
1) I used the pre-trained model provided in the PocketFlow docs (ResNet-50 | 75.97% | 92.88%) and got the following error (see also the note after this list):
Restoring from checkpoint failed. This is most likely due to a mismatch between the current graph and the graph from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint.
Assign requires shapes of both tensors to match. lhs shape= [256] rhs shape= [128]
[[Node: model/save/Assign_8 = Assign[T=DT_FLOAT, _class=["loc:@model/resnet_model/batch_normalization_10/beta"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](model/resnet_model/batch_normalization_10/beta, model/save/RestoreV2/_61)]]
[[Node: model/save/RestoreV2/_154 = _Send[T=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_116_model/save/RestoreV2", _device="/job:localhost/replica:0/task:0/device:CPU:0"](model/save/RestoreV2:55)]]
2) I also tried training the model from scratch using the uniform or nonuniform learner directly, but it does not converge. Must I train a full-precision model first and then quantize and fine-tune from that?
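As a side note on the restore error in 1): an Assign shape mismatch of [256] vs. [128] on a batch-normalization variable usually means the graph was built at a different network width than the checkpoint, e.g. the default resnet_size (18 in the logs further down this thread) versus the ResNet-50 checkpoint. A quick thing to verify, assuming the checkpoint was extracted into ./models, is that the size flag is passed explicitly:
$ ./scripts/run_local.sh nets/resnet_at_ilsvrc12_run.py --learner uniform --resnet_size 50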
@hzhyhx1117 @jiaxiang-wu @haolibai @miloyip @kirozhao Hi, when I train a full-precision model, the loss does not decline. Could you offer any help?
@hzhyhx1117 What is your batch size for training? It is recommended to restore a pre-trained model and fine-tune with quantization constraints, rather than training from scratch.
@huxianer A full-precision model? What learner are you using?
P.S.: Please do not
@jiaxiang-wu Batch size is 64 or 128. Actually, the model does converge, but the top-5 accuracy is less than 80%, sometimes even less than 75%. How can I make the top-5 accuracy higher?
@hzhyhx1117 It seems that the batch size does not match the pre-trained model. Does neither 64 nor 128 work? Can you post the error messages for batch sizes 64 and 128?
@jiaxiang-wu Hi, how do I train with a pre-trained model in full_prec? The guide does not say.
@huxianer Which model are you training? We have already provided several pre-trained models, and if your model is included, you do not need to train it yourself: https://api.ai.tencent.com/pocketflow/list.html
If you do need to train a full-precision model by yourself, use:
$ ./scripts/run_local.sh nets/resnet_at_cifar10_run.py --learner full-prec
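If you instead use one of the pre-trained models from the list above, the usual pattern (an assumption based on the default --save_path of ./models/model.ckpt) is to extract the archive so the checkpoint files land under ./models:
# hypothetical example with the CIFAR-10 ResNet-20 archive mentioned below
$ tar -xzvf models_resnet_20_at_cifar_10.tar.gz
$ ls models/  # should contain model.ckpt.* files matching --save_path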
@jiaxiang-wu Yes, I have used this command, and my pre-trained model is under ./models, but it still trains from epoch 0 and the accuracy improves very slowly. I think something is wrong.
@huxianer Please post the full log (for training with UniformQuantTFLearner, not FullPrecLearner).
@jiaxiang-wu I think you don't get my meaning. I want to get a better model, starting from the model you offer. How can I fine-tune with it?
@huxianer Which model and dataset are you fine-tuning with? With or without compression?
@jiaxiang-wu models_resnet_20_at_cifar_10.tar.gz, which you offer. I want to fine-tune it with full_prec, then do other things next.
@huxianer models_resnet_20_at_cifar_10.tar.gz is already a pre-trained model, so why do you need to fine-tune it on the same dataset, without any compression?
@jiaxiang-wu Yes, I want to do that. How can I fine-tune a pre-trained model with this framework? Thanks.
@huxianer Sorry, we do not support this feature (fine-tune a pre-trained model on the same dataset, without any model compression methods).
@jiaxiang-wu Emmm... the point is how to get higher top-5 accuracy. I only get about 60% top-5 accuracy with the uniform-quantization learner, and around 75% with the chn-pruned-gpu learner, both starting from a pre-trained model. Any suggestions for making the top-5 accuracy higher?
@hzhyhx1117 What are the configurations for the uniform quantization learner and channel-prune learner?
@haolibai For chn-pruned-gpu, the command is:
sh scripts/run_local.sh nets/resnet_at_ilsvrc12_run.py --learner chn-pruned-gpu --resnet_size 50 --dcp_nb_stages 4
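A note on the flags: judging from the flag families in the dump below, dcp_nb_stages appears to belong to the dis-chn-pruned learner, while chn-pruned-gpu reads the cpg_* flags, so the pruning strength here comes from the default cpg_prune_ratio of 0.5. A gentler setting, as a sketch, would be:
sh scripts/run_local.sh nets/resnet_at_ilsvrc12_run.py --learner chn-pruned-gpu --resnet_size 50 --cpg_prune_ratio 0.25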
The params are:
INFO:tensorflow:FLAGS:
INFO:tensorflow:data_disk: local
INFO:tensorflow:data_hdfs_host: None
INFO:tensorflow:data_dir_local: /mnt/kubernetes/ImageNet
INFO:tensorflow:data_dir_hdfs: None
INFO:tensorflow:cycle_length: 4
INFO:tensorflow:nb_threads: 8
INFO:tensorflow:buffer_size: 1024
INFO:tensorflow:prefetch_size: 8
INFO:tensorflow:nb_classes: 1001
INFO:tensorflow:nb_smpls_train: 1281167
INFO:tensorflow:nb_smpls_val: 10000
INFO:tensorflow:nb_smpls_eval: 50000
INFO:tensorflow:batch_size: 64
INFO:tensorflow:batch_size_eval: 100
INFO:tensorflow:resnet_size: 50
INFO:tensorflow:nb_epochs_rat: 1.0
INFO:tensorflow:lrn_rate_init: 0.1
INFO:tensorflow:batch_size_norm: 256.0
INFO:tensorflow:momentum: 0.9
INFO:tensorflow:loss_w_dcy: 0.0001
INFO:tensorflow:model_http_url: https://api.ai.tencent.com/pocketflow
INFO:tensorflow:summ_step: 100
INFO:tensorflow:save_step: 10000
INFO:tensorflow:save_path: ./models/model.ckpt
INFO:tensorflow:save_path_eval: ./models_eval/model.ckpt
INFO:tensorflow:enbl_dst: False
INFO:tensorflow:enbl_warm_start: False
INFO:tensorflow:loss_w_dst: 4.0
INFO:tensorflow:tempr_dst: 4.0
INFO:tensorflow:save_path_dst: ./models_dst/model.ckpt
INFO:tensorflow:ddpg_actor_depth: 2
INFO:tensorflow:ddpg_actor_width: 64
INFO:tensorflow:ddpg_critic_depth: 2
INFO:tensorflow:ddpg_critic_width: 64
INFO:tensorflow:ddpg_noise_type: param
INFO:tensorflow:ddpg_noise_prtl: tdecy
INFO:tensorflow:ddpg_noise_std_init: 1.0
INFO:tensorflow:ddpg_noise_dst_finl: 0.01
INFO:tensorflow:ddpg_noise_adpt_rat: 1.03
INFO:tensorflow:ddpg_noise_std_finl: 1e-05
INFO:tensorflow:ddpg_rms_eps: 0.0001
INFO:tensorflow:ddpg_tau: 0.01
INFO:tensorflow:ddpg_gamma: 0.9
INFO:tensorflow:ddpg_lrn_rate: 0.001
INFO:tensorflow:ddpg_loss_w_dcy: 0.0
INFO:tensorflow:ddpg_record_step: 1
INFO:tensorflow:ddpg_batch_size: 64
INFO:tensorflow:ddpg_enbl_bsln_func: True
INFO:tensorflow:ddpg_bsln_decy_rate: 0.95
INFO:tensorflow:ws_save_path: ./models_ws/model.ckpt
INFO:tensorflow:ws_prune_ratio: 0.75
INFO:tensorflow:ws_prune_ratio_prtl: optimal
INFO:tensorflow:ws_nb_rlouts: 200
INFO:tensorflow:ws_nb_rlouts_min: 50
INFO:tensorflow:ws_reward_type: single-obj
INFO:tensorflow:ws_lrn_rate_rg: 0.03
INFO:tensorflow:ws_nb_iters_rg: 20
INFO:tensorflow:ws_lrn_rate_ft: 0.0003
INFO:tensorflow:ws_nb_iters_ft: 400
INFO:tensorflow:ws_nb_iters_feval: 25
INFO:tensorflow:ws_prune_ratio_exp: 3.0
INFO:tensorflow:ws_iter_ratio_beg: 0.1
INFO:tensorflow:ws_iter_ratio_end: 0.5
INFO:tensorflow:ws_mask_update_step: 500.0
INFO:tensorflow:cp_lasso: True
INFO:tensorflow:cp_quadruple: False
INFO:tensorflow:cp_reward_policy: accuracy
INFO:tensorflow:cp_nb_points_per_layer: 10
INFO:tensorflow:cp_nb_batches: 30
INFO:tensorflow:cp_prune_option: auto
INFO:tensorflow:cp_prune_list_file: ratio.list
INFO:tensorflow:cp_channel_pruned_path: ./models/pruned_model.ckpt
INFO:tensorflow:cp_best_path: ./models/best_model.ckpt
INFO:tensorflow:cp_original_path: ./models/original_model.ckpt
INFO:tensorflow:cp_preserve_ratio: 0.5
INFO:tensorflow:cp_uniform_preserve_ratio: 0.6
INFO:tensorflow:cp_noise_tolerance: 0.15
INFO:tensorflow:cp_lrn_rate_ft: 0.0001
INFO:tensorflow:cp_nb_iters_ft_ratio: 0.2
INFO:tensorflow:cp_finetune: False
INFO:tensorflow:cp_retrain: False
INFO:tensorflow:cp_list_group: 1000
INFO:tensorflow:cp_nb_rlouts: 200
INFO:tensorflow:cp_nb_rlouts_min: 50
INFO:tensorflow:cpg_save_path: ./models_cpg/model.ckpt
INFO:tensorflow:cpg_save_path_eval: ./models_cpg_eval/model.ckpt
INFO:tensorflow:cpg_prune_ratio_type: uniform
INFO:tensorflow:cpg_prune_ratio: 0.5
INFO:tensorflow:cpg_skip_ht_layers: True
INFO:tensorflow:cpg_prune_ratio_file: None
INFO:tensorflow:cpg_lrn_rate_pgd_init: 1e-10
INFO:tensorflow:cpg_lrn_rate_pgd_incr: 1.4
INFO:tensorflow:cpg_lrn_rate_pgd_decr: 0.7
INFO:tensorflow:cpg_lrn_rate_adam: 0.01
INFO:tensorflow:cpg_nb_iters_layer: 1000
INFO:tensorflow:dcp_save_path: ./models_dcp/model.ckpt
INFO:tensorflow:dcp_save_path_eval: ./models_dcp_eval/model.ckpt
INFO:tensorflow:dcp_prune_ratio: 0.5
INFO:tensorflow:dcp_nb_stages: 4
INFO:tensorflow:dcp_lrn_rate_adam: 0.001
INFO:tensorflow:dcp_nb_iters_block: 10000
INFO:tensorflow:dcp_nb_iters_layer: 500
INFO:tensorflow:uql_equivalent_bits: 4
INFO:tensorflow:uql_nb_rlouts: 200
INFO:tensorflow:uql_w_bit_min: 2
INFO:tensorflow:uql_w_bit_max: 8
INFO:tensorflow:uql_tune_layerwise_steps: 100
INFO:tensorflow:uql_tune_global_steps: 2000
INFO:tensorflow:uql_tune_save_path: ./rl_tune_models/model.ckpt
INFO:tensorflow:uql_tune_disp_steps: 300
INFO:tensorflow:uql_enbl_random_layers: True
INFO:tensorflow:uql_enbl_rl_agent: False
INFO:tensorflow:uql_enbl_rl_global_tune: True
INFO:tensorflow:uql_enbl_rl_layerwise_tune: False
INFO:tensorflow:uql_weight_bits: 4
INFO:tensorflow:uql_activation_bits: 32
INFO:tensorflow:uql_use_buckets: False
INFO:tensorflow:uql_bucket_size: 256
INFO:tensorflow:uql_quant_epochs: 60
INFO:tensorflow:uql_save_quant_model_path: ./uql_quant_models/uql_quant_model.ckpt
INFO:tensorflow:uql_quantize_all_layers: False
INFO:tensorflow:uql_bucket_type: channel
INFO:tensorflow:uqtf_save_path: ./models_uqtf/model.ckpt
INFO:tensorflow:uqtf_save_path_eval: ./models_uqtf_eval/model.ckpt
INFO:tensorflow:uqtf_weight_bits: 8
INFO:tensorflow:uqtf_activation_bits: 8
INFO:tensorflow:uqtf_quant_delay: 0
INFO:tensorflow:uqtf_freeze_bn_delay: None
INFO:tensorflow:uqtf_lrn_rate_dcy: 0.01
INFO:tensorflow:nuql_equivalent_bits: 4
INFO:tensorflow:nuql_nb_rlouts: 200
INFO:tensorflow:nuql_w_bit_min: 2
INFO:tensorflow:nuql_w_bit_max: 8
INFO:tensorflow:nuql_tune_layerwise_steps: 100
INFO:tensorflow:nuql_tune_global_steps: 2101
INFO:tensorflow:nuql_tune_save_path: ./rl_tune_models/model.ckpt
INFO:tensorflow:nuql_tune_disp_steps: 300
INFO:tensorflow:nuql_enbl_random_layers: True
INFO:tensorflow:nuql_enbl_rl_agent: False
INFO:tensorflow:nuql_enbl_rl_global_tune: True
INFO:tensorflow:nuql_enbl_rl_layerwise_tune: False
INFO:tensorflow:nuql_init_style: quantile
INFO:tensorflow:nuql_opt_mode: weights
INFO:tensorflow:nuql_weight_bits: 4
INFO:tensorflow:nuql_activation_bits: 32
INFO:tensorflow:nuql_use_buckets: False
INFO:tensorflow:nuql_bucket_size: 256
INFO:tensorflow:nuql_quant_epochs: 60
INFO:tensorflow:nuql_save_quant_model_path: ./nuql_quant_models/model.ckpt
INFO:tensorflow:nuql_quantize_all_layers: False
INFO:tensorflow:nuql_bucket_type: split
INFO:tensorflow:log_dir: ./logs
INFO:tensorflow:enbl_multi_gpu: False
INFO:tensorflow:learner: chn-pruned-gpu
INFO:tensorflow:exec_mode: train
INFO:tensorflow:debug: False
INFO:tensorflow:h: False
INFO:tensorflow:help: False
INFO:tensorflow:helpfull: False
INFO:tensorflow:helpshort: False
And the result is 75.6% top-5 accuracy after 1,000,000 iterations.
For non-uniform, the command is:
sh ./scripts/run_local.sh nets/resnet_at_ilsvrc12_run.py --learner=non-uniform --uql_weight_bits=8 --uql_activation_bits=8 --uql_use_buckets=True --uql_bucket_type=channel resnet_size=50
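Note that resnet_size=50 is missing the leading --, so it is not parsed as a flag and the size falls back to the default of 18 (the dump below indeed shows resnet_size: 18). Also, judging from the flag families, the non-uniform learner reads the nuql_* flags rather than the uql_* ones passed here (the dump still shows nuql_weight_bits: 4 and nuql_activation_bits: 32). The intended invocation was presumably:
sh ./scripts/run_local.sh nets/resnet_at_ilsvrc12_run.py --learner=non-uniform --nuql_weight_bits=8 --nuql_activation_bits=8 --nuql_use_buckets=True --nuql_bucket_type=channel --resnet_size=50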
The params are:
INFO:tensorflow:FLAGS:
INFO:tensorflow:data_disk: local
INFO:tensorflow:data_hdfs_host: None
INFO:tensorflow:data_dir_local: /mnt/kubernetes/ImageNet
INFO:tensorflow:data_dir_hdfs: None
INFO:tensorflow:cycle_length: 4
INFO:tensorflow:nb_threads: 8
INFO:tensorflow:buffer_size: 1024
INFO:tensorflow:prefetch_size: 8
INFO:tensorflow:nb_classes: 1001
INFO:tensorflow:nb_smpls_train: 1281167
INFO:tensorflow:nb_smpls_val: 10000
INFO:tensorflow:nb_smpls_eval: 50000
INFO:tensorflow:batch_size: 64
INFO:tensorflow:batch_size_eval: 100
INFO:tensorflow:resnet_size: 18
INFO:tensorflow:nb_epochs_rat: 1.0
INFO:tensorflow:lrn_rate_init: 0.1
INFO:tensorflow:batch_size_norm: 256.0
INFO:tensorflow:momentum: 0.9
INFO:tensorflow:loss_w_dcy: 0.0001
INFO:tensorflow:model_http_url: https://api.ai.tencent.com/pocketflow
INFO:tensorflow:summ_step: 100
INFO:tensorflow:save_step: 10000
INFO:tensorflow:save_path: ./models/model.ckpt
INFO:tensorflow:save_path_eval: ./models_eval/model.ckpt
INFO:tensorflow:enbl_dst: False
INFO:tensorflow:enbl_warm_start: False
INFO:tensorflow:loss_w_dst: 4.0
INFO:tensorflow:tempr_dst: 4.0
INFO:tensorflow:save_path_dst: ./models_dst/model.ckpt
INFO:tensorflow:ddpg_actor_depth: 2
INFO:tensorflow:ddpg_actor_width: 64
INFO:tensorflow:ddpg_critic_depth: 2
INFO:tensorflow:ddpg_critic_width: 64
INFO:tensorflow:ddpg_noise_type: param
INFO:tensorflow:ddpg_noise_prtl: tdecy
INFO:tensorflow:ddpg_noise_std_init: 1.0
INFO:tensorflow:ddpg_noise_dst_finl: 0.01
INFO:tensorflow:ddpg_noise_adpt_rat: 1.03
INFO:tensorflow:ddpg_noise_std_finl: 1e-05
INFO:tensorflow:ddpg_rms_eps: 0.0001
INFO:tensorflow:ddpg_tau: 0.01
INFO:tensorflow:ddpg_gamma: 0.9
INFO:tensorflow:ddpg_lrn_rate: 0.001
INFO:tensorflow:ddpg_loss_w_dcy: 0.0
INFO:tensorflow:ddpg_record_step: 1
INFO:tensorflow:ddpg_batch_size: 64
INFO:tensorflow:ddpg_enbl_bsln_func: True
INFO:tensorflow:ddpg_bsln_decy_rate: 0.95
INFO:tensorflow:ws_save_path: ./models_ws/model.ckpt
INFO:tensorflow:ws_prune_ratio: 0.75
INFO:tensorflow:ws_prune_ratio_prtl: optimal
INFO:tensorflow:ws_nb_rlouts: 200
INFO:tensorflow:ws_nb_rlouts_min: 50
INFO:tensorflow:ws_reward_type: single-obj
INFO:tensorflow:ws_lrn_rate_rg: 0.03
INFO:tensorflow:ws_nb_iters_rg: 20
INFO:tensorflow:ws_lrn_rate_ft: 0.0003
INFO:tensorflow:ws_nb_iters_ft: 400
INFO:tensorflow:ws_nb_iters_feval: 25
INFO:tensorflow:ws_prune_ratio_exp: 3.0
INFO:tensorflow:ws_iter_ratio_beg: 0.1
INFO:tensorflow:ws_iter_ratio_end: 0.5
INFO:tensorflow:ws_mask_update_step: 500.0
INFO:tensorflow:cp_lasso: True
INFO:tensorflow:cp_quadruple: False
INFO:tensorflow:cp_reward_policy: accuracy
INFO:tensorflow:cp_nb_points_per_layer: 10
INFO:tensorflow:cp_nb_batches: 30
INFO:tensorflow:cp_prune_option: auto
INFO:tensorflow:cp_prune_list_file: ratio.list
INFO:tensorflow:cp_channel_pruned_path: ./models/pruned_model.ckpt
INFO:tensorflow:cp_best_path: ./models/best_model.ckpt
INFO:tensorflow:cp_original_path: ./models/original_model.ckpt
INFO:tensorflow:cp_preserve_ratio: 0.5
INFO:tensorflow:cp_uniform_preserve_ratio: 0.6
INFO:tensorflow:cp_noise_tolerance: 0.15
INFO:tensorflow:cp_lrn_rate_ft: 0.0001
INFO:tensorflow:cp_nb_iters_ft_ratio: 0.2
INFO:tensorflow:cp_finetune: False
INFO:tensorflow:cp_retrain: False
INFO:tensorflow:cp_list_group: 1000
INFO:tensorflow:cp_nb_rlouts: 200
INFO:tensorflow:cp_nb_rlouts_min: 50
INFO:tensorflow:cpg_save_path: ./models_cpg/model.ckpt
INFO:tensorflow:cpg_save_path_eval: ./models_cpg_eval/model.ckpt
INFO:tensorflow:cpg_prune_ratio_type: uniform
INFO:tensorflow:cpg_prune_ratio: 0.5
INFO:tensorflow:cpg_skip_ht_layers: True
INFO:tensorflow:cpg_prune_ratio_file: None
INFO:tensorflow:cpg_lrn_rate_pgd_init: 1e-10
INFO:tensorflow:cpg_lrn_rate_pgd_incr: 1.4
INFO:tensorflow:cpg_lrn_rate_pgd_decr: 0.7
INFO:tensorflow:cpg_lrn_rate_adam: 0.01
INFO:tensorflow:cpg_nb_iters_layer: 1000
INFO:tensorflow:dcp_save_path: ./models_dcp/model.ckpt
INFO:tensorflow:dcp_save_path_eval: ./models_dcp_eval/model.ckpt
INFO:tensorflow:dcp_prune_ratio: 0.5
INFO:tensorflow:dcp_nb_stages: 3
INFO:tensorflow:dcp_lrn_rate_adam: 0.001
INFO:tensorflow:dcp_nb_iters_block: 10000
INFO:tensorflow:dcp_nb_iters_layer: 500
INFO:tensorflow:uql_equivalent_bits: 4
INFO:tensorflow:uql_nb_rlouts: 200
INFO:tensorflow:uql_w_bit_min: 2
INFO:tensorflow:uql_w_bit_max: 8
INFO:tensorflow:uql_tune_layerwise_steps: 100
INFO:tensorflow:uql_tune_global_steps: 2000
INFO:tensorflow:uql_tune_save_path: ./rl_tune_models/model.ckpt
INFO:tensorflow:uql_tune_disp_steps: 300
INFO:tensorflow:uql_enbl_random_layers: True
INFO:tensorflow:uql_enbl_rl_agent: False
INFO:tensorflow:uql_enbl_rl_global_tune: True
INFO:tensorflow:uql_enbl_rl_layerwise_tune: False
INFO:tensorflow:uql_weight_bits: 8
INFO:tensorflow:uql_activation_bits: 8
INFO:tensorflow:uql_use_buckets: True
INFO:tensorflow:uql_bucket_size: 256
INFO:tensorflow:uql_quant_epochs: 60
INFO:tensorflow:uql_save_quant_model_path: ./uql_quant_models/uql_quant_model.ckpt
INFO:tensorflow:uql_quantize_all_layers: False
INFO:tensorflow:uql_bucket_type: channel
INFO:tensorflow:uqtf_save_path: ./models_uqtf/model.ckpt
INFO:tensorflow:uqtf_save_path_eval: ./models_uqtf_eval/model.ckpt
INFO:tensorflow:uqtf_weight_bits: 8
INFO:tensorflow:uqtf_activation_bits: 8
INFO:tensorflow:uqtf_quant_delay: 0
INFO:tensorflow:uqtf_freeze_bn_delay: None
INFO:tensorflow:uqtf_lrn_rate_dcy: 0.01
INFO:tensorflow:nuql_equivalent_bits: 4
INFO:tensorflow:nuql_nb_rlouts: 200
INFO:tensorflow:nuql_w_bit_min: 2
INFO:tensorflow:nuql_w_bit_max: 8
INFO:tensorflow:nuql_tune_layerwise_steps: 100
INFO:tensorflow:nuql_tune_global_steps: 2101
INFO:tensorflow:nuql_tune_save_path: ./rl_tune_models/model.ckpt
INFO:tensorflow:nuql_tune_disp_steps: 300
INFO:tensorflow:nuql_enbl_random_layers: True
INFO:tensorflow:nuql_enbl_rl_agent: False
INFO:tensorflow:nuql_enbl_rl_global_tune: True
INFO:tensorflow:nuql_enbl_rl_layerwise_tune: False
INFO:tensorflow:nuql_init_style: quantile
INFO:tensorflow:nuql_opt_mode: weights
INFO:tensorflow:nuql_weight_bits: 4
INFO:tensorflow:nuql_activation_bits: 32
INFO:tensorflow:nuql_use_buckets: False
INFO:tensorflow:nuql_bucket_size: 256
INFO:tensorflow:nuql_quant_epochs: 60
INFO:tensorflow:nuql_save_quant_model_path: ./nuql_quant_models/model.ckpt
INFO:tensorflow:nuql_quantize_all_layers: False
INFO:tensorflow:nuql_bucket_type: split
INFO:tensorflow:log_dir: ./logs
INFO:tensorflow:enbl_multi_gpu: False
INFO:tensorflow:learner: non-uniform
INFO:tensorflow:exec_mode: train
INFO:tensorflow:debug: False
INFO:tensorflow:h: False
INFO:tensorflow:help: False
INFO:tensorflow:helpfull: False
INFO:tensorflow:helpshort: False
The top-5 accuracy is between 60% and 80%.
By the way, do you have any QQ group or WeChat group for discussion or communication?
We have created a QQ group (ID: 827277965) for technical discussions. @hzhyhx1117
I can't get the model to converge with any learner, on either ImageNet or CIFAR-10. Any suggestions about the hyper-parameters? Thanks.
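For what it's worth, the hyper-parameters that matter most for convergence here are the ones already visible in the dumps above: lrn_rate_init (apparently normalized against batch_size_norm), batch_size, and nb_epochs_rat. A hedged starting point, assuming a pre-trained checkpoint already sits under ./models, might be:
# sketch only: fine-tune from a pre-trained model with a reduced initial learning rate
$ ./scripts/run_local.sh nets/resnet_at_ilsvrc12_run.py --learner uniform --resnet_size 50 --batch_size 64 --lrn_rate_init 0.01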