Open dhingratul opened 5 years ago
@dhingratul We need some details to reproduce your issue.
Got it. We are reproducing your issue.
@dhingratul Sorry, we cannot reproduce your issue.
According to our results (see the benchmark code for inference speed in PR #136), the model_original.pb generated by export_chn_pruned_tflite_model.py costs 3.23ms and the one generated by export_quant_tflite_model.py costs 3.34ms, which are basically the same.
Some notes:
np.random.rand. I use batch size and average over 1000 runs, leaving out the first inference (you use 100), because that time is always inflated by GPU warmup. The difference in inference times could be due to the different GPU architectures. I am more interested in the percent speedup than in the actual numbers. model_dcp_eval/model_original.pb runs at ~5ms, and model_uqtf_eval/model_original.pb runs at ~8ms.
@dhingratul
After changing np.zeros to np.random.rand, I still cannot reproduce your issue. My results are as follows:
model_original.pb (chn-pruned): 3.48ms 3.41ms 3.27ms 3.26ms
model_original.pb (quant): 3.47ms 3.41ms 3.38ms 3.24ms
P.S.: I am using a P40 GPU.
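For reference, the timing procedure discussed in this thread (average over many runs, discard the first inference because of GPU warmup, then compare by percent speedup) can be sketched roughly as below. This is only an illustration, not the benchmark code from PR #136: `benchmark`, `percent_speedup`, and the `run_once` callable are hypothetical names, and `run_once` is assumed to wrap a single inference call on a fixed input.

```python
import time
from statistics import mean


def benchmark(run_once, n_runs=1000, n_warmup=1):
    """Mean latency of run_once() in milliseconds.

    run_once is a hypothetical zero-argument callable that performs one
    inference. The first n_warmup calls are executed but not timed, since
    the first inference is typically inflated by GPU warmup.
    """
    for _ in range(n_warmup):
        run_once()

    timings_ms = []
    for _ in range(n_runs):
        start = time.perf_counter()
        run_once()
        timings_ms.append((time.perf_counter() - start) * 1e3)
    return mean(timings_ms)


def percent_speedup(baseline_ms, optimized_ms):
    """Percent reduction in latency relative to the baseline."""
    return 100.0 * (baseline_ms - optimized_ms) / baseline_ms
```

A percent speedup computed this way is only comparable across experiments if both use the same baseline latency, which is why the two model_original.pb files are expected to run at about the same speed.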
Can you post your *.pb model files, so I can test on them?
@jiaxiang-wu
DCP models: https://drive.google.com/open?id=1NijcwZ-Cwd-Nqa73E2D5nTL_X2yhB32a
UQTF model: https://drive.google.com/open?id=1LIYaJZclwBllEThoWZScj23Sq4_LkUxx
Thanks a lot. We are looking into this issue.
export_quant_tflite_model.py to generate *.pb file, the --model_file is from models_eval (full_prec_model) and models_uqtf_eval (quant_model).
Describe the bug
I ran two distinct experiments on the same ResNet model, one with uniform quantization and one with channel pruning. However, the two optimizations produced different model_original baselines, against which the speedup is measured. The one from uniform quantization runs at 25ms, and the one from channel pruning at 20ms. How are you measuring the baseline?