tusen-ai / simpledet

A Simple and Versatile Framework for Object Detection and Instance Recognition
Apache License 2.0

How to get the int8 model? #313

Closed xiaoyazhu closed 4 years ago

xiaoyazhu commented 4 years ago

Thank you for your work. After training a quantized model, its inference speed is still slow. How can I get a faster int8 model? Should I use TensorRT? Can you describe the next steps in the process?

huangzehao commented 4 years ago

Hi, SimpleDet only provides the simulated quantization method proposed in [1]. After training, you obtain the quantization range of each layer. For faster inference, you should use TensorRT or TVM and supply the per-layer quantization ranges obtained from int8 training.

[1] Jacob, Benoit, et al. "Quantization and training of neural networks for efficient integer-arithmetic-only inference." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
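As a rough illustration, feeding learned per-layer ranges into TensorRT's Python API might look like the sketch below. This is only a sketch under assumptions: the `ranges.json` file (a mapping from tensor name to absolute-max value) and the ONNX model path are hypothetical, not something SimpleDet emits directly, and the tensor names in your exported model must match the keys in the range file.

```python
# Hypothetical sketch: build an int8 TensorRT engine using per-layer
# ranges exported from simulated-quantization training.
# "ranges.json" (tensor name -> absolute max value) is an assumed format.
import json
import tensorrt as trt

LOGGER = trt.Logger(trt.Logger.WARNING)

def build_int8_engine(onnx_path, ranges_path):
    builder = trt.Builder(LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, LOGGER)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.INT8)

    # Set dynamic ranges from training instead of running
    # TensorRT's own int8 calibrator.
    with open(ranges_path) as f:
        ranges = json.load(f)
    for i in range(network.num_layers):
        layer = network.get_layer(i)
        for j in range(layer.num_outputs):
            tensor = layer.get_output(j)
            if tensor.name in ranges:
                r = abs(ranges[tensor.name])
                tensor.dynamic_range = (-r, r)

    return builder.build_engine(network, config)
```

Tensors without an explicit range fall back to TensorRT's own calibration or to fp32, so in practice you want the exported range file to cover every quantized layer.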

xiaoyazhu commented 4 years ago

Is it necessary to convert the MXNet model to ONNX format? I cannot convert it directly. Could you please give me some advice? Thank you so much!

The standard export path would be something like the sketch below (file names and input shape are placeholders); detection models often contain custom or unsupported ops, which is the usual reason a direct conversion fails:
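```python
# Minimal sketch of the standard MXNet -> ONNX export path.
# File names and input shape below are placeholders, not SimpleDet defaults.
import numpy as np
from mxnet.contrib import onnx as onnx_mxnet

onnx_file = onnx_mxnet.export_model(
    sym="model-symbol.json",          # symbol file saved by MXNet
    params="model-0000.params",       # trained weights
    input_shape=[(1, 3, 800, 1333)],  # placeholder input size
    input_type=np.float32,
    onnx_file_path="model.onnx",
)
print("exported:", onnx_file)
```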