Open jerrydyc opened 1 year ago
Please refer to https://github.com/huawei-noah/bolt/blob/master/docs/QUANTIZATION.md; there may be some mismatch between the GitBook and the Markdown docs.
Thank you! I got the int8 dynamically quantized model, but I get the error below during inference. The model is fully convolutional, without any Linear op. Could you tell me how to resolve this error?
```
./benchmark -m model_int8_q.bolt
option is -m <boltModelPath>, value is: model_int8_q.bolt
[ERROR] thread 8069: can not create layer Quantize_/backbone/backbone/dark2/dark2.0/dconv/conv/Conv_output_09 type:OT_QuantizeLinear.
```
Maybe your build did not enable int8 acceleration; please add --int8=on to install.sh.
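For reference, the rebuild suggested above might look like the sketch below. Only the `--int8=on` flag is quoted from this thread; any other install.sh options (build target, parallelism, etc.) are not shown here and should be taken from your existing build command.

```shell
# Rebuild Bolt with int8 acceleration enabled, then retry the benchmark.
# --int8=on is the flag quoted in the reply above; keep your other install.sh
# options (e.g. the linux-x86_64_avx512 target) the same as in your original build.
./install.sh --int8=on

# Re-run inference on the quantized model with the int8-enabled build.
./benchmark -m model_int8_q.bolt
```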
Hello, I built Bolt (tag: v1.5.1) for the linux-x86_64_avx512 target and converted an ONNX model to a PTQ-ready model with X2bolt. Then I tried post_training_quantization to quantize it to int8 precision, following the doc at https://huawei-noah.github.io/bolt/docs/QUANTIZATION.html. With the procedure below, I get the model model_f32.bolt, but I can't get model_int8_q.bolt. Am I missing something? How do I quantize to int8 and run an inference test? Thanks.
```
./post_training_quantization -p model.bolt
[INFO] thread 30247: environment variable BOLT_INT8_STORAGE_ERROR_THRESHOLD: 99999.000000
[INFO] thread 30247: Write bolt model to model_f32.bolt.
Post Training Quantization Succeeded!

./post_training_quantization -V -p model.bolt -i INT8_FP16
option is -i [inferencePrecision], value is: INT8_FP32
[ERROR] thread 30243: The inferPrecision is Not Supported

./post_training_quantization -V -p model.bolt -i INT8_FP16
option is -i [inferencePrecision], value is: INT8_FP16
[ERROR] thread 30244: The inferPrecision is Not Supported
```