Open jerrydyc opened 1 year ago
Please refer to https://github.com/huawei-noah/bolt/blob/master/docs/QUANTIZATION.md; there may be some mismatch between the GitBook and the Markdown docs.
Thank you! I got the int8 dynamically quantized model, but I get the error below during inference. The model is fully convolutional, without any Linear op. Could you tell me how to resolve this error?
```
./benchmark -m model_int8_q.bolt
option is -m <boltModelPath>, value is: model_int8_q.bolt
[ERROR] thread 8069: can not create layer Quantize_/backbone/backbone/dark2/dark2.0/dconv/conv/Conv_output_09 type:OT_QuantizeLinear.
```
Maybe your build did not enable int8 acceleration; please add --int8=on to install.sh.
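For reference, the rebuild suggested above might look like the sketch below. Only the `--int8=on` flag is quoted from this thread; any other install.sh options (build target, parallelism, etc.) are not shown here and should be taken from your existing build command.

```shell
# Rebuild Bolt with int8 acceleration enabled, then retry the benchmark.
# --int8=on is the flag quoted in the reply above; keep your other install.sh
# options (e.g. the linux-x86_64_avx512 target) the same as in your original build.
./install.sh --int8=on

# Re-run inference on the quantized model with the int8-enabled build.
./benchmark -m model_int8_q.bolt
```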
Hello, I built Bolt (tag: v1.5.1) for the linux-x86_64_avx512 target and converted an ONNX model to a PTQ-ready model with X2bolt. Then I tried post_training_quantization to quantize it to int8 precision, following the doc at https://huawei-noah.github.io/bolt/docs/QUANTIZATION.html. With the procedure below, I get the model model_f32.bolt, but I can't get model_int8_q.bolt. Am I missing something? How do I quantize to int8 and run an inference test? Thanks.
```
./post_training_quantization -p model.bolt
[INFO] thread 30247: environment variable BOLT_INT8_STORAGE_ERROR_THRESHOLD: 99999.000000
[INFO] thread 30247: Write bolt model to model_f32.bolt.
Post Training Quantization Succeeded!

./post_training_quantization -V -p model.bolt -i INT8_FP16
option is -i [inferencePrecision], value is: INT8_FP32
[ERROR] thread 30243: The inferPrecision is Not Supported

./post_training_quantization -V -p model.bolt -i INT8_FP16
option is -i [inferencePrecision], value is: INT8_FP16
[ERROR] thread 30244: The inferPrecision is Not Supported
```