Output of the quantization run (debug not enabled in the json):
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1278: >>> modelFile: ./checkpoints/yolov8n.mnn
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1279: >>> preTreatConfig: ./data/yolov8n_quant.json
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1280: >>> dstFile: ./checkpoints/yolov8n_quant_1.mnn
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1308: Calibrate the feature and quantize model...
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:159: Use feature quantization method: KL
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:160: Use weight quantization method: MAX_ABS
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:180: feature_clamp_value: 127
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:181: weight_clamp_value: 127
The device support i8sdot:1, support fp16:1, support i8mm: 0
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/Helper.cpp:111: used image num: 32
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:666: fake quant weights done.
ComputeFeatureRange: 100.00 %
CollectFeatureDistribution: 100.00 %
Can't find extraTensorDescribe for 427
[10:30:58] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1316: Quantize model done!
It seems the model is quantized successfully, but the inference results of yolov8n_quant.mnn are completely wrong. I then set debug to true in the json file, and the output becomes:
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1278: >>> modelFile: ./checkpoints/yolov8n.mnn
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1279: >>> preTreatConfig: ./data/yolov8n_quant.json
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1280: >>> dstFile: ./checkpoints/yolov8n_quant_1.mnn
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1308: Calibrate the feature and quantize model...
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:159: Use feature quantization method: KL
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:160: Use weight quantization method: MAX_ABS
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:180: feature_clamp_value: 127
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:181: weight_clamp_value: 127
The device support i8sdot:1, support fp16:1, support i8mm: 0
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/Helper.cpp:111: used image num: 32
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:666: fake quant weights done.
ComputeFeatureRange: 100.00 %
CollectFeatureDistribution: 100.00 %
[10:40:12] /home/nvidia/Documents/MNN/tools/quantization/TensorStatistic.cpp:331: Check failed: count == fakeQuantedFeature.size() (1638400 vs. 0) feature size error
Segmentation fault (core dumped)
Platform (include target platform as well if cross-compiling):
aarch64, ubuntu20.04
Github version:
commit a980dba3963efb0ad76b0f3caaf5c21556f69ffe (HEAD -> master, origin/master, origin/HEAD)
Merge: 226f1bc1 1924cc17
Author: jxt1234 <jxt1234@zju.edu.cn>
Date:   Sat Jun 15 16:22:48 2024 +0800
Compiling Method
cmake -DMNN_USE_OPENCV=ON -DMNN_IMGCODECS=ON -DMNN_BUILD_TOOL=ON -DMNN_BUILD_BENCHMARK=ON -DMNN_BUILD_CONVERTER=ON -DMNN_BUILD_QUANTOOLS=ON ..
Issue
I first use ultralytics to convert yolov8n.pt to the ONNX model yolov8n.onnx:
from ultralytics import YOLO
model = YOLO("yolov8n.pt")
model.export(format="onnx")
then convert it to an MNN model as follows:
MNNConvert -f ONNX --modelFile .\yolov8n.onnx --MNNModel yolov8n.mnn --bizCode biz --keepInputFormat
The converted yolov8n.mnn works well with mnn-yolo.
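For context, the float model can also be sanity-checked directly with the MNN Python API, roughly like this (a minimal sketch, not the mnn-yolo code; the 640x640 input size is assumed from the default ultralytics export, and a random input stands in for a real image):

import numpy as np
import MNN

# Load the converted float model and create a session
interpreter = MNN.Interpreter("./checkpoints/yolov8n.mnn")
session = interpreter.createSession()
input_tensor = interpreter.getSessionInput(session)

# Feed a random NCHW input (640x640 assumed from the default ultralytics export)
data = np.random.rand(1, 3, 640, 640).astype(np.float32)
tmp_in = MNN.Tensor((1, 3, 640, 640), MNN.Halide_Type_Float, data, MNN.Tensor_DimensionType_Caffe)
input_tensor.copyFrom(tmp_in)
interpreter.runSession(session)

# Copy the output back to host memory and inspect it
output_tensor = interpreter.getSessionOutput(session)
tmp_out = MNN.Tensor(output_tensor.getShape(), MNN.Halide_Type_Float,
                     np.zeros(output_tensor.getShape(), dtype=np.float32),
                     MNN.Tensor_DimensionType_Caffe)
output_tensor.copyToHostTensor(tmp_out)
print(tmp_out.getShape(), tmp_out.getNumpyData().mean())

Running the same snippet against the quantized model is a quick way to compare the two outputs.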
However, the error occurs when I quantize yolov8n.mnn using the command:
../MNN/build/quantized.out ./checkpoints/yolov8n.mnn ./checkpoints/yolov8n_quant.mnn ./data/yolov8n_quant.json
with yolov8n_quant.json:
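(The json follows the usual quantized.out config layout; the snippet below is a reconstruction for reference: used_image_num, the two quantization methods, and debug match the log above, while the calibration image path, input size, and mean/normal values are placeholders rather than the exact contents of my file.)

{
    "format": "RGB",
    "mean": [0.0, 0.0, 0.0],
    "normal": [0.00392157, 0.00392157, 0.00392157],
    "width": 640,
    "height": 640,
    "path": "./data/calib_images/",
    "used_image_num": 32,
    "feature_quantize_method": "KL",
    "weight_quantize_method": "MAX_ABS",
    "debug": false
}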
the output of this command, and of a second run with debug set to true in the json file, is pasted at the top of this issue; the debug run aborts with the check failure and segmentation fault shown above, so the quantization process fails.
Please give me some suggestions. Thank you!