alibaba / MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
http://www.mnn.zone/

Error in offline int8 quantization of yolov8n model from ultralytics #2919

Open Deephome opened 1 week ago

Deephome commented 1 week ago

Platform (include target platform as well if cross-compiling):

aarch64, ubuntu20.04

GitHub version:

commit a980dba3963efb0ad76b0f3caaf5c21556f69ffe (HEAD -> master, origin/master, origin/HEAD)
Merge: 226f1bc1 1924cc17
Author: jxt1234 <jxt1234@zju.edu.cn>
Date:   Sat Jun 15 16:22:48 2024 +0800

Compiling Method

cmake -DMNN_USE_OPENCV=ON -DMNN_IMGCODECS=ON -DMNN_BUILD_TOOL=ON -DMNN_BUILD_BENCHMARK=ON -DMNN_BUILD_CONVERTER=ON -DMNN_BUILD_QUANTOOLS=ON ..

Issue

I first use ultralytics to export yolov8n.pt to the ONNX model yolov8n.onnx:

from ultralytics import YOLO
model = YOLO("yolov8n.pt")
model.export(format="onnx")

then convert it to an MNN model as follows:

MNNConvert -f ONNX --modelFile ./yolov8n.onnx --MNNModel yolov8n.mnn --bizCode biz --keepInputFormat

The yolov8n.mnn works well using mnn-yolo.

However, an error occurs when I quantize yolov8n.mnn using the command:

../MNN/build/quantized.out ./checkpoints/yolov8n.mnn ./checkpoints/yolov8n_quant.mnn ./data/yolov8n_quant.json

with yolov8n_quant.json:

{
    "format":"RGB",
    "mean": [
        0.0,
        0.0,
        0.0
    ],
    "normal": [
        0.003921,
        0.003921,
        0.003921
    ],
    "width":640,
    "height":640,
    "path":"/home/nvidia/Documents/mnn-yolo/data/coco",
    "used_image_num":32,
    "feature_quantize_method": "KL",
    "weight_quantize_method":"MAX_ABS",
    "model":"../checkpoints/yolov8n.mnn",
    "debug": false
}
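For context, my understanding of how the quantization tool applies "mean" and "normal" (a sketch of the convention as I read MNN's preprocessing, not MNN source): each pixel is mapped as dst = (src - mean) * normal, so mean 0 and normal 0.003921 ≈ 1/255 scale RGB values into [0, 1], matching ultralytics' YOLOv8 preprocessing:

import numpy as np

# Values from the yolov8n_quant.json above.
mean = np.array([0.0, 0.0, 0.0], dtype=np.float32)
normal = np.array([0.003921, 0.003921, 0.003921], dtype=np.float32)

def preprocess(rgb_u8):
    # rgb_u8: HxWx3 uint8 image, already resized to 640x640.
    # Per-channel affine transform, assumed equivalent to what the
    # calibration tool does internally: (src - mean) * normal.
    x = (rgb_u8.astype(np.float32) - mean) * normal
    return x.transpose(2, 0, 1)[None]  # HWC -> NCHW with batch dim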

The output:

[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1278: >>> modelFile: ./checkpoints/yolov8n.mnn
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1279: >>> preTreatConfig: ./data/yolov8n_quant.json
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1280: >>> dstFile: ./checkpoints/yolov8n_quant_1.mnn
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1308: Calibrate the feature and quantize model...
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:159: Use feature quantization method: KL
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:160: Use weight quantization method: MAX_ABS
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:180: feature_clamp_value: 127
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:181: weight_clamp_value: 127
The device support i8sdot:1, support fp16:1, support i8mm: 0
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/Helper.cpp:111: used image num: 32
[10:30:33] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:666: fake quant weights done.
ComputeFeatureRange: 100.00 %
CollectFeatureDistribution: 100.00 %
Can't find extraTensorDescribe for 427
[10:30:58] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1316: Quantize model done!

It seems the model is quantized successfully, but the inference results of yolov8n_quant.mnn are totally wrong. Then I set "debug" to true in the JSON file, and the output becomes:

[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1278: >>> modelFile: ./checkpoints/yolov8n.mnn
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1279: >>> preTreatConfig: ./data/yolov8n_quant.json
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1280: >>> dstFile: ./checkpoints/yolov8n_quant_1.mnn
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:1308: Calibrate the feature and quantize model...
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:159: Use feature quantization method: KL
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:160: Use weight quantization method: MAX_ABS
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:180: feature_clamp_value: 127
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:181: weight_clamp_value: 127
The device support i8sdot:1, support fp16:1, support i8mm: 0
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/Helper.cpp:111: used image num: 32
[10:39:47] /home/nvidia/Documents/MNN/tools/quantization/calibration.cpp:666: fake quant weights done.
ComputeFeatureRange: 100.00 %
CollectFeatureDistribution: 100.00 %
[10:40:12] /home/nvidia/Documents/MNN/tools/quantization/TensorStatistic.cpp:331: Check failed: count == fakeQuantedFeature.size() (1638400 vs. 0) feature size error
Segmentation fault (core dumped)

The quantization process failed.

Please give me some suggestions, thank you!
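For reference, here is a minimal sketch (my own, not part of the report) of how one might run the float and quantized models on the same input with MNN's Python API and measure how far the int8 outputs drift; the paths and the 1x3x640x640 NCHW input follow the setup above, everything else is an assumption:

import MNN
import numpy as np

def infer(model_path, img):
    # img: float32 NCHW array, already preprocessed into [0, 1]
    net = MNN.Interpreter(model_path)
    session = net.createSession()
    inp = net.getSessionInput(session)
    tmp = MNN.Tensor((1, 3, 640, 640), MNN.Halide_Type_Float,
                     img, MNN.Tensor_DimensionType_Caffe)
    inp.copyFrom(tmp)
    net.runSession(session)
    out = net.getSessionOutput(session)
    host = MNN.Tensor(out.getShape(), MNN.Halide_Type_Float,
                      np.zeros(out.getShape(), dtype=np.float32),
                      MNN.Tensor_DimensionType_Caffe)
    out.copyToHostTensor(host)
    return host.getNumpyData().copy()  # getNumpyData() in recent pymnn builds

# A real calibration image would be more telling than random data here.
x = np.random.rand(1, 3, 640, 640).astype(np.float32)
ref = infer("./checkpoints/yolov8n.mnn", x)
q = infer("./checkpoints/yolov8n_quant.mnn", x)
print("max abs diff:", np.abs(ref - q).max())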

v0jiuqi commented 1 week ago

The log above already reports the error "Can't find extraTensorDescribe for 427". After resolving…

NOON47 commented 4 days ago

> The log above already reports the error "Can't find extraTensorDescribe for 427". After resolving…

Has this problem been solved?