airockchip / rknn_model_zoo

Conversion to RKNN Format Alters YOLOv8 Model Inference Results #111

Closed: nerbivol closed this issue 6 months ago

nerbivol commented 6 months ago

Description:

When running the YOLOv8 model in ONNX format as demonstrated in the original repository, the model performs well and yields satisfactory results, as shown in the attached result image.

However, after converting the ONNX model to RKNN format and running inference with the provided script, the results are noticeably worse, as shown in the attached out image.

Steps to Reproduce:

  1. Clone the repository: https://github.com/airockchip/rknn_model_zoo
  2. Navigate to the YOLOv8 example: /examples/yolov8
  3. Follow the conversion instructions to convert the ONNX model to RKNN format (see the conversion sketch after the output below).
  4. Execute the provided script with an input image.

    Example:

    root@orangepi5b:/datav8# ./rknn_yolov8_demo model/yolov8.rknn model/bus.jpg

    Output:

load lable ./model/coco_80_labels_list.txt
model input num: 1, output num: 9
input tensors:
  index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
output tensors:
  index=0, name=318, n_dims=4, dims=[1, 64, 80, 80], n_elems=409600, size=409600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-58, scale=0.117659
  index=1, name=onnx::ReduceSum_326, n_dims=4, dims=[1, 80, 80, 80], n_elems=512000, size=512000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003104
  index=2, name=331, n_dims=4, dims=[1, 1, 80, 80], n_elems=6400, size=6400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003173
  index=3, name=338, n_dims=4, dims=[1, 64, 40, 40], n_elems=102400, size=102400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-45, scale=0.093747
  index=4, name=onnx::ReduceSum_346, n_dims=4, dims=[1, 80, 40, 40], n_elems=128000, size=128000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003594
  index=5, name=350, n_dims=4, dims=[1, 1, 40, 40], n_elems=1600, size=1600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003627
  index=6, name=357, n_dims=4, dims=[1, 64, 20, 20], n_elems=25600, size=25600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-34, scale=0.083036
  index=7, name=onnx::ReduceSum_365, n_dims=4, dims=[1, 80, 20, 20], n_elems=32000, size=32000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003874
  index=8, name=369, n_dims=4, dims=[1, 1, 20, 20], n_elems=400, size=400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
model is NHWC input fmt
model input height=640, width=640, channel=3
origin size=640x640 crop size=640x640
input image: 640 x 640, subsampling: 4:2:0, colorspace: YCbCr, orientation: 1
scale=1.000000 dst_box=(0 0 639 639) allow_slight_change=1 _left_offset=0 _top_offset=0 padding_w=0 padding_h=0
src width=640 height=640 fmt=0x1 virAddr=0x0x1666cca0 fd=0
dst width=640 height=640 fmt=0x1 virAddr=0x0x16798cb0 fd=0
src_box=(0 0 639 639)
dst_box=(0 0 639 639)
color=0x72
rga_api version 1.10.1_[0]
rknn_run
vase @ (211 291 254 503) 0.255
write_image path: out.png width=640 height=640 channel=3 data=0x1666cca0
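
For reference, a minimal sketch of what the conversion step (step 3) does with RKNN-Toolkit2. The file names, target platform, and calibration list below are assumptions for illustration; the repository's own convert script under examples/yolov8 is the authoritative reference. The mean/std settings match the input tensor reported in the log above (zp=-128, scale=0.003922, i.e. 1/255 scaling).

```python
# Sketch of an ONNX -> RKNN conversion with RKNN-Toolkit2 (paths/platform are placeholders).
from rknn.api import RKNN

rknn = RKNN(verbose=True)

# Preprocessing must match the demo: RGB input scaled by 1/255.
rknn.config(mean_values=[[0, 0, 0]],
            std_values=[[255, 255, 255]],
            target_platform='rk3588')

# Load the exported ONNX model (path is an assumption).
if rknn.load_onnx(model='yolov8n.onnx') != 0:
    raise RuntimeError('load_onnx failed')

# INT8 quantization requires a calibration dataset: a text file listing image paths.
if rknn.build(do_quantization=True, dataset='dataset.txt') != 0:
    raise RuntimeError('build failed')

if rknn.export_rknn('yolov8n.rknn') != 0:
    raise RuntimeError('export_rknn failed')

rknn.release()
```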

Environment Details:

agualbbus commented 6 months ago

I'm encountering another issue, so I haven't been able to reproduce your case yet. However, it seems that converting to RKNN may have quantized the model, resulting in a loss of accuracy. I recommend avoiding quantization, especially when working with YOLOv8n; the m and s variants are likely better suited to quantization.
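
A minimal sketch of a non-quantized conversion, assuming RKNN-Toolkit2; only the build() call differs from the quantized flow, and no calibration dataset is needed.

```python
# Sketch of a float (non-quantized) ONNX -> RKNN conversion; paths/platform are placeholders.
from rknn.api import RKNN

rknn = RKNN()
rknn.config(mean_values=[[0, 0, 0]],
            std_values=[[255, 255, 255]],
            target_platform='rk3588')
rknn.load_onnx(model='yolov8n.onnx')

# Skip INT8 quantization entirely: the model stays in floating point,
# trading some NPU inference speed for accuracy.
rknn.build(do_quantization=False)

rknn.export_rknn('yolov8n_fp.rknn')
rknn.release()
```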