airockchip / rknn_model_zoo


Yolov8/rk3588: Custom rknn fails to give output, with output tensors fmt=UNDEFINED #77


openedev commented 9 months ago

Hi,

I'm trying to deploy a custom model, where the .pt is converted to .onnx and the .onnx is converted to .rknn. Both the .pt and the .onnx produce the correct output on the host, but the .rknn doesn't give the expected output on the rk3588 target: the output image is the same as the input.

Here are the detailed steps:

yolov8 $ yolo export model=./best.pt imgsz=640,640 format=onnx opset=12
rknn_model_zoo $ python convert.py ../model/best.onnx rk3588
W __init__: rknn-toolkit2 version: 1.6.0+81f21f4d
--> Config model
done
--> Loading model
W load_onnx: It is recommended onnx opset 19, but your onnx model opset is 12!
W load_onnx: Model converted from pytorch, 'opset_version' should be set 19 in torch.onnx.export for successful convert!
Loading : 100%|███████████████████████████████████████████████| 186/186 [00:00<00:00, 100989.07it/s]
done
--> Building model
W build: found outlier value, this may affect quantization accuracy
const name               abs_mean    abs_std     outlier value
model.0.conv.weight      5.34        7.41        62.182      
GraphPreparing : 100%|██████████████████████████████████████████| 227/227 [00:00<00:00, 1283.46it/s]
Quantizating : 100%|██████████████████████████████████████████████| 227/227 [00:13<00:00, 17.00it/s]
W build: The default input dtype of 'images' is changed from 'float32' to 'int8' in rknn model for performance!
                       Please take care of this change when deploy rknn model with Runtime API!
W build: The default output dtype of 'output0' is changed from 'float32' to 'int8' in rknn model for performance!
                      Please take care of this change when deploy rknn model with Runtime API!
done
--> Export rknn model
done
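
A note on the two `W build` warnings above: since the exported rknn works in int8, any raw output read back through the Runtime API has to be dequantized with that tensor's zp and scale before post-processing. A minimal numpy sketch of the standard affine dequantization (the zp/scale values are the ones the runtime reports for output0 in the log further down):

```python
import numpy as np

def dequantize(q, zp, scale):
    # Affine dequantization: float = (q - zero_point) * scale
    return (q.astype(np.float32) - zp) * scale

# Hypothetical raw int8 buffer read back from the runtime for output0;
# zp=-128 and scale=2.533227 are taken from the runtime log below.
raw = np.zeros((1, 6, 8400), dtype=np.int8)
out = dequantize(raw, zp=-128, scale=2.533227)
```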

On the rk3588 target:

rknn_yolov8_demo$ ./rknn_yolov8_demo /mnt/leaf/yolov8.rknn /mnt/leaf/plant.jpg 
load lable ./model/coco_80_labels_list.txt
model input num: 1, output num: 1
input tensors:
  index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
output tensors:
  index=0, name=output0, n_dims=3, dims=[1, 6, 8400, 0], n_elems=50400, size=50400, fmt=UNDEFINED, type=INT8, qnt_type=AFFINE, zp=-128, scale=2.533227
model is NHWC input fmt
model input height=640, width=640, channel=3
origin size=640x640 crop size=640x640
input image: 640 x 640, subsampling: 4:2:0, colorspace: YCbCr, orientation: 1
scale=1.000000 dst_box=(0 0 639 639) allow_slight_change=1 _left_offset=0 _top_offset=0 padding_w=0 padding_h=0
src width=640 height=640 fmt=0x1 virAddr=0x0xaaaaffff7940 fd=0
dst width=640 height=640 fmt=0x1 virAddr=0x0xaaab00123950 fd=0
src_box=(0 0 639 639)
dst_box=(0 0 639 639)
color=0x72
rga_api version 1.10.0_[2]
rknn_run
write_image path: out.png width=640 height=640 channel=3 data=0xaaaaffff7940

However, the default yolov8 onnx model mentioned in rknn_model_zoo works as expected:

rknn_yolov8_demo$ ./rknn_yolov8_demo /mnt/leaf/yolov8-default.rknn /mnt/leaf/plant.jpg 
load lable ./model/coco_80_labels_list.txt
model input num: 1, output num: 9
input tensors:
  index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
output tensors:
  index=0, name=318, n_dims=4, dims=[1, 64, 80, 80], n_elems=409600, size=409600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-56, scale=0.110522
  index=1, name=onnx::ReduceSum_326, n_dims=4, dims=[1, 80, 80, 80], n_elems=512000, size=512000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003452
  index=2, name=331, n_dims=4, dims=[1, 1, 80, 80], n_elems=6400, size=6400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003482
  index=3, name=338, n_dims=4, dims=[1, 64, 40, 40], n_elems=102400, size=102400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-17, scale=0.098049
  index=4, name=onnx::ReduceSum_346, n_dims=4, dims=[1, 80, 40, 40], n_elems=128000, size=128000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003592
  index=5, name=350, n_dims=4, dims=[1, 1, 40, 40], n_elems=1600, size=1600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003755
  index=6, name=357, n_dims=4, dims=[1, 64, 20, 20], n_elems=25600, size=25600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-49, scale=0.078837
  index=7, name=onnx::ReduceSum_365, n_dims=4, dims=[1, 80, 20, 20], n_elems=32000, size=32000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003817
  index=8, name=369, n_dims=4, dims=[1, 1, 20, 20], n_elems=400, size=400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003835
model is NHWC input fmt
model input height=640, width=640, channel=3
origin size=640x640 crop size=640x640
input image: 640 x 640, subsampling: 4:2:0, colorspace: YCbCr, orientation: 1
scale=1.000000 dst_box=(0 0 639 639) allow_slight_change=1 _left_offset=0 _top_offset=0 padding_w=0 padding_h=0
src width=640 height=640 fmt=0x1 virAddr=0x0xaaab0e0510c0 fd=0
dst width=640 height=640 fmt=0x1 virAddr=0x0xaaab0e17d0d0 fd=0
src_box=(0 0 639 639)
dst_box=(0 0 639 639)
color=0x72
rga_api version 1.10.0_[2]
rknn_run
vase @ (398 371 530 502) 0.801
potted plant @ (320 131 618 503) 0.798
write_image path: out.png width=640 height=640 channel=3 data=0xaaab0e0510c0

The only difference I can see in the non-working model is fmt=UNDEFINED:

output tensors:
  index=0, name=output0, n_dims=3, dims=[1, 6, 8400, 0], n_elems=50400, size=50400, fmt=UNDEFINED, type=INT8, qnt_type=AFFINE, zp=-128, scale=2.533227
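
For reference, output0 here is the stock ultralytics detection head flattened to [1, 4+num_classes, 8400] (4 box coordinates plus, in my case, 2 class scores per anchor). Decoding it by hand would look roughly like this (a sketch, assuming the usual cx,cy,w,h + scores layout and a tensor already dequantized to float):

```python
import numpy as np

def decode_yolov8(output, conf_thres=0.25):
    # output: float array of shape (1, 4 + num_classes, 8400)
    preds = output[0].T                       # -> (8400, 4 + nc)
    boxes, scores = preds[:, :4], preds[:, 4:]
    cls_ids = scores.argmax(axis=1)
    confs = scores.max(axis=1)
    keep = confs > conf_thres
    cx, cy, w, h = boxes[keep].T              # cx,cy,w,h -> x1,y1,x2,y2
    xyxy = np.stack([cx - w / 2, cy - h / 2,
                     cx + w / 2, cy + h / 2], axis=1)
    return xyxy, cls_ids[keep], confs[keep]   # NMS still needed afterwards
```

Presumably the demo's post-processing expects the split per-branch outputs shown for the default model above rather than this single tensor.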

Any idea where it goes wrong?

Jagan.

tylertroy commented 8 months ago

I was successful with the following steps.

  1. Export your .pt model to .onnx using the custom method described at https://github.com/airockchip/ultralytics_yolov8/blob/main/RKOPT_README.md

    • Note: when modifying ultralytics/cfg/default.yaml I specified model:, data:, and classes:. However, I don't think it matters what data: or classes: are set to.
  2. Convert the model to rknn as before with convert.py (a rough sketch of what this does is shown after this list).

  3. Before compiling with build-linux.sh, make two modifications:

    1. Set OBJ_CLASS_NUM in postprocess.h to match the number of classes in your model.
    2. Update coco_80_labels_list.txt with your names list.
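
For step 2, convert.py boils down to roughly the following rknn-toolkit2 calls (a sketch; the mean/std values and the quantization dataset path are placeholders to adapt):

```python
from rknn.api import RKNN

rknn = RKNN()
# Normalization must match the model's training preprocessing (0..255 -> 0..1 here).
rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]],
            target_platform='rk3588')
rknn.load_onnx(model='best.onnx')
# dataset.txt lists calibration images used for int8 quantization.
rknn.build(do_quantization=True, dataset='./dataset.txt')
rknn.export_rknn('best.rknn')
rknn.release()
```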
openedev commented 8 months ago

As I mentioned, I used the below command instead of your step 1:

yolo export model=./best.pt imgsz=640,640 format=onnx opset=12

Is anything incorrect in the onnx conversion?

In fact, I did try your step 1 by changing the model, but found the below issue while testing the onnx:

$ python test.py 
WARNING ⚠️ Unable to automatically guess model task, assuming 'task=detect'. Explicitly define task for your model, i.e. 'task=detect', 'segment', 'classify','pose' or 'obb'.
Loading plant1v8.onnx for ONNX Runtime inference...
WARNING ⚠️ Metadata not found for 'model=plant1v8.onnx'

Traceback (most recent call last):
  File "/home/build/shared/test.py", line 7, in <module>
    results = model(['plant.jpg'])  # return a list of Results objects
              ^^^^^^^^^^^^^^^^^^^^
  File "/home/build/conda/lib/python3.11/site-packages/ultralytics/engine/model.py", line 169, in __call__
    return self.predict(source, stream, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/build/conda/lib/python3.11/site-packages/ultralytics/engine/model.py", line 439, in predict
    return self.predictor.predict_cli(source=source) if is_cli else self.predictor(source=source, stream=stream)
                                                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/build/conda/lib/python3.11/site-packages/ultralytics/engine/predictor.py", line 206, in __call__
    return list(self.stream_inference(source, model, *args, **kwargs))  # merge list of Result into one
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/build/conda/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
               ^^^^^^^^^^^^^^
  File "/home/build/conda/lib/python3.11/site-packages/ultralytics/engine/predictor.py", line 292, in stream_inference
    self.results = self.postprocess(preds, im, im0s)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/build/conda/lib/python3.11/site-packages/ultralytics/models/yolo/detect/predict.py", line 25, in postprocess
    preds = ops.non_max_suppression(
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/build/conda/lib/python3.11/site-packages/ultralytics/utils/ops.py", line 239, in non_max_suppression
    x = x[xc[xi]]  # confidence
        ~^^^^^^^^
IndexError: The shape of the mask [80, 80] at index 0 does not match the shape of the indexed tensor [64, 80, 80] at index 0

Here is my test.py

from ultralytics import YOLO

# Load a model
model = YOLO('plant1v8.onnx')  # custom model exported to onnx

# Run batched inference on a list of images
results = model(['plant.jpg'])  # return a list of Results objects

# Process results list
for result in results:
    boxes = result.boxes  # Boxes object for bounding box outputs
    masks = result.masks  # Masks object for segmentation masks outputs
    keypoints = result.keypoints  # Keypoints object for pose outputs
    probs = result.probs  # Probs object for classification outputs
    #result.show()  # display to screen
    result.save(filename='prediction_banana_onnx.jpg')  # save to disk

PS: test.py gave the proper result when I used my method of onnx conversion:

yolo export model=./best.pt imgsz=640,640 format=onnx opset=12
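
(For reference, the same export via the python API — a sketch assuming the stock ultralytics package:)

```python
from ultralytics import YOLO

model = YOLO('best.pt')
# Equivalent to: yolo export model=./best.pt imgsz=640,640 format=onnx opset=12
model.export(format='onnx', imgsz=(640, 640), opset=12)
```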
tylertroy commented 8 months ago

@openedev

If you want to use the rknn model with the yolov8 C++ code in this repo, then you need to convert to onnx and then to rknn using the method I indicated. If you just need onnx, then just use the ultralytics repo.

openedev commented 8 months ago

@tylertroy Yes, for the above steps I used that Yolov8 fork for converting pt to onnx - https://github.com/airockchip/ultralytics_yolov8/tree/main

And I did test the onnx with a simple python script before proceeding to convert to rknn using rknn_model_zoo. Eventually I need the cpp code to test the rknn, but the onnx itself is failing the python test based on your step 1.

tylertroy commented 8 months ago

@openedev The reason relates to the unique post-processing required for each model because of the difference in their output nodes. If you compare the graphs of the onnx models converted by each method, you'll notice very different output nodes. You can visualize the graphs with the netron app; just open each .onnx in a separate viewing tab and you'll see what I'm referring to.
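
Besides netron, you can also dump the output specs programmatically. A quick check with onnxruntime (a sketch; the file names are placeholders for the two exports):

```python
import onnxruntime as ort

# Compare the output nodes of the two onnx exports.
for path in ('ultralytics_export.onnx', 'rkopt_export.onnx'):
    sess = ort.InferenceSession(path, providers=['CPUExecutionProvider'])
    print(path)
    for out in sess.get_outputs():
        print(' ', out.name, out.shape)
```

The ultralytics export shows a single [1, 4+nc, 8400] output, while the RKOPT export keeps the separate detection branches that the C++ post-processing consumes.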

charileben commented 8 months ago

@tylertroy
Hey,
I have tried the process you mentioned in this thread, but the model's performance is not up to the mark: it is detecting random pixels in the image and throwing random labels. Can you guess where we are going wrong?

tylertroy commented 8 months ago

> @tylertroy Hey, I have tried the process you mentioned in this thread, but the model's performance is not up to the mark: it is detecting random pixels in the image and throwing random labels. Can you guess where we are going wrong?

Is it detecting random pixels with your custom model or the default yolov8m model?

charileben commented 7 months ago

@tylertroy Yes, it is detecting random pixels for the custom-dataset model developed using the yolov8 model. Apologies for my delayed response.