Image for Rock 4 SE RK 3399

ajay01994 commented 1 year ago

Hi, could you please also help me by releasing the same model for the Rock Pi 4 SE (RK3399 architecture)? This would be really helpful, since there are supply chain issues with the Raspberry Pi.

https://forum.radxa.com/c/rockpi4

Qengineering commented 1 year ago

Dear @ajay01994,

If you look at Radxa's website, you are facing the same journey as I did. Install a Debian operating system. I used the native Radxa Debian, as it is as close to the hardware as possible. Install OpenCV as described in our guide, and do the same for the ncnn framework. Not much can go wrong.

ajay01994 commented 1 year ago

Hi,

Thanks for the reply. Do you have a model zoo for YOLOv5, v7 and v8 for the RK3399? Also, could you help with how to install TNN?

Qengineering commented 1 year ago

The RK3399 has no NPU on board. You can use the ncnn models made for the Raspberry Pi, or any other framework, like MNN, TNN or TensorFlow-Lite. Please see the updated guide on how to install TNN on an ARM board like your Rock Pi 4 SE.

ajay01994 commented 1 year ago

Hi, yes, it worked for the RK3399. Thanks a lot!

I also bought an RK3588 with NPU and was able to run yolov5. Do you have inference code for YOLOv7 and v8 for the NPU, and could you help with the steps to install and run inference on the NPU for these models?

apanand14 commented 6 months ago

First of all great work @Qengineering . I appreciate your efforts.

Hi Ajay! Can you tell me how I can run yolov5 as a demo on the RK3588 with NPU? Thank you in advance.

And yes @Qengineering , I'm also interested in yolov8-seg inference with NPU. Thank you in advance you too!!

Qengineering commented 6 months ago

I will port the NPU examples for the RK3566 to the RK3588 and upload them to GitHub next week.

apanand14 commented 6 months ago

I have already run the build-linux_RK3588.sh file in the rknpu2 examples, but I do not know what to do next to get the yolov5 demo running. It would be great if you could guide me. Thank you in advance.

apanand14 commented 6 months ago

@Qengineering Finally, I ran the yolov5 demo successfully on the image you provided. Now, I would like to test yolov8_seg on the RK3588 whenever you publish it. I'm really looking forward to testing it. Thank you once again!

Qengineering commented 6 months ago

Please find: https://github.com/Qengineering/YoloV8-seg-NPU

apanand14 commented 6 months ago

Hi @Qengineering, thank you so much for it. But I'm not able to download the Rock 5 image from your sync site. This new image contains the yolov8 examples, right? One more thing: if I would like to run inference with my custom-trained model on the Rock 5B with NPU, I should change the .cpp and the model itself, right? Is there anything else I need to take care of?

Qengineering commented 6 months ago

Sorry for the inconvenience. The nightly upload failed. I have just restarted it. I will let you know when the link is active.

Qengineering commented 6 months ago

To get your custom model up and running, you need the rknn-toolkit2 toolkit, installed on a Linux PC. With it, you can port the model to an INT8 or FP16 rknn model. Please use the example that best fits your architecture, for instance ~/rknn-toolkit2/rknn-toolkit2/examples/tensorflow. Here, use test.py to port your .pb model to .rknn. Sometimes you need to modify test.py a little to get everything working. Also, not every model can be ported to the NPU domain. And, of course, some models don't behave well with INT8. Once the model is running, postprocess.cpp is the place where you alter the code to get your outputs (mostly). It is best to install the toolkit in a virtual environment, as it will pull in many third-party packages, like TensorFlow, PyTorch, PaddlePaddle etc.

Qengineering commented 6 months ago

The link is up and running.

apanand14 commented 6 months ago

Thank you!! Sure! I will try it and will update you or ask you if I need any assistance. One more thing about the image: I'm already using the rock5b image from this repo. Is it sufficient to test the new examples, or should I use this Rock5_ubuntu22.img for yolov8 seg?

Qengineering commented 6 months ago

The latter. The Rock5B image doesn't have all the new examples. However, you could download the examples separately from the GitHub repo, as they were published this weekend. See the overview.

apanand14 commented 6 months ago

So, basically, I can download the examples repo and use it in the existing Rock5 image, with no need for the newer one. Right?

Qengineering commented 6 months ago

OpenCV and the latest RKNPU2 library must be installed on your system. OpenCV is available. Most likely, the RKNPU2 library needs to be upgraded. Follow the instructions in the repo. It only involves replacing one librknnrt.so and three *.h files.

apanand14 commented 6 months ago

Thanks again. I will check.

apanand14 commented 6 months ago

Hey @Qengineering, I'm trying to run the demo first and am getting an error. Could you please look into it?

`rock@rock-5b:~/examples/YoloV8-seg-NPU$ ./YoloV8_seg busstop.jpg rk3588/yolov8n-seg.rknn
model input num: 1, output num: 13
input tensors:
index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
output tensors:
index=0, name=375, n_dims=4, dims=[1, 64, 80, 80], n_elems=409600, size=409600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-55, scale=0.138304
index=1, name=onnx::ReduceSum_383, n_dims=4, dims=[1, 80, 80, 80], n_elems=512000, size=512000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.002873
index=2, name=388, n_dims=4, dims=[1, 1, 80, 80], n_elems=6400, size=6400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003169
index=3, name=354, n_dims=4, dims=[1, 32, 80, 80], n_elems=204800, size=204800, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=26, scale=0.023277
index=4, name=395, n_dims=4, dims=[1, 64, 40, 40], n_elems=102400, size=102400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-40, scale=0.095424
index=5, name=onnx::ReduceSum_403, n_dims=4, dims=[1, 80, 40, 40], n_elems=128000, size=128000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003418
index=6, name=407, n_dims=4, dims=[1, 1, 40, 40], n_elems=1600, size=1600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
index=7, name=361, n_dims=4, dims=[1, 32, 40, 40], n_elems=51200, size=51200, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=32, scale=0.020263
index=8, name=414, n_dims=4, dims=[1, 64, 20, 20], n_elems=25600, size=25600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-43, scale=0.075364
index=9, name=onnx::ReduceSum_422, n_dims=4, dims=[1, 80, 20, 20], n_elems=32000, size=32000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003874
index=10, name=426, n_dims=4, dims=[1, 1, 20, 20], n_elems=400, size=400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
index=11, name=368, n_dims=4, dims=[1, 32, 20, 20], n_elems=12800, size=12800, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=26, scale=0.022538
index=12, name=347, n_dims=4, dims=[1, 32, 160, 160], n_elems=819200, size=819200, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-119, scale=0.029378
model is NHWC input fmt
model input height=640, width=640, channel=3
Unable to init server: Could not connect: Connection refused
terminate called after throwing an instance of 'cv::Exception'
what(): OpenCV(4.6.0) /home/rock/opencv/modules/highgui/src/window_gtk.cpp:635: error: (-2:Unspecified error) Can't initialize GTK backend in function 'cvInitSystem'

Aborted`

apanand14 commented 6 months ago

Now I'm no longer getting this error, but it gets stuck after this ... ... model is NHWC input fmt model input height=640, width=640, channel=3 and does not proceed further.

Qengineering commented 6 months ago

Sorry to hear. It all seems related to OpenCV. Your first issue was an exception thrown by the GUI module. The second complains about the size of the input image not fitting the expected format. The input image is also resized by OpenCV before it is handed over to the NPU. I've tested all supplied examples twice: once for the RK3566 on a Radxa Zero 3W, and once for the RK3588 on a Rock 5 board. All worked well. You have two options. 1) Use the newly provided Rock 5 Ubuntu 22 image. 2) Debug OpenCV. It is probably not in a healthy state. A completely new installation may resolve the issue. I would opt for 1. If you are in desperate need of the original Armbian Rock 5 OS, perhaps I can install the examples and generate a new image this weekend. Not promising anything.

apanand14 commented 6 months ago

Sorry, I explained it badly. In the second issue, it does not throw any error about the image size. It just gets stuck after the output below and does not proceed at all, without throwing any error. I will try the new image, but I really want to run it on the Armbian Rock5 OS. It would be great if it could be solved here, as it's not throwing any error. It just gets stuck on something.

model input num: 1, output num: 13
input tensors:
index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
output tensors:
index=0, name=375, n_dims=4, dims=[1, 64, 80, 80], n_elems=409600, size=409600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-61, scale=0.115401
index=1, name=onnx::ReduceSum_383, n_dims=4, dims=[1, 80, 80, 80], n_elems=512000, size=512000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003514
index=2, name=388, n_dims=4, dims=[1, 1, 80, 80], n_elems=6400, size=6400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003540
index=3, name=354, n_dims=4, dims=[1, 32, 80, 80], n_elems=204800, size=204800, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=27, scale=0.019863
index=4, name=395, n_dims=4, dims=[1, 64, 40, 40], n_elems=102400, size=102400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-15, scale=0.099555
index=5, name=onnx::ReduceSum_403, n_dims=4, dims=[1, 80, 40, 40], n_elems=128000, size=128000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003555
index=6, name=407, n_dims=4, dims=[1, 1, 40, 40], n_elems=1600, size=1600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003680
index=7, name=361, n_dims=4, dims=[1, 32, 40, 40], n_elems=51200, size=51200, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=30, scale=0.022367
index=8, name=414, n_dims=4, dims=[1, 64, 20, 20], n_elems=25600, size=25600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-55, scale=0.074253
index=9, name=onnx::ReduceSum_422, n_dims=4, dims=[1, 80, 20, 20], n_elems=32000, size=32000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003813
index=10, name=426, n_dims=4, dims=[1, 1, 20, 20], n_elems=400, size=400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
index=11, name=368, n_dims=4, dims=[1, 32, 20, 20], n_elems=12800, size=12800, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=43, scale=0.019919
index=12, name=347, n_dims=4, dims=[1, 32, 160, 160], n_elems=819200, size=819200, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-119, scale=0.032336
model is NHWC input fmt
model input height=640, width=640, channel=3

apanand14 commented 6 months ago

Hi @Qengineering, I tried the new Rock5 Ubuntu image and I'm still getting the same error, shown below.

root@5:/home/rock/examples/YoloV8-seg-NPU# ./YoloV8_seg busstop.jpg rk3588/yolov8n-seg.rknn
model input num: 1, output num: 13
input tensors:
index=0, name=images, n_dims=4, dims=[1, 640, 640, 3], n_elems=1228800, size=1228800, fmt=NHWC, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
output tensors:
index=0, name=375, n_dims=4, dims=[1, 64, 80, 80], n_elems=409600, size=409600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-55, scale=0.138304
index=1, name=onnx::ReduceSum_383, n_dims=4, dims=[1, 80, 80, 80], n_elems=512000, size=512000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.002873
index=2, name=388, n_dims=4, dims=[1, 1, 80, 80], n_elems=6400, size=6400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003169
index=3, name=354, n_dims=4, dims=[1, 32, 80, 80], n_elems=204800, size=204800, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=26, scale=0.023277
index=4, name=395, n_dims=4, dims=[1, 64, 40, 40], n_elems=102400, size=102400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-40, scale=0.095424
index=5, name=onnx::ReduceSum_403, n_dims=4, dims=[1, 80, 40, 40], n_elems=128000, size=128000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003418
index=6, name=407, n_dims=4, dims=[1, 1, 40, 40], n_elems=1600, size=1600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
index=7, name=361, n_dims=4, dims=[1, 32, 40, 40], n_elems=51200, size=51200, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=32, scale=0.020263
index=8, name=414, n_dims=4, dims=[1, 64, 20, 20], n_elems=25600, size=25600, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-43, scale=0.075364
index=9, name=onnx::ReduceSum_422, n_dims=4, dims=[1, 80, 20, 20], n_elems=32000, size=32000, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003874
index=10, name=426, n_dims=4, dims=[1, 1, 20, 20], n_elems=400, size=400, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-128, scale=0.003922
index=11, name=368, n_dims=4, dims=[1, 32, 20, 20], n_elems=12800, size=12800, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=26, scale=0.022538
index=12, name=347, n_dims=4, dims=[1, 32, 160, 160], n_elems=819200, size=819200, fmt=NCHW, type=INT8, qnt_type=AFFINE, zp=-119, scale=0.029378
model is NHWC input fmt
model input height=640, width=640, channel=3
terminate called after throwing an instance of 'cv::Exception'
what(): OpenCV(4.9.0-dev) /home/rock/opencv/modules/highgui/src/window_gtk.cpp:638: error: (-2:Unspecified error) Can't initialize GTK backend in function 'cvInitSystem'

Aborted (core dumped)

apanand14 commented 6 months ago

@Qengineering I was able to run it. Instead of using imshow, I just saved the image to see whether it works or not. At least the demo is working. The next thing is to make it run with a custom model. I will keep you updated. Thank you once again.

Qengineering commented 6 months ago

@apanand14,

1) Which monitor are you using? Or are you working over SSH? When you see these two lines (model is NHWC input fmt, model input height=640, width=640, channel=3), your NPU is properly configured and an OpenCV cv::imshow() window should show your output. Given the first error (Can't initialize GTK backend), OpenCV is the issue. It is not able to show anything on the desktop.

2) Right now, I'm making a Rock-5 Armbian SD image with all the NPU examples. I've tested the YoloV8-seg-NPU, and it is working fine. Creating an image, zipping and uploading will take several hours. I will let you know when we have a live link.
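For anyone running the examples over SSH, the GTK failure can be avoided by deciding up front whether a window can be opened at all. The sketch below is a minimal illustration, not code from the repo: the helper names (`env_is_set`, `has_display`, `choose_output_mode`) are hypothetical, and checking the `DISPLAY` and `WAYLAND_DISPLAY` environment variables is only a heuristic (a variable can be set while the server is still unreachable).

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: true when an environment variable exists and is non-empty.
inline bool env_is_set(const char* name)
{
    assert(name != nullptr);
    const char* value = std::getenv(name);
    return value != nullptr && *value != '\0';
}

// Heuristic only: a plain SSH session normally has neither variable set.
inline bool has_display()
{
    return env_is_set("DISPLAY") || env_is_set("WAYLAND_DISPLAY");
}

// Where the annotated frame should go.
enum class OutputMode { Window, File };

inline OutputMode choose_output_mode()
{
    return has_display() ? OutputMode::Window : OutputMode::File;
}

// Intended use inside a demo (OpenCV calls shown as comments so that this
// sketch compiles without OpenCV; "result.jpg" is an assumed file name):
//
//   if (choose_output_mode() == OutputMode::Window) {
//       cv::imshow("result", frame);
//       cv::waitKey(0);
//   } else {
//       cv::imwrite("result.jpg", frame);
//   }
```

With this kind of guard the same binary could be used both on the desktop and in a headless session, where the result is written to disk instead of shown.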

apanand14 commented 6 months ago

Hi @Qengineering, I'm working over SSH only. And yes, the GTK backend is the OpenCV issue. But it's fine as long as I'm able to run the demo, save the image with the prediction and verify it. I'm trying to run my custom model. I will ask you if I need some assistance. Thank you for the help.

Qengineering commented 6 months ago

FYI. I've just published a new SD image with the RKNPU model zoo examples.

apanand14 commented 6 months ago

Thank you! I will check it. One question: I have a customized yolov8n-seg model. By customized I mean just different classes, 2 instead of 80. I think I do not have to change a lot on the cpp side, right? I tried to run it after converting my model, but it didn't draw any mask; it just showed the FPS. Do you have any idea what could be wrong? The model conversion? Or does something need to be changed in the mask drawing? Thank you in advance.

Qengineering commented 6 months ago

Now you're on your own. I have no idea. Both can be the case. My problem is time. I'm very busy with other projects and, unfortunately, I can't dive into the code or the model to address the issue, much as I would like to.

apanand14 commented 4 months ago

@Qengineering Hey! I was able to run everything properly regarding the issue mentioned above. I have another question: can I use this Rock 5 image for my Rock 5C and run models on the RK3588S for inference, or do I have to make changes? Thank you in advance!

Qengineering commented 4 months ago

You cannot use this image for the Rock5C. Although it has the same RK3588(S) CPU, there are low-level I/O differences. Please install the proper image. Next, check our repos for the RK3588 NPU and install the ones you need. They will work out of the box. Or copy your models from your Rock5.

apanand14 commented 3 months ago

Thank you @Qengineering! I will try it this way. But I have a question: when I convert my model from onnx to rknn, I have to specify the target platform. Should I provide RK3588 or RK3588S?

Qengineering commented 3 months ago

RK3588 and RK3588S are identical from the RKNPU2 toolkit's point of view (both have the same 3 NPU cores). In your Python conversion script, define RK3588 as your target.

apanand14 commented 3 months ago

Thank you for your answer. So, basically, I can use the same converted model I used for the Rock5B on the Rock5C image you mentioned above. Can I copy the rknn-toolkit2 directory and the software directory (which contains the yolov8_seg model) from the Rock 5 image you created to the new Rock5C image? If I then run the script, it will work, right?

Qengineering commented 3 months ago

Indeed. You're right. There is not that much difference between the Rock5B and the Radxa Rock5C. Please mind the heat. Without proper cooling, it hits 85°C in no time.
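To keep an eye on the heat from within a program, the Linux thermal sysfs interface can be read directly. This is a minimal sketch under stated assumptions: the path `/sys/class/thermal/thermal_zone0/temp` is the generic Linux location and reports millidegrees Celsius, but which zone maps to the SoC on a given board is an assumption you would need to verify. The helper names are hypothetical and the 85°C default is simply the figure mentioned above.

```cpp
#include <cassert>
#include <fstream>
#include <stdexcept>
#include <string>

// Parse sysfs text such as "85000\n" (millidegrees Celsius) into degrees.
// Returns false when the text does not start with a number.
inline bool parse_millidegrees(const std::string& text, double& celsius)
{
    try {
        celsius = std::stol(text) / 1000.0;
        return true;
    } catch (const std::exception&) {
        return false;
    }
}

// Read one thermal zone. The zone index is an assumption; check
// /sys/class/thermal/thermal_zone*/type on your own board.
inline bool read_zone_celsius(const std::string& path, double& celsius)
{
    assert(!path.empty());
    std::ifstream file(path);
    std::string line;
    if (!file || !std::getline(file, line)) {
        return false;
    }
    return parse_millidegrees(line, celsius);
}

// The default limit is the 85 degrees mentioned in the comment above.
inline bool is_too_hot(double celsius, double limit = 85.0)
{
    return celsius >= limit;
}
```

A caller could poll `read_zone_celsius("/sys/class/thermal/thermal_zone0/temp", t)` between inference batches and pause when `is_too_hot(t)` returns true.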

apanand14 commented 3 months ago

Thank you so much for your input, @Qengineering! It worked well with the RockPi 5C. But I have observed that when I try to run an fp16 rknn converted model, I get a segmentation fault. What could be wrong when I use the fp16 model? I use the same versions and dependencies that I used for the int8 conversion. Thank you in advance for your answer.

Qengineering commented 3 months ago

fp16 requires different post-processing. Your output tensor is one flat array of numbers. These can be int8 or fp16. int8 you know. fp16 is a floating-point number stored in 16 bits. A normal float has 32 bits. You need to 'convert' the 16 bits to 32 to work with them. This is done with this snippet of code:

std::vector<Box> get_detection_boxes_fp32(void *outputs, int grid_w, int grid_h, int num_classes, int num_anchors, const std::vector<float>& anchors, float threshold, int8_t zp, float sc)
{
    std::vector<Box> boxes;
    rknn_output *_outputs = (rknn_output *)outputs;     //_outputs are now float32
    float *out = (float *)_outputs[0].buf;
    int grid_len = grid_h * grid_w;
    int PropBoxSize = (5 + num_classes);

    for (int a = 0; a < 5; a++){                        //anchors
        for (int i = 0; i < grid_h; i++){               //height
            for (int j = 0; j < grid_w; j++){           //width
                float box_confidence = out[(PropBoxSize * a + 4) * grid_len + i * grid_w + j];

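The void *outputs points to the rknn_output array. With want_float set (see the call below), the runtime should hand the fp16 results over as fp32, so a few lines later the buffer is simply cast to float *. Now you can work with the values in normal C++ code. No need for an additional fp16 library. (BTW, fp16 is not bfloat16; the bit layouts differ.)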
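For readers curious what the 16-to-32-bit conversion involves, here is a minimal, library-free sketch. It is only an illustration: `half_bits_to_float` and `half_buffer_to_float` are hypothetical helpers, not part of the RKNN API, and they should not be needed when want_float is used as in the calling code. The sketch assumes IEEE 754 half precision (1 sign bit, 5 exponent bits, 10 mantissa bits); bfloat16 uses 1, 8 and 7 bits instead, which is why the two are not interchangeable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Convert the raw 16 bits of an IEEE 754 half-precision value to float.
inline float half_bits_to_float(uint16_t h)
{
    const uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
    const uint32_t exponent = (h >> 10) & 0x1Fu;
    uint32_t mantissa = h & 0x3FFu;
    uint32_t bits = 0;

    if (exponent == 0) {
        if (mantissa == 0) {
            bits = sign;                                    // signed zero
        } else {
            // Subnormal half: shift until the implicit leading 1 appears.
            int shift = 0;
            while ((mantissa & 0x400u) == 0) { mantissa <<= 1; ++shift; }
            mantissa &= 0x3FFu;
            bits = sign | (static_cast<uint32_t>(113 - shift) << 23) | (mantissa << 13);
        }
    } else if (exponent == 0x1Fu) {
        bits = sign | 0x7F800000u | (mantissa << 13);       // infinity or NaN
    } else {
        // Normal number: re-bias the exponent from 15 to 127.
        bits = sign | ((exponent + 112u) << 23) | (mantissa << 13);
    }

    float result;
    std::memcpy(&result, &bits, sizeof result);             // bit-exact copy
    return result;
}

// Convert a whole flat tensor of raw fp16 values.
inline std::vector<float> half_buffer_to_float(const uint16_t* src, std::size_t count)
{
    assert(src != nullptr || count == 0);
    std::vector<float> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = half_bits_to_float(src[i]);
    }
    return out;
}
```

Reading an fp16 buffer through a `float *` without such a conversion would walk past the end of the buffer, which is one plausible (but unconfirmed) source of a segmentation fault.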

To call the routine:

    rknn_output outputs[rknn_app_ctx.io_num.n_output];
.............
        // allocate outputs
        memset(outputs, 0, sizeof(outputs));
        for(uint32_t i = 0; i < rknn_app_ctx.io_num.n_output; i++){
            outputs[i].index = i;
            outputs[i].want_float = (!rknn_app_ctx.is_quant);
        }
        // run
        rknn_run(rknn_app_ctx.rknn_ctx, nullptr);
        rknn_outputs_get(rknn_app_ctx.rknn_ctx, rknn_app_ctx.io_num.n_output, outputs, NULL);

        // post process
       std::vector<Box> boxes = get_detection_boxes_fp32(outputs, grid_w, grid_h, num_classes, num_anchors, anchors, threshold, rknn_app_ctx.output_attrs[0].zp, rknn_app_ctx.output_attrs[0].scale);

These are snippets of code from a larger program. I'm not sure they are flawless. Treat them more as a direction.
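The zp and scale arguments passed to the routine matter for the int8 case. The tensor listings earlier in this thread report qnt_type=AFFINE with a zero point (zp) and a scale per tensor, and the usual affine mapping is real = (q - zp) * scale. The sketch below illustrates that formula only; the function names are hypothetical and the example values (zp=-128, scale=0.003922) are taken from the input tensor in the logs above.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Affine dequantisation: map a stored int8 value back to a real number.
inline float dequantize_affine(int8_t q, int32_t zp, float scale)
{
    return (static_cast<float>(q) - static_cast<float>(zp)) * scale;
}

// The inverse mapping, saturating to the int8 range [-128, 127].
inline int8_t quantize_affine(float value, int32_t zp, float scale)
{
    assert(scale > 0.0f);
    float q = std::round(value / scale) + static_cast<float>(zp);
    if (q < -128.0f) q = -128.0f;
    if (q > 127.0f) q = 127.0f;
    return static_cast<int8_t>(q);
}
```

With zp=-128 and scale=0.003922, the stored range -128..127 covers roughly 0.0 to 1.0, which matches a normalised image input. An fp16 model has no such mapping to undo, which is why its post-processing differs.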

Qengineering commented 3 months ago

You can find a fp16 example of YoloV5 here.

Qengineering / Rock-5-image

Image for Rock 4 SE RK 3399 #2