cyrusbehr / YOLOv8-TensorRT-CPP

YOLOv8 TensorRT C++ Implementation
MIT License

Can I use non-square ratio (seg_w/seg_h) for segmentation? #53

Open andrew-93 opened 7 months ago

andrew-93 commented 7 months ago

I have a segmentation model with a non-square input size (832x512; both values are divisible by 32), so the output mask size is 208x128. I therefore changed the following values in yolov8.h:

    // Segmentation config options
    int segChannels = 32;
    int segH = 128;
    int segW = 208;
    float segmentationThreshold = 0.5f;

With these values, the following error appears:

   terminate called after throwing an instance of 'cv::Exception'
  what():  OpenCV(4.8.0) /tmp/opencv-4.8.0/modules/core/src/matrix.cpp:808: error: (-215:Assertion failed) 0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows in function 'Mat'

With a square input (for example, 512x512, giving segW x segH = 128x128), there is no error and segmentation works correctly. The error occurs at the line dest = dest(roi); in yolov8.cpp, where dest.size() is [128 x 208] and roi.size() is [208 x 80]. Since cv::Size prints as [width x height], the ROI is 208 wide while dest has only 128 columns, so the bounds assertion fails. Is there a way to run your segmentation component with such non-square width/height tensor sizes?
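
The bounds check behind the assertion can be illustrated without OpenCV. The following is a minimal sketch that restates the condition from the error message using a plain struct; Roi and roiFits are hypothetical names introduced for illustration, not part of OpenCV or this repository.

```cpp
// Hypothetical stand-in for cv::Rect: top-left corner plus size, in pixels.
struct Roi {
    int x;
    int y;
    int width;
    int height;
};

// Mirrors the condition quoted in the OpenCV error message: the ROI must lie
// entirely inside a matrix that has `rows` rows and `cols` columns.
bool roiFits(const Roi &roi, int rows, int cols) {
    return 0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= cols &&
           0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= rows;
}
```

Assuming the ROI starts at the origin (the sizes above do not say), roiFits(Roi{0, 0, 208, 80}, 208, 128) is false for the reported dest of 208 rows by 128 columns, while roiFits(Roi{0, 0, 208, 80}, 128, 208) is true. That would be consistent with the mask's width and height being swapped somewhere before the crop, though the sketch alone cannot confirm where.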

cyrusbehr commented 7 months ago

Hi @andrew-93, are you able to upload your model? I can adapt the code to make it work.

andrew-93 commented 7 months ago

@cyrusbehr I'm sending two ONNX models, one with a square (512x512) input and one with a non-square (832x512) input. Link: https://www.transfernow.net/dl/20240411Ouasa3dn Both files were generated from the Ultralytics source file yolov8x-seg.pt: https://github.com/ultralytics/assets/releases/download/v8.1.0/yolov8x-seg.pt (converted with model.export in the Ultralytics repo).

cyrusbehr commented 7 months ago

Thanks. I'll try to have a look later today.

cyrusbehr commented 7 months ago

I was able to reproduce your results. It seems my logic for computing the segmentation ROI is not correct. I'll need to do some further digging.
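
For reference, one possible way to compute a mask ROI with independent horizontal and vertical scale factors is sketched below. It assumes the image is resized with its aspect ratio preserved and padded on the right and bottom to fill the network input, which this thread does not confirm. MaskRoi and computeMaskRoi are hypothetical names, and this is not the repository's actual code or fix.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for cv::Rect.
struct MaskRoi {
    int x;
    int y;
    int width;
    int height;
};

// Returns the part of a segW x segH mask that corresponds to real image
// content. Assumption: the image is scaled uniformly to fit inside an
// inputW x inputH network input and padded on the right and bottom only.
MaskRoi computeMaskRoi(int imgW, int imgH, int inputW, int inputH, int segW, int segH) {
    assert(imgW > 0 && imgH > 0 && inputW > 0 && inputH > 0 && segW > 0 && segH > 0);

    // Uniform scale that fits the whole image inside the network input.
    const double ratio = std::min(static_cast<double>(inputW) / imgW,
                                  static_cast<double>(inputH) / imgH);

    // Size of the resized image inside the padded input.
    const double scaledW = imgW * ratio;
    const double scaledH = imgH * ratio;

    // Map to mask coordinates with separate factors for each axis, so that
    // a non-square input (inputW != inputH) is handled correctly.
    int roiW = static_cast<int>(std::lround(scaledW * segW / inputW));
    int roiH = static_cast<int>(std::lround(scaledH * segH / inputH));

    // Keep the ROI non-empty and inside the mask.
    roiW = std::max(1, std::min(roiW, segW));
    roiH = std::max(1, std::min(roiH, segH));

    return MaskRoi{0, 0, roiW, roiH};
}
```

As an illustration, a hypothetical 1664x640 image with the 832x512 input and 208x128 mask from this thread gives a 208x80 ROI, which fits inside a mask stored as 128 rows by 208 columns.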