jkjung-avt / tensorrt_demos

TensorRT MODNet, YOLOv4, YOLOv3, SSD, MTCNN, and GoogLeNet
https://jkjung-avt.github.io/
MIT License

Hard-coded computation in YOLO output tensor shapes #538

Closed ChristopheKar closed 2 years ago

ChristopheKar commented 2 years ago

In the YOLO to ONNX conversion, I noticed something odd at line 977: c = (category_num + 5) * 3. I'm guessing this mirrors the formula traditionally used to compute the filters value of the [convolutional] layer before each [yolo] layer, usually written as filters = (num_classes + 5) * 3. However, that formula only holds when the number of masks is 3, which is not always the case. The general formula is filters = (num_classes + 5) * num_masks, as stated in AlexeyAB's README (last bullet in 1.).

For example, here's an extract from the YOLOv4-P5 configuration:

[convolutional]
size=1
stride=1
pad=1
filters=340
activation=logistic

[yolo]
mask = 0,1,2,3
anchors = 13,17,  31,25,  24,51,  61,45,  48,102,  119,96, 97,189, 217,184,  171,384, 324,451, 616,618, 800,800
classes=80
num=12
jitter=.1

In this case, filters = (80 + 5) * 3 = 255 would be wrong: the configuration has filters=340, which matches (80 + 5) * 4 = 340, where 4 is the number of indices listed in mask. That makes the multiplier fairly easy to extract from the configuration. I do not know how many models other than YOLOv4-P5 and YOLOv4-P6 use 4 masks.
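To make the arithmetic concrete, here is a minimal sketch of the general formula in Python. It is not the code from the repository; the helper names parse_mask and yolo_output_channels are hypothetical and chosen only for illustration.

```python
def parse_mask(value):
    """Turn a cfg mask string such as "0,1,2,3" into a list of anchor indices."""
    return [int(tok) for tok in value.split(",") if tok.strip()]


def yolo_output_channels(num_classes, mask):
    """Channel count of the [convolutional] layer feeding a [yolo] layer.

    Each mask entry selects one anchor, and each anchor predicts
    4 box coordinates + 1 objectness score + num_classes class scores.
    """
    return (num_classes + 5) * len(mask)


# YOLOv4-P5 extract above: 80 classes, 4 mask entries
print(yolo_output_channels(80, parse_mask("0,1,2,3")))  # 340

# Common 3-mask case, where the hard-coded "* 3" happens to be correct
print(yolo_output_channels(80, parse_mask("0,1,2")))  # 255
```

With 4 mask entries the result matches filters=340 from the extract, while the hard-coded multiplier of 3 would give 255.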

What do you think about this? Would that make the output tensor shapes wrong for these models?

jkjung-avt commented 2 years ago

You are right. The hard-coded "c" (the number of filter channels in the convolutional layer preceding the yolo layer) won't work for models whose number of masks (anchors) is anything other than 3.

I think this part of the code could be fixed easily. But I'm not sure whether I have also hard-coded the same thing in other parts of the code...

Anyway, I think the code would error out (some assertion would fail) when you try to build such a TensorRT engine.
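A mismatch like this could also be caught before building an engine, with a small pre-check on the .cfg file. The following is only a sketch under stated assumptions, not code from this repository: parse_cfg and check_yolo_filters are hypothetical helpers, the parser is deliberately simplified, and it assumes each [yolo] section is directly preceded by its [convolutional] section and has mask and classes keys.

```python
def parse_cfg(text):
    """Simplified darknet-style cfg parser: returns a list of (section, options)."""
    sections = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def check_yolo_filters(sections):
    """Return (section_index, actual_filters, expected_filters) for each mismatch."""
    mismatches = []
    for idx, (name, opts) in enumerate(sections):
        if name != "yolo" or idx == 0:
            continue
        prev_name, prev_opts = sections[idx - 1]
        if prev_name != "convolutional":
            continue  # assumption violated; skip rather than guess
        num_masks = len([m for m in opts["mask"].split(",") if m.strip()])
        expected = (int(opts["classes"]) + 5) * num_masks
        actual = int(prev_opts["filters"])
        if actual != expected:
            mismatches.append((idx, actual, expected))
    return mismatches


# Shortened version of the YOLOv4-P5 extract quoted earlier in this thread
CFG = """
[convolutional]
size=1
stride=1
pad=1
filters=340
activation=logistic

[yolo]
mask = 0,1,2,3
classes=80
num=12
"""

print(check_yolo_filters(parse_cfg(CFG)))  # []
```

An empty list means every checked [yolo] layer agrees with (classes + 5) * number of masks. A configuration with 4 mask entries but filters=255 would be reported as a mismatch.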

ChristopheKar commented 2 years ago

Thank you for your answer. I submitted a pull request (#540) that fixes this issue and tested it with YOLOv4-P5; it seems to be working well. I'm not sure how to link the pull request to this issue, though.

jkjung-avt commented 2 years ago

I have accepted the pull request. Thanks.