RidgeRun / gst-inference

A GStreamer Deep Learning Inference Framework
GNU Lesser General Public License v2.1

Different input size #214

Open cnmckee opened 5 years ago

cnmckee commented 5 years ago

Hello. I have a custom TinyYOLOv3 graph that takes an input size of 608x608, and I'm running into problems: I changed the width and height in the CAPS in gsttinyyolov3.c, but the model isn't working as expected. I know my model architecture is correct because an equivalently constructed model works at 416x416 with your plugin, although the boxes are not the correct size (I'm guessing an anchor issue). Is there something else I need to change in the code? I have already experimented with the anchors, with no change.
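As a side note on why the input size matters: TinyYOLOv3 derives its detection grids from the input resolution by strides of 32 and 16, so changing the CAPS from 416 to 608 also changes the expected number of output boxes. A minimal sketch of that arithmetic (my own illustration of the standard architecture, not code from this plugin):

```python
# TinyYOLOv3 predicts on two grids obtained by downsampling the input
# by strides 32 and 16, with 3 anchors per grid cell.
def tiny_yolov3_box_count(input_size, anchors_per_cell=3, strides=(32, 16)):
    """Total predicted boxes for a square input of side `input_size`."""
    return sum((input_size // s) ** 2 * anchors_per_cell for s in strides)

print(tiny_yolov3_box_count(416))  # 13x13 + 26x26 grids -> 2535 boxes
print(tiny_yolov3_box_count(608))  # 19x19 + 38x38 grids -> 5415 boxes
```

If the post-processing code hard-codes the 13x13 and 26x26 grid sizes for a 416x416 input, a 608x608 model would decode incorrectly even with the CAPS changed.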

cnmckee commented 5 years ago

Also, TinyYolov3 has 6 anchor points. Your anchors in TinyYolov3 resemble those of Yolov2 both in terms of being normalized to the grid and in that there are only 5. I don't see how you can claim that you have implemented TinyYolov3.

michaelgruner commented 5 years ago

Thanks for your feedback. @GFallasRR can you look into this? Please correlate the post process with the paper implementation.

cnmckee commented 5 years ago

Great, thanks! @GFallasRR TinyYolov3, as implemented in the paper, predicts boxes at two different scales: a 13x13 grid and a 26x26 grid, with three anchors for each cell, so in total 6 anchor points must be set (full Yolov3 predicts at three different scales with three anchors per cell, hence the 9 total anchor points to be set). The total number of predicted boxes should thus be ((13x13) + (26x26)) x 3 = 2535. After digging through your repo, I can see that your TinyYolov3 implementation does output 2535 boxes, but it looks like you are still using 5 anchor points. Perhaps you are detecting on three different layers of 13x13 grids with those same 5 anchor points per cell? This would also result in 2535 predicted boxes. Totally possible that I'm missing something or misinterpreting, but an essential upgrade in Yolov3/TinyYolov3 is detection at multiple scales, and I don't see this in your code.
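The arithmetic above can be checked directly; both interpretations happen to produce the same box count, which is why the output size alone cannot distinguish them (plain arithmetic, no framework code):

```python
# Paper: two grids (13x13 coarse, 26x26 fine), 3 anchors per cell.
two_scale = (13 * 13 + 26 * 26) * 3
# Suspected alternative: three 13x13 layers with 5 anchors per cell.
single_scale = 13 * 13 * 5 * 3

print(two_scale, single_scale)  # both 2535
```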

GFallasRR commented 4 years ago

Hi @cnmckee, sorry for the slow reply.

I have been checking the TinyYoloV3 documentation and I found some information related to the implementation on GstInference.

```python
_ANCHORS = [(10, 14), (23, 27), (37, 58), (81, 82), (135, 169), (344, 319)]

detect_1 = _detection_layer(
    inputs, num_classes, _ANCHORS[3:6], img_size, data_format)
detect_1 = tf.identity(detect_1, name='detect_1')

detect_2 = _detection_layer(
    inputs, num_classes, _ANCHORS[0:3], img_size, data_format)
detect_2 = tf.identity(detect_2, name='detect_2')

detections = tf.concat([detect_1, detect_2], axis=1)
detections = tf.identity(detections, name='detections')
```
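To illustrate the shapes involved (my own sketch using NumPy rather than the actual TensorFlow graph, assuming the usual `(x, y, w, h, objectness, classes)` box layout): `detect_1` covers the coarse 13x13 grid with the three large anchors, `detect_2` covers the fine 26x26 grid with the three small anchors, and the concat joins them along the box axis.

```python
import numpy as np

num_classes = 80
box_attrs = 5 + num_classes  # x, y, w, h, objectness, class scores

# Coarse scale: 13*13 cells * 3 anchors = 507 boxes.
detect_1 = np.zeros((1, 13 * 13 * 3, box_attrs))
# Fine scale: 26*26 cells * 3 anchors = 2028 boxes.
detect_2 = np.zeros((1, 26 * 26 * 3, box_attrs))

# Concatenating along axis 1 gives the combined 2535-box output.
detections = np.concatenate([detect_1, detect_2], axis=1)
print(detections.shape)  # (1, 2535, 85)
```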

Those layers are defined in the model and can be seen with TensorBoard. The detections node has the outputs from detection_1 and detection_2.

TinyYoloV3 graph as shown in TensorBoard: (screenshot attached to the original comment)

I may be missing other details about the TinyYoloV3 implementation, but I will check other reference models to improve the plugin.

Any comment or extra information about the issue that you are facing will be appreciated. Thanks.