RidgeRun / gst-inference

A GStreamer Deep Learning Inference Framework
GNU Lesser General Public License v2.1

Different input size #214

Open cnmckee opened 5 years ago

cnmckee commented 5 years ago

Hello. I have a custom TinyYOLOv3 graph that takes an input size of 608x608, and I'm running into problems: I changed the width and height in the CAPS in gsttinyyolov3.c, but the model isn't working as expected. I know my model architecture is correct because an equivalently constructed model works at 416x416 with your plugin, although the boxes are not the correct size (I'm guessing an anchor issue). Is there something else I need to change in the code? I have already experimented with the anchors, with no change.
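As a side note on why the input size matters: TinyYOLOv3 derives its detection grids from the input resolution by strides of 32 and 16, so changing the CAPS from 416 to 608 also changes the expected number of output boxes. A minimal sketch of that arithmetic (my own illustration of the standard architecture, not code from this plugin):

```python
# TinyYOLOv3 predicts on two grids obtained by downsampling the input
# by strides 32 and 16, with 3 anchors per grid cell.
def tiny_yolov3_box_count(input_size, anchors_per_cell=3, strides=(32, 16)):
    """Total predicted boxes for a square input of side `input_size`."""
    return sum((input_size // s) ** 2 * anchors_per_cell for s in strides)

print(tiny_yolov3_box_count(416))  # 13x13 + 26x26 grids -> 2535 boxes
print(tiny_yolov3_box_count(608))  # 19x19 + 38x38 grids -> 5415 boxes
```

If the post-processing code hard-codes the 13x13 and 26x26 grid sizes for a 416x416 input, a 608x608 model would decode incorrectly even with the CAPS changed.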

cnmckee commented 5 years ago

Also, TinyYolov3 has 6 anchor points. Your anchors in TinyYolov3 resemble those of Yolov2 both in terms of being normalized to the grid and in that there are only 5. I don't see how you can claim that you have implemented TinyYolov3.

michaelgruner commented 5 years ago

Thanks for your feedback. @GFallasRR can you look into this? Please correlate the post process with the paper implementation.

cnmckee commented 5 years ago

Great, thanks! @GFallasRR TinyYolov3, as implemented in the paper, predicts boxes at two different scales: a 13x13 grid and a 26x26 grid, with three anchors for each cell, so in total 6 anchor points must be set (full Yolov3 predicts at three different scales with three anchors per cell, hence the 9 total anchor points to be set). The total number of predicted boxes should thus be ((13x13) + (26x26)) x 3 = 2535. After digging through your repo, I can see that your TinyYolov3 implementation does output 2535 boxes, but it looks like you are still using 5 anchor points. Perhaps you are detecting on three different layers of 13x13 grids with those same 5 anchor points per cell? This would also result in 2535 predicted boxes. Totally possible that I'm missing something or misinterpreting, but an essential upgrade in Yolov3/TinyYolov3 is detection at multiple scales, and I don't see this in your code.
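The arithmetic above can be checked directly; both interpretations happen to produce the same box count, which is why the output size alone cannot distinguish them (plain arithmetic, no framework code):

```python
# Paper: two grids (13x13 coarse, 26x26 fine), 3 anchors per cell.
two_scale = (13 * 13 + 26 * 26) * 3
# Suspected alternative: three 13x13 layers with 5 anchors per cell.
single_scale = 13 * 13 * 5 * 3

print(two_scale, single_scale)  # both 2535
```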

GFallasRR commented 4 years ago

Hi @cnmckee, sorry for the slow reply.

I have been checking the TinyYoloV3 documentation and I found some information related to the implementation on GstInference.

```python
_ANCHORS = [(10, 14), (23, 27), (37, 58), (81, 82), (135, 169), (344, 319)]

detect_1 = _detection_layer(
    inputs, num_classes, _ANCHORS[3:6], img_size, data_format)
detect_1 = tf.identity(detect_1, name='detect_1')

detect_2 = _detection_layer(
    inputs, num_classes, _ANCHORS[0:3], img_size, data_format)
detect_2 = tf.identity(detect_2, name='detect_2')

detections = tf.concat([detect_1, detect_2], axis=1)
detections = tf.identity(detections, name='detections')
```
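To illustrate the shapes involved (my own sketch using NumPy rather than the actual TensorFlow graph, assuming the usual `(x, y, w, h, objectness, classes)` box layout): `detect_1` covers the coarse 13x13 grid with the three large anchors, `detect_2` covers the fine 26x26 grid with the three small anchors, and the concat joins them along the box axis.

```python
import numpy as np

num_classes = 80
box_attrs = 5 + num_classes  # x, y, w, h, objectness, class scores

# Coarse scale: 13*13 cells * 3 anchors = 507 boxes.
detect_1 = np.zeros((1, 13 * 13 * 3, box_attrs))
# Fine scale: 26*26 cells * 3 anchors = 2028 boxes.
detect_2 = np.zeros((1, 26 * 26 * 3, box_attrs))

# Concatenating along axis 1 gives the combined 2535-box output.
detections = np.concatenate([detect_1, detect_2], axis=1)
print(detections.shape)  # (1, 2535, 85)
```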

Those layers are defined in the model and can be seen with TensorBoard. The detections node has the outputs from detection_1 and detection_2.

TinyYoloV3 graph as shown in TensorBoard: (screenshot attached to the original comment)

I may be missing other details about the TinyYoloV3 implementation, but I will check other reference models to improve the plugin.

Any comment or extra information about the issue that you are facing will be appreciated. Thanks.