david8862 / keras-YOLOv3-model-set

end-to-end YOLOv4/v3/v2 object detection pipeline, implemented on tf.keras with different technologies
MIT License

Different performance of Tiny Yolov3 & Yolov3 lite in terms of localization #76

Open jienwan opened 4 years ago

jienwan commented 4 years ago

Hi David,

I have a general question about these two algorithms.

I used Tiny Yolov3 Mobilenet and also Yolov3 Lite Mobilenet to train on my dataset (image size 416*416, 3 classes). Typically, within a single image the objects have similar sizes, but the number of objects varies (it can be 40 or 30, or 20 or 10). The issue is that Tiny Yolov3 will draw a bbox in the middle of two objects, while Yolov3 Lite performs pretty well. It's as if two objects should get two bboxes, but Tiny Yolov3 produces only one bbox covering half of one object and half of the other. This does not happen for all images and all objects; for some images with 32 objects, around 4 of them get these wrong detections.

I tried replacing the 'Mobilenet' backbone with some other nets, but this still happens. So my question is: is this a common issue with the Tiny Yolov3 algorithm? Since it's a 'tiny' version, is the performance expected to be worse? Have you seen this happen before?

If it's not an issue with the algorithm itself, do you have any suggestions to improve the model's performance in localizing the objects?

Thanks so much!!

david8862 commented 4 years ago

@jienwan Performance of a trained model can be impacted by many factors, so it's quite hard to pinpoint the root cause of your problem in one word (a backbone issue, a tiny-family issue, etc.). Generally, if you want to improve bbox localization performance, DIoU/CIoU loss may be worth a try.
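For reference, DIoU extends plain IoU with a penalty on the distance between box centers, which directly targets the "one box lands between two objects" symptom. A minimal pure-Python sketch of the loss (the box format and function name are illustrative, not the repo's actual API):

```python
def diou_loss(box_pred, box_true):
    """DIoU loss for two axis-aligned boxes in (x1, y1, x2, y2) format.

    DIoU = IoU - rho^2 / c^2, where rho is the distance between the two
    box centers and c is the diagonal of the smallest enclosing box.
    Loss = 1 - DIoU, so a prediction whose center drifts away from the
    ground truth is penalized even at the same IoU.
    """
    # intersection area
    ix = max(0.0, min(box_pred[2], box_true[2]) - max(box_pred[0], box_true[0]))
    iy = max(0.0, min(box_pred[3], box_true[3]) - max(box_pred[1], box_true[1]))
    inter = ix * iy

    area_p = (box_pred[2] - box_pred[0]) * (box_pred[3] - box_pred[1])
    area_t = (box_true[2] - box_true[0]) * (box_true[3] - box_true[1])
    union = area_p + area_t - inter
    iou = inter / union if union > 0 else 0.0

    # squared distance between box centers
    cx_p = (box_pred[0] + box_pred[2]) / 2
    cy_p = (box_pred[1] + box_pred[3]) / 2
    cx_t = (box_true[0] + box_true[2]) / 2
    cy_t = (box_true[1] + box_true[3]) / 2
    rho2 = (cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2

    # squared diagonal of the smallest box enclosing both
    c2 = (max(box_pred[2], box_true[2]) - min(box_pred[0], box_true[0])) ** 2 + \
         (max(box_pred[3], box_true[3]) - min(box_pred[1], box_true[1])) ** 2

    return 1.0 - (iou - rho2 / c2)
```

CIoU adds one more term on top of this, penalizing the aspect-ratio mismatch between the two boxes.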

jienwan commented 4 years ago

Thank you for the quick response! I totally understand that the performance can be affected by lots of factors. It's just that I tested different backbones with Tiny Yolov3 and the issue persisted, so I suspect it's inherent to Tiny Yolov3. I'll check out CIoU and DIoU loss! Appreciate your help.

jienwan commented 4 years ago

@david8862 Just as a follow-up: I tried DIoU loss, but the model performance is similar to the one trained with IoU loss, so it seems the localization loss is not the deciding factor. I also tried using DIoU as the metric in NMS, but the predictions are still not improved. Maybe Tiny-Yolov3 is not very suitable for my data.
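For concreteness, "DIoU in NMS" means swapping the suppression metric: a lower-scoring box is discarded only when its DIoU (IoU minus a center-distance penalty) against an already-kept box exceeds the threshold, so boxes with distant centers survive more often than under plain IoU-NMS. A minimal greedy sketch (function names illustrative, not the repo's code):

```python
def _diou(a, b):
    """DIoU between two boxes in (x1, y1, x2, y2) format."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union if union > 0 else 0.0
    # squared center distance over squared enclosing-box diagonal
    rho2 = ((a[0] + a[2] - b[0] - b[2]) ** 2 + (a[1] + a[3] - b[1] - b[3]) ** 2) / 4.0
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 + \
         (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - rho2 / c2

def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that suppresses a box only when its DIoU against a
    kept box exceeds the threshold; crowded neighbors with offset
    centers are kept more often than under plain IoU-NMS."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if _diou(boxes[i], boxes[j]) < threshold]
    return keep
```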

In addition, I have one more question: will anchors affect how the model detects objects? I'm using the default anchors (the tiny_yolo3_anchors.txt file). Like I mentioned, my objects have similar sizes per image and there's not much variation (in terms of object size) from image to image. In the predictions, most of the bboxes are fine, but on the same test image I observed that some boxes are so much larger that they even enclose two separate objects. Can this be an issue with the choice of anchors? Is it worth trying customized anchors?

Sorry for asking such a specific question. I'm new to computer vision and deep learning. I found your repo very useful, and some of the models (like YoloV3 Lite) really work well. But for now, for certain reasons, I can only use the Tiny-Yolo family, so I'm struggling to achieve good performance. I'll really appreciate your help! Thanks!

david8862 commented 4 years ago

Have you ever compared the mAP performance between Tiny YOLOv3 & YOLOv3 Lite? From the description I guess your problematic objects may be a bit small in the image, since one of the disadvantages of the Tiny YOLOv3 family is its limited capability on small objects, due to the lack of a prediction head at the (input_shape/8) scale. If so, maybe you can adjust the Tiny YOLOv3 head to pick the feature map at (input_shape/8) size instead of (input_shape/16), and merge it with the top layer in the FPN to improve small-object performance. Then a customized anchor set should also be necessary.
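If you go the custom-anchor route, the standard YOLO recipe is k-means over your ground-truth boxes' (width, height) pairs with 1 − IoU as the distance. A minimal NumPy sketch of that idea (this is an illustration, not the repo's anchor tool):

```python
import numpy as np

def kmeans_anchors(wh, k=6, iters=100, seed=0):
    """Cluster (width, height) pairs of ground-truth boxes into k anchors,
    using 1 - IoU (computed with centers aligned) as the distance.
    wh: float array of shape (N, 2). Returns k anchors sorted by area.
    """
    rng = np.random.default_rng(seed)
    anchors = wh[rng.choice(len(wh), k, replace=False)]
    for _ in range(iters):
        # IoU between every box and every anchor, centers aligned: (N, k)
        inter = np.minimum(wh[:, None, 0], anchors[None, :, 0]) * \
                np.minimum(wh[:, None, 1], anchors[None, :, 1])
        union = (wh[:, 0] * wh[:, 1])[:, None] + \
                anchors[:, 0] * anchors[:, 1] - inter
        # assign each box to its highest-IoU (lowest 1-IoU) anchor
        assign = np.argmax(inter / union, axis=1)
        new = np.array([wh[assign == j].mean(axis=0) if np.any(assign == j)
                        else anchors[j] for j in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors[np.argsort(anchors.prod(axis=1))]
```

Since your objects have similar sizes within an image and across images, the clustered anchors should end up tightly grouped around those sizes, which makes it less likely that a single oversized anchor box spans two objects.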