AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Upper limit of data required for training #499

Open shuchitagupta opened 6 years ago

shuchitagupta commented 6 years ago

Hi,

@AlexeyAB I need to detect a very difficult object, a street light. Could you please suggest an upper limit on the amount of training data required for one class, with a network input size of 832x832?

Thanks, Shuchita

AlexeyAB commented 6 years ago

@shuchitagupta Hi,

what will be the upper limit of the data required for training

Do you mean how many images are required? At least 2000 per class. The more, and the more varied, the better.
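As a rough sketch of how to check whether your dataset meets this guideline, the following counts objects per class across darknet-format annotation files (each `.txt` file contains one `class_id cx cy w h` line per object); `count_labels_per_class` is a hypothetical helper, not part of darknet:

```python
from collections import Counter
from pathlib import Path

def count_labels_per_class(labels_dir):
    """Count darknet-format labels (one 'class_id cx cy w h' line per
    object) across all .txt annotation files in labels_dir."""
    counts = Counter()
    for txt in Path(labels_dir).glob("*.txt"):
        for line in txt.read_text().splitlines():
            if line.strip():
                counts[int(line.split()[0])] += 1
    return counts
```

Classes that fall well below ~2000 counted objects are candidates for collecting more (and more varied) images.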

shuchitagupta commented 6 years ago

Also, should every class have an equal number of samples? And if I train with negative samples, will the network learn from them too? How many such images would I need?

AlexeyAB commented 6 years ago

Also, should every class have equal number of samples?

Not necessarily, but it is desirable.

I use about 2000 images per class (very approximately); for example, for 6 classes I use 12,000 images. In addition, I use another 12,000 images as negative samples.
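In this repo, a negative sample is typically an image that is listed in train.txt but whose label `.txt` file is empty, so the network learns that it contains no objects. A minimal sketch of preparing such files (`negatives/` and the sample filename are hypothetical):

```shell
# Hypothetical layout: negatives/ holds background-only images.
mkdir -p negatives && : > negatives/street_no_light.jpg   # stand-in image

# For every negative image, create an empty darknet label file;
# an empty .txt means "no objects in this image".
for img in negatives/*.jpg; do
    touch "${img%.jpg}.txt"
done
```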

shuchitagupta commented 6 years ago

Thanks a lot. Also, what are your thoughts on YOLOv3? Will that do a better job?

TheExDeus commented 6 years ago

Should there be an equal number of images or an equal number of labels for each class? An equal number of labels has worked quite well for me, but that is not always possible. I now have a domain with about 5 labels per image: 2 labels for one class, 2 for another, and 1 for the third. Since darknet doesn't allow varying the learning weight/impact per label, I cannot set something like 0.30 for class 1, 0.30 for class 2, and 0.40 for class 3. So it is very possible that the third class is learned worse than the first two.

I also plan to augment my data by filling some labeled regions with noise, effectively deleting them. I had about 3k images with 4 labels in a specific pattern, and the network ended up relying on the other labels: even if one wasn't found, it became very probable that the remaining 3 wouldn't be found either, even though they were visible. Is deleting a label and filling its region with noise good practice?
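The noise-filling idea above could be sketched like this, assuming NumPy images and darknet-style relative boxes (`fill_box_with_noise` is a hypothetical helper, not part of darknet; whether this augmentation actually helps is exactly the open question):

```python
import numpy as np

def fill_box_with_noise(image, box, rng=None):
    """Replace a labeled region with uniform noise so the detector cannot
    rely on that object's presence when predicting the others.
    `image` is an HxWxC uint8 array; `box` is a darknet-style
    (cx, cy, w, h) tuple in relative [0, 1] coordinates."""
    rng = rng or np.random.default_rng()
    H, W = image.shape[:2]
    cx, cy, w, h = box
    x0, x1 = int((cx - w / 2) * W), int((cx + w / 2) * W)
    y0, y1 = int((cy - h / 2) * H), int((cy + h / 2) * H)
    region = image[y0:y1, x0:x1]
    image[y0:y1, x0:x1] = rng.integers(0, 256, size=region.shape,
                                       dtype=np.uint8)
    return image
```

The corresponding label line would also be removed from the `.txt` file, so the noisy patch is treated as background.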

AlexeyAB commented 6 years ago

@TheExDeus On the one hand this can be a problem; on the other hand, there is a more significant issue: the diversity of object appearance (the more diverse the objects look, the better), which introduces an imbalance here too. And that problem cannot be solved by adjusting coefficients such as 0.3, 0.3, 0.4...

One solution to both of these problems is Focal Loss: https://arxiv.org/abs/1708.02002v2 I added Focal Loss to the [yolo] layer for Yolo v3 in the last commit, so you can update the code from this repo, add focal_loss=1 to each of the 3 [yolo] layers, and train again. Then compare mAP (mean average precision) for training with and without focal loss. What mAP can you get now?
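For reference, the focal loss from the linked paper (Lin et al., arXiv:1708.02002) down-weights easy, well-classified examples so that hard and rare ones dominate the gradient. A minimal sketch of the formula (darknet's in-layer implementation differs in detail; this is just the paper's definition):

```python
import math

def focal_loss(p_t, gamma=2.0, alpha=0.25):
    """Focal loss from Lin et al.:
        FL(p_t) = -alpha * (1 - p_t)**gamma * log(p_t),
    where p_t is the predicted probability of the true class.
    With gamma = 0 this reduces to alpha-weighted cross-entropy."""
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)
```

An easy example (p_t close to 1) thus contributes almost nothing, while a hard example (p_t close to 0) keeps nearly its full cross-entropy loss, which is what counteracts class imbalance.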

But there are two variants of Focal Loss in the code, and I haven't had time to check which of them is correct: https://github.com/AlexeyAB/darknet/blob/943f6e874b819271a87665cf41199388380989a0/src/yolo_layer.c#L128-L129 So you should train twice, once with each variant, and compare mAP. (Just comment out one of these lines and uncomment the other, then re-compile with make and train from the beginning.)

TheExDeus commented 6 years ago

I train Yolo v2, as I need more performance on a Jetson. I could try training Yolo v3 just as a test, though.

AlexeyAB commented 6 years ago

@TheExDeus Yolo v2 has the same implementation of Focal loss, so you can try to use it in Yolo v2: https://github.com/AlexeyAB/darknet/blob/943f6e874b819271a87665cf41199388380989a0/src/region_layer.c#L139-L140

TaihuLight commented 6 years ago

@TheExDeus @AlexeyAB Is the mAP improved after training with focal loss for YOLOv3? https://github.com/AlexeyAB/darknet/issues/424 https://groups.google.com/forum/#!topic/darknet/0yFZGIOvHj8

Cartucho commented 6 years ago

@TaihuLight According to the YOLOv3 paper, no:

Focal loss. We tried using focal loss. It dropped our mAP about 2 points. YOLOv3 may already be robust to the problem focal loss is trying to solve because it has separate objectness predictions and conditional class predictions. Thus for most examples there is no loss from the class predictions? Or something? We aren't totally sure.

TaihuLight commented 6 years ago

I want to know the results of models trained with this repo, since I hope focal loss improves the performance of YOLO. However, the author of YOLOv3 says otherwise, as quoted above. @Cartucho

kmsravindra commented 6 years ago

@AlexeyAB, I am using yolov3-voc.cfg. Should I just add the line "focal_loss=1" to each of the [yolo] layers? Do you think this might help balance out the many negative background samples vs. the limited objects? Anyway, I will train and check with this cfg addition, but I just wanted to gather your thoughts... thanks!

AlexeyAB commented 6 years ago

@kmsravindra Yes, you can try it; focal_loss is implemented for Yolo v3, but Joseph said that it doesn't help. Just set focal_loss=1 in each of the [yolo] layers.
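As a sketch, each [yolo] section in yolov3-voc.cfg would then look something like this (the added line is focal_loss=1; the other values are the stock yolov3-voc settings quoted from memory, so verify them against your own cfg):

```ini
[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=20
num=9
jitter=.3
ignore_thresh=.5
truth_thresh=1
random=1
focal_loss=1
```

The same line is repeated in all three [yolo] sections of the file, since each detection scale has its own layer.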