KAIIUK opened this issue 6 years ago
@KAIIUK Hi,
What mAP do you get on the training dataset, and on any validation dataset?
> it does not work at all on random video (i-leeds for example)
What is "i-leeds"?
The main reason it may not work properly on another dataset/video is that this rule is not respected: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
General rule - your training dataset should include a set of relative object sizes matching those of the objects you want to detect:
train_network_width * train_obj_width / train_image_width ~= detection_network_width * detection_obj_width / detection_image_width
train_network_height * train_obj_height / train_image_height ~= detection_network_height * detection_obj_height / detection_image_height
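As a sanity check, both sides of the rule can be computed directly. The numbers below are made up purely for illustration (network width 416, hypothetical image and object widths):

```shell
# Hypothetical example of the relative-size rule (all numbers invented).
# Training: network width 416, training images 1280 px wide, objects ~320 px wide.
train_rel=$((416 * 320 / 1280))    # relative object size seen during training
# Detection: same network width 416, video frames 640 px wide, objects ~160 px wide.
detect_rel=$((416 * 160 / 640))    # relative object size seen at detection time
echo "train=$train_rel detect=$detect_rel"
```

Here both sides come out to 104, so objects of this relative size in the video should be detectable; if the two values diverge strongly, detection quality drops.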
3 general reasons why a neural network may not detect objects:
- different scales and aspect ratios - to solve it, use: random=1 jitter=0.3 (or jitter=0.45 - but then you should train for 10 times more iterations)
- different colours - to solve it, use: exposure=1.5, saturation=1.5, hue=0.1 (in general: exposure>1, saturation>1, hue>0)
- different rotations - this kind of data augmentation isn't implemented yet for Yolo
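For orientation, this is a sketch of where those settings live in a darknet `.cfg` file (values taken from the advice above; all other lines omitted): the colour-augmentation parameters go in the `[net]` section, while jitter and multi-scale training are set per `[yolo]` layer.

```ini
# Sketch only - not a complete cfg. Colour augmentation in [net]:
[net]
saturation=1.5
exposure=1.5
hue=0.1

# ... all network layers omitted ...

# Scale/aspect-ratio augmentation in each [yolo] layer:
[yolo]
jitter=0.3
random=1
```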
That is, differences between the training dataset and the test dataset/video.
@AlexeyAB I just realized that I asked a rather stupid question, because the model is likely overfitting: there are only about 15 examples of the object, with about 20k images for each. I will also follow your advice and see what happens. And another question - what will happen if I use the yolov3 weights as the base network? Thank you.
> I have a model trained for 1 class on 400000 images ... there are only about 15 examples of the object and about 20k images for each
So you have about 15 objects with ~20k images for each. But what are the "15 examples"? You shouldn't use identical images.
It is better to use yolov3.conv.81 than yolov3.weights, because if you train with yolov3.weights -clear, then all weights starting from this layer will be shifted: https://github.com/AlexeyAB/darknet/blob/2bac3681fcecb225c5e5a6376d7f95835cd8f89e/cfg/yolov3.cfg#L599-L604 - since you have a different number of classes than in yolov3.cfg (80).
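The linked lines are the 1x1 convolutional layer feeding a [yolo] head; its filter count depends on the class count via filters = (classes + 5) * 3, which is why those weights cannot be reused as-is. A sketch following yolov3.cfg's convention:

```ini
# Sketch of the layer pair linked above (other parameters omitted).
[convolutional]
size=1
stride=1
pad=1
filters=255       # (classes + 5) * 3 = (80 + 5) * 3 = 255 for COCO's 80 classes
activation=linear

[yolo]
classes=80
```

For a 1-class model, filters would become (1 + 5) * 3 = 18, so this layer's shape no longer matches the pretrained yolov3.weights; cutting the weights at layer 81 (yolov3.conv.81) avoids loading the mismatched layers.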
To get the yolov3.conv.81 file, just run:
./darknet partial cfg/yolov3.cfg yolov3.weights yolov3.conv.81 81
then train:
./darknet detector train data/obj.data yolo-obj.cfg yolov3.conv.81
or
./darknet detector train data/obj.data yolo-obj.cfg darknet53.conv.74
@AlexeyAB Of course, I did not mean plain yolov3.weights - I just wanted to know whether the conv weights from v3 are better than darknet53.conv.74. Many thanks!
@KAIIUK
yolov3.conv.81 is better if your objects look like MS COCO objects; darknet53.conv.74 is better for the remaining cases.

@AlexeyAB, could you please explain how I can figure out which dataset (MS COCO vs Pascal VOC) my images/objects are more closely aligned with? Is it determined by the aspect ratio of the images, or the aspect ratio of the objects? Thanks!
Hi Aleksey. I admire your work, but now I have an extremely important question, or rather a request for advice. I have a model trained for 1 class on 400,000 images (both negative and positive ones included), but basically they are all very similar. It works excellently on test videos that have roughly the same objects and background, but it does not work at all on random videos (i-leeds, for example). Please advise how to improve the results, and whether it makes sense to mix in COCO or VOC, for example.