SergeiSamuilov opened this issue 6 years ago
Try to comment out these 3 lines:
Then train from the beginning for at least 6000 iterations.
Then check mAP.
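The 3 lines being referred to aren't quoted in this thread, but in a darknet .cfg file a setting is disabled by commenting it out with a leading #. An illustrative fragment (the surrounding values are placeholders, not taken from the actual cfg):

```
[convolutional]
# xnor=1        <- commented out: this layer now trains at full precision
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
```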
Thank you for your advice. Btw, regular yolov3-tiny showed great performance. Before I start training the xnor-net model with the new settings, could you please answer some more questions about the dataset:
1) Should I include negative samples in the validation set as well? I have 11k face images and the same number of true-negative (background) images, but the validation set consists of only 3k true-positive face images.
2) I limited the maximum number of faces in one image to 20. Should I use the parameter max=200 stated in the tutorial, if it's applicable to the tiny model at all?
@SergeiSamuilov
> Should I include negative samples in the validation set as well? I have 11k face images and the same number of true-negative (background) images, but the validation set consists of only 3k true-positive face images.
As you want.
> I limited the maximum number of faces in one image to 20. Should I use the parameter max=200 stated in the tutorial, if it's applicable to the tiny model at all?
What do you mean? All objects in the image should be labeled.
You should set max=200 only if there are more than 90 objects in a training image: https://github.com/AlexeyAB/darknet/blob/2c5e383c04655fe45f3f533eb3a69a80acbf3561/src/parser.c#L278
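The max key goes in each [yolo] section of the cfg; an illustrative fragment (the mask/anchors values here are placeholders, not tuned for this dataset):

```
[yolo]
mask = 3,4,5
anchors = 10,14,  23,27,  37,58,  81,82,  135,169,  344,319
classes=1
num=6
max=200
```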
Thank you again, Alexey. I've trained the model using the proposed settings, but I still get low mAP and high loss (mAP 12%, avg loss 3.5, trained for >10k iterations). Assuming that such poor results could be caused by unsuitable training material, I tried different datasets (IMDB face dataset, WIDERface) and different maximum numbers of faces per image (max = 1 face, 20 faces, 90 faces), but still got similar results.
> I limited the maximum number of faces in one image to 20. Should I use the parameter max=200 stated in the tutorial, if it's applicable to the tiny model at all?
>
> What do you mean? All objects in the image should be labeled. You should set max=200 only if there are more than 90 objects in a training image.
I manually handpicked all the images meeting the criteria by parsing the annotation files. Of course all the objects were labeled; I simply didn't use the images that contained more objects.
Are there any other ways to improve the performance of the xnor model, such as fine-tuning or enhancing the dataset? And sorry for being persistent; if there's nothing more I can do with the xnor model, I'll just stick to the default yolov3-tiny, which works great.
@SergeiSamuilov Hi,
Try to train this model: yolov3-tiny_fp32_xnor.cfg.txt
Also set random=1 in both [yolo] layers.
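random=1 enables multi-scale training (the network input is resized during training). As an illustration, the end of each [yolo] section would look like this (the other values are placeholders):

```
[yolo]
...
ignore_thresh = .7
truth_thresh = 1
random=1
```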
I've got mAP = 87.59 % using yolov3-tiny_fp32_xnor.cfg after 10,000 iterations on my own dataset, while I've got mAP = 90.77 % using the common yolov3-tiny.cfg.
Command for training:
darknet.exe detector train data/obj.data yolov3-tiny_fp32_xnor.cfg yolov3-tiny.conv.15
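After training, mAP on the validation set can be measured with the same binary; a command sketch, assuming the default backup directory and final-weights naming (adjust the paths to your setup):

```
darknet.exe detector map data/obj.data yolov3-tiny_fp32_xnor.cfg backup/yolov3-tiny_fp32_xnor_final.weights
```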
But I have a small number of small objects.
yolov3-tiny_fp32_xnor.cfg - avg loss 0.42
yolov3-tiny.cfg - avg loss 0.26

@AlexeyAB I checked the yolov3-tiny_fp32_xnor.cfg.txt you uploaded above. In this file there are 6 places where xnor=1 is commented out as #xnor=1. Based on the instructions I understand the first 5 places, but not the last one on line 159: if, as you suggest, commenting out the xnor=1 before the last yolo detection layer gives higher mAP, then the last commented line should be line 174, not line 159. Is this comment on the wrong line a bug?
I trained a tiny yolov3 model on one class (face) based on this cfg file: https://github.com/AlexeyAB/darknet/blob/master/cfg/yolov3-tiny_xnor.cfg As instructed, I used this command to get initial weights for training: darknet.exe partial yolov3-tiny-xnor-obj.cfg yolov3-tiny.weights yolov3-tiny.conv.15 15. When running inference with the trained model, I got very few detections and low mAP:
detections_count = 26375, unique_truth_count = 968
class_id = 0, name = face, ap = 27.35 %
for thresh = 0.25, precision = 0.78, recall = 0.14, F1-score = 0.24
for thresh = 0.25, TP = 134, FP = 37, FN = 834, average IoU = 56.16 %
mean average precision (mAP) = 0.273530, or 27.35 %
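The reported precision, recall and F1 follow directly from the TP/FP/FN counts in that log; a quick sanity check in plain Python, using the numbers above:

```python
# Detection counts reported by darknet at thresh = 0.25
tp, fp, fn = 134, 37, 834

precision = tp / (tp + fp)   # 134 / 171: share of detections that were real faces
recall = tp / (tp + fn)      # 134 / 968: share of ground-truth faces found
f1 = 2 * precision * recall / (precision + recall)

print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
# precision=0.78 recall=0.14 F1=0.24 -- matches the log; the problem is recall
```

The numbers show the model is fairly precise when it does fire, but misses about 86 % of the faces, which is why the mAP is so low.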
The model was trained for 2400 iterations (avg loss ~1.4, unchanged since iteration 2000); the dataset included 3200 training images and 850 validation images. I also generated 6 anchors with calc_anchors.
Could you please clarify the concept of training an xnor-net model and suggest how I can improve this model's results?
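For context, calc_anchors clusters the (width, height) of all ground-truth boxes into num_of_clusters groups. A minimal sketch of the idea in Python, using plain Euclidean k-means on made-up relative box sizes (darknet actually uses an IoU-based distance, so real anchors will differ):

```python
import random

def kmeans_anchors(boxes, k=6, iters=50, seed=0):
    """Naive k-means on (w, h) pairs with Euclidean distance."""
    random.seed(seed)
    centers = random.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            i = min(range(k),
                    key=lambda c: (w - centers[c][0])**2 + (h - centers[c][1])**2)
            clusters[i].append((w, h))
        # Recompute each center as the mean of its cluster (keep old center if empty)
        centers = [
            (sum(w for w, _ in cl) / len(cl), sum(h for _, h in cl) / len(cl))
            if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return sorted(centers)

# Made-up relative (w, h) labels standing in for a real dataset
boxes = [(random.uniform(0.02, 0.4), random.uniform(0.03, 0.5)) for _ in range(500)]
# Scale to a 416x416 network input, as anchors are given in pixels
anchors = [(round(w * 416), round(h * 416)) for w, h in kmeans_anchors(boxes)]
print(anchors)  # 6 (w, h) pairs to paste into the anchors= line of each [yolo] layer
```

If the anchors come out much larger or smaller than the typical face sizes in the training images, detection quality suffers, which is one thing worth re-checking here.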