The mAP decreases during training

RunchenWei commented 4 years ago

NEED HELP!!! I am training on the Caltech Pedestrian Dataset. The training data (caltechx1 for 4k, caltechx10 for 40k) and testing data (caltech_test for 4k) are converted to VOC format, and I have checked the annotation files with YOLO_mark; they are correct. The pretrained model is darknet53.conv.74. I only use a person label: the dataset has 4 labels by default (person, people, person? and person-far), and I merged the four labels into a single person label. But during training, the validation-set mAP decreases after 5000 iterations. Here are the loss curve and the mAP curve. I have trained on the dataset more than 50 times, and the mAP always keeps decreasing after 5000 iterations. I am not sure whether it is because of overfitting.

[image: chart]

What I tried but failed:

Recalculate anchor box for caltech dataset, no use.
Add new anchors to YOLO layer, no use.
Use letter box for input and valid, or do not use letter box, no use.
Use YOLOv3, YOLOv3-SPP, YOLO-tiny, YOLO+PAN..., gaussian loss, mse loss, GIoU loss..., no use.
Change different input size (640x480, 416x416, 608x608, 544x544), no use.
I downloaded darknet after Dec 10th. Waiting online for help... @AlexeyAB
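
The label merge described in the post above (four Caltech labels collapsed into one person class) could be sketched as follows. This is only an illustration: it assumes the annotations are already in the Darknet text format `<class_id> <x_center> <y_center> <width> <height>`, and the function name `merge_to_single_class` is hypothetical, not part of darknet or YOLO_mark.

```python
def merge_to_single_class(label_text: str, target_id: int = 0) -> str:
    """Rewrite every annotation line so that its class id becomes target_id.

    Assumes Darknet-style lines: "<class_id> <x_center> <y_center> <w> <h>".
    Blank or malformed lines are dropped.
    """
    merged = []
    for line in label_text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        # Keep the box coordinates untouched, replace only the class id.
        merged.append(" ".join([str(target_id)] + parts[1:]))
    return "\n".join(merged)


# Hypothetical label file content with class ids 0, 2 and 3.
example = "0 0.50 0.50 0.10 0.30\n2 0.20 0.40 0.05 0.15\n3 0.80 0.60 0.02 0.06"
print(merge_to_single_class(example))
```

Every box is kept and only the class id changes, so a check of the merged files should show the same boxes as before under a single label.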

AlexeyAB commented 4 years ago

What command do you use for training? Attach your cfg-file in zip. Did you compile with OpenCV?

Use YOLOv3, YOLOv3-SPP, YOLO-tiny, YOLO+PAN..., gaussian loss, mse loss, GIoU loss..., no use.

Did it increase accuracy? But still there was a decrease in accuracy after 5000 iterations?

Change different input size (640x480, 416x416, 608x608, 544x544), no use.

Did it increase accuracy?

RunchenWei commented 4 years ago

I use './darknet detector train caltech/caltechx1.data caltech/yolov3-baseline/yolov3-caltech.cfg darknet53.conv.74 -dont_show -map -letter_box' for training. Yes, I compiled with OpenCV. All the models above increased the accuracy, but there was still a decrease in accuracy after 5000 iterations.

myChart.zip

AlexeyAB commented 4 years ago

It seems that AP@75 - AP@90 increases but AP@50 decreases during training.

Try to set iou_normalizer=0.25 in each [yolo] layer, use letter_box=1 in [net] and train, does it help? Show chart png.
Also try to train yolov3-spp-gaussian for longer, without iou_normalizer but with letter_box=1 in [net], to get the full chart.png
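
The two cfg changes suggested above (iou_normalizer=0.25 in each [yolo] layer, letter_box=1 in [net]) can be applied by hand; a small script is one way to avoid missing a [yolo] section. The sketch below is an assumption-laden helper, not a darknet tool: `set_cfg_option` is a hypothetical name, and it works line by line because a Darknet cfg repeats section names, which Python's `configparser` does not accept.

```python
def set_cfg_option(cfg_text: str, section: str, key: str, value: str) -> str:
    """Set key=value in every [section] block of a Darknet-style cfg string."""
    header = f"[{section}]"
    out = []
    in_target = False  # True while scanning a block named `section`
    found = False      # True once `key` was rewritten in the current block

    def flush_missing():
        # If the block that just ended never defined `key`, append it,
        # placing it before any trailing blank lines of that block.
        if in_target and not found:
            idx = len(out)
            while idx > 0 and out[idx - 1].strip() == "":
                idx -= 1
            out.insert(idx, f"{key}={value}")

    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            flush_missing()
            in_target = stripped == header
            found = False
        elif in_target and stripped.split("=")[0].strip() == key:
            line = f"{key}={value}"
            found = True
        out.append(line)
    flush_missing()
    return "\n".join(out)


# Tiny hypothetical cfg; a real yolov3 cfg has many more sections and keys.
cfg = "[net]\nbatch=64\n\n[yolo]\nclasses=1\n\n[yolo]\nclasses=1\niou_normalizer=0.5\n"
cfg = set_cfg_option(cfg, "yolo", "iou_normalizer", "0.25")
cfg = set_cfg_option(cfg, "net", "letter_box", "1")
print(cfg)
```

Existing values are overwritten and missing ones are appended to the end of each matching section, so it is worth diffing the result against the original cfg before training.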

[image: chart]

RunchenWei commented 4 years ago

I have tried it; the mAP keeps decreasing every 10k iterations, finally down to 40%... I also set 'iou_normalizer=0.25', 'iou_normalizer=0.5' and 'iou_normalizer=0.1', but it still decreases. Do you have any idea about this issue?

AlexeyAB commented 4 years ago

I have tried it; the mAP keeps decreasing every 10k iterations, finally down to 40%...

Show these 2 charts:

Can you show chart.png for yolov3-spp-gaussian without iou_normalizer but with letter_box=1 ?
Change 0.5 to 0.75 here: https://github.com/AlexeyAB/darknet/blob/63396082d7e77f4b460bdb2540469f5f1a3c7c48/src/detector.c#L284 then recompile and show another chart.png for yolov3-spp-gaussian without iou_normalizer but with letter_box=1

RunchenWei commented 4 years ago

You are exactly right. In my cfg, ignore_thresh is 0.7, but during training the validation iou_thresh is 0.5; that is the reason why the mAP decreases. The most important thing, I think, is that ignore_thresh=0.7 is too high for our model. I am a big fan of yours. Thank you!!!
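
To illustrate why the validation IoU threshold matters in this thread (AP@50 falling while AP@75 to AP@90 rose), here is a minimal sketch of how one and the same detection can count as a hit at one threshold and a miss at another. It is not Darknet's evaluation code: the helper names, the (x1, y1, x2, y2) box format, and the strict `>` comparison are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, truth, iou_thresh):
    """A detection matches the ground truth if its IoU exceeds the threshold."""
    return iou(pred, truth) > iou_thresh


# Hypothetical pedestrian box and a prediction shifted 2 px to the right.
truth = (0.0, 0.0, 10.0, 20.0)
pred = (2.0, 0.0, 12.0, 20.0)

print(iou(pred, truth))                      # overlap 160 / union 240
print(is_true_positive(pred, truth, 0.50))   # counted as a match
print(is_true_positive(pred, truth, 0.75))   # counted as a miss
```

Because each threshold counts a different set of detections as matches, the mAP curves at 0.5 and at 0.75 can move in different directions, which is why the threshold used for the validation chart has to be read together with the cfg settings.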

AlexeyAB / darknet

The mAP decreases during training #4536