HouBiaoLiu commented 4 years ago

I train on the COCO dataset (80 classes). Is max_batches in yolov4.cfg too big, given that 2000*classes is 160K? I use 4 GPUs. From 4K to 50K iterations the avg loss stays at about 8.4, but the mAP increases from 45% to 57%. (Attached chart: chart_yolov4)

HouBiaoLiu commented 4 years ago

@AlexeyAB

I use yolov4.cfg. How can I get a lower loss and a higher mAP, i.e. the 62.8% mAP@0.5 reported in the paper?

batch=64
subdivisions=16
# Training
#width=512
#height=512
width=416
height=416
channels=3
momentum=0.949
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.00065 (I use 4 GPUs)
burn_in=4000
max_batches = 500500
policy=steps
steps=400000,450000
scales=.1,.1

#cutmix=1
mosaic=1
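For reference, the learning-rate schedule these settings imply can be sketched roughly as below. This is a minimal sketch, not Darknet's actual code: it assumes that policy=steps multiplies the rate by the next entry in scales each time a value in steps is reached, and that the warm-up during burn_in is polynomial with exponent 4 (the `power` default is an assumption here). The function name `current_rate` is illustrative only.

```python
def current_rate(iteration: int,
                 learning_rate: float = 0.00065,
                 burn_in: int = 4000,
                 steps: tuple = (400000, 450000),
                 scales: tuple = (0.1, 0.1),
                 power: float = 4.0) -> float:
    """Approximate learning rate at a given iteration (assumed schedule)."""
    # Warm-up: ramp from 0 up to learning_rate over the first burn_in iterations.
    # The exponent `power` is an assumed default, not taken from the thread.
    if iteration < burn_in:
        return learning_rate * (iteration / burn_in) ** power

    # Steps policy: apply one scale for every step boundary already reached.
    rate = learning_rate
    for step, scale in zip(steps, scales):
        if step > iteration:
            return rate
        rate *= scale
    return rate


if __name__ == "__main__":
    for it in (2000, 50000, 400000, 460000):
        print(it, current_rate(it))
```

Under these assumptions the rate is constant between iterations 4000 and 400000, which covers the whole 4K to 50K range mentioned above.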

AlexeyAB commented 4 years ago

I train on coco dataset 80 classes, is max_batches in yolov4.cfg too big, and 2000*classes is 160K?

https://github.com/AlexeyAB/darknet#when-should-i-stop-training

Usually 2000 iterations per class (object) are sufficient, but no fewer than the number of training images and no fewer than 6000 iterations in total. For a more precise criterion for when to stop training, follow the manual at the link above.
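The rule of thumb above amounts to taking the largest of three lower bounds. A minimal sketch follows; the function name `min_iterations` is illustrative, and the image count in the example is a placeholder rather than the real size of the COCO training list.

```python
def min_iterations(num_classes: int, num_train_images: int) -> int:
    """Lower bound on training iterations from the rule of thumb:
    2000 per class, at least one per training image, at least 6000."""
    return max(2000 * num_classes, num_train_images, 6000)


if __name__ == "__main__":
    # 80 COCO classes; 100_000 is a placeholder image count.
    print(min_iterations(80, 100_000))  # 160000
```

This is only a minimum. It does not contradict the much larger max_batches that the stock yolov4.cfg uses for MS COCO.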

I use yolov4.cfg, how can I get lower loss and higher map 62.8% mAP@0.5 as the paper said?

You shouldn't change anything in the cfg-file except subdivisions=. Keep in mind that a higher subdivisions= value gives lower accuracy (AP):

[net]
batch=64
subdivisions=8 (or 16, or 32)
width=512
height=512

https://github.com/AlexeyAB/darknet/wiki/Train-Detector-on-MS-COCO-(trainvalno5k-2014)-dataset

(change width=512 height=512 in cfg-file)

mini_batch_size = batch / subdivisions, so a higher subdivisions= value means a smaller mini-batch and lower accuracy (AP):

for 32 GB GPU-VRAM set subdivisions=8 in cfg-file

for 16-24 GB GPU-VRAM set subdivisions=16 in cfg-file

for 8-12 GB GPU-VRAM set subdivisions=32 in cfg-file (if Out Of Memory occurs - set random=1.34 for [yolo] layers)
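The relationship between batch, subdivisions and the mini-batch, together with the VRAM guidance above, can be sketched as follows. The helper names are illustrative. The thresholds follow the three lines above; treating memory sizes that fall between the listed ranges (for example 28 GB) as the next lower tier is an assumption.

```python
def mini_batch_size(batch: int, subdivisions: int) -> int:
    """Images processed per forward/backward pass."""
    if batch % subdivisions != 0:
        raise ValueError("batch must be divisible by subdivisions")
    return batch // subdivisions


def suggested_subdivisions(vram_gb: float) -> int:
    """Map GPU memory to the subdivisions value suggested in the thread."""
    if vram_gb >= 32:
        return 8
    if vram_gb >= 16:   # covers the 16-24 GB tier (and, by assumption, up to 32)
        return 16
    if vram_gb >= 8:    # covers the 8-12 GB tier (and, by assumption, up to 16)
        return 32
    raise ValueError("less than 8 GB is not covered by the guidance above")


if __name__ == "__main__":
    for vram in (32, 16, 11):
        sub = suggested_subdivisions(vram)
        print(vram, "GB ->", "subdivisions =", sub,
              "mini-batch =", mini_batch_size(64, sub))
```

With batch=64 this gives mini-batches of 8, 4 and 2 images for the three tiers, which is why the smaller-memory settings are expected to cost some accuracy.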

Li505358678 commented 4 years ago

I want to know about chart.png. I trained with coco2014. How many iterations do I need? I can't see the loss on chart.png; it's short.

AlexeyAB commented 4 years ago

You should train for 500 000 iterations on MS COCO: https://github.com/AlexeyAB/darknet/blob/0d764e4ffbb9d27e645d9d0a2cbd0a0d589eb534/cfg/yolov4.cfg#L19

Li505358678 commented 4 years ago

What's more, about valid: I saw valid = coco_testdev in coco.data. What is this? Should I set it to the same file as train, which uses trainvalno5k.txt, or use 5k.txt? Judging by the name, I think I should use the former.

Li505358678 commented 4 years ago

I'm sorry, I was lazy. I opened them just now and I see.

AlexeyAB commented 4 years ago

Read: https://github.com/AlexeyAB/darknet/wiki/Train-Detector-on-MS-COCO-(trainvalno5k-2014)-dataset

AlexeyAB / darknet

yolov4 coco train avg loss can not change but map normal #5574
