AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )

http://pjreddie.com/darknet/


it's overfitting?? #4916

Open EtheneXiang opened 4 years ago

EtheneXiang commented 4 years ago

I trained a 19-class detector with csresnext50-panet-spp-original-optimal.cfg, but something looks wrong. According to the plot, is it overfitting? @AlexeyAB

AlexeyAB commented 4 years ago

Either it is overfitting,
or AP50 is decreasing while AP75 is increasing.

EtheneXiang commented 4 years ago

My training set has over 80k images containing 800k objects, and I set these parameters in the cfg: angle=7 saturation=1.5 exposure=1.5 hue=.1 mosaic=1
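For readers unfamiliar with these settings, here is a rough sketch of the color-jitter ranges they imply. It assumes the commonly described Darknet behavior: saturation=1.5 and exposure=1.5 mean a random factor between 1/1.5 and 1.5, and hue=.1 means a random shift between -0.1 and 0.1. The function names are hypothetical and this is not Darknet's actual code; angle and mosaic are left out.

```python
import random


def rand_scale(s, rng):
    """Pick a factor in [1, s], then invert it half of the time (assumed behavior)."""
    scale = rng.uniform(1.0, s)
    return scale if rng.random() < 0.5 else 1.0 / scale


def sample_color_jitter(saturation, exposure, hue, rng):
    """Draw one set of per-image color-jitter values (hypothetical helper)."""
    return {
        "saturation_factor": rand_scale(saturation, rng),
        "exposure_factor": rand_scale(exposure, rng),
        "hue_shift": rng.uniform(-hue, hue),
    }


rng = random.Random(0)
jitter = sample_color_jitter(saturation=1.5, exposure=1.5, hue=0.1, rng=rng)
print(jitter)
```

Under these assumptions every image gets a saturation and exposure factor between about 0.67 and 1.5, which is fairly mild augmentation.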

EtheneXiang commented 4 years ago

If it is overfitting, can I halve the number of channels in all convolutional layers? I do not have any better ideas.

EtheneXiang commented 4 years ago

I also do not understand what you mean by AP75 increasing. Is the mAP calculated during network training not just at IoU=0.5?

AlexeyAB commented 4 years ago

Is the mAP calculated during network training not just at IoU=0.5?

Yes, just IoU=0.5.
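To illustrate why AP50 and AP75 can move in different directions, here is a minimal sketch of how the IoU threshold decides whether a detection counts as a match. The boxes are made up for illustration, and the real mAP calculation also involves confidence ranking and per-class precision/recall curves, which are omitted here.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


ground_truth = (0, 0, 100, 100)
detection = (20, 0, 120, 100)  # hypothetical box, shifted 20 px to the right

overlap = iou(ground_truth, detection)
counts_for_ap50 = overlap >= 0.5    # loose localization is enough
counts_for_ap75 = overlap >= 0.75   # requires a much tighter box
print(f"IoU={overlap:.3f} AP50 match={counts_for_ap50} AP75 match={counts_for_ap75}")
```

A loosely localized box like this one counts as a match at IoU=0.5 but as a miss at IoU=0.75, so a model can find fewer objects overall (AP50 drops) while localizing the ones it does find more tightly (AP75 rises).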

scianand commented 4 years ago

Hi @AlexeyAB ,

I am training a Yolov3-tiny-pan-lstm model on a custom dataset. I have only 3 video sequences of 400 images each. I have calculated custom anchors, and I use sequential_subdivisions=4, batch=64 and subdivisions=4.

But the training loss oscillates between 0.17 and 0.20 with a learning rate of 0.0001. Can you please help me with this issue? How can I know which learning rate is good for my dataset?
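As a rough sanity check on these numbers, the sketch below works out the mini-batch size and how often each image is revisited. It assumes the usual Darknet convention that batch/subdivisions images are processed per forward pass and that weights are updated once per batch; it ignores time_steps and sequential_subdivisions, which change the picture for LSTM models.

```python
batch = 64
subdivisions = 4
sequences = 3
images_per_sequence = 400

# Images processed per forward/backward pass (assumed convention).
mini_batch = batch // subdivisions

# Total training images and iterations needed to see each image once.
dataset_size = sequences * images_per_sequence
iterations_per_epoch = dataset_size / batch

print(f"mini-batch size: {mini_batch}")
print(f"dataset size: {dataset_size}")
print(f"iterations per epoch: {iterations_per_epoch}")
```

With only 1200 images, the network revisits the whole dataset roughly every 19 iterations, so a loss that plateaus early may reflect the small dataset as much as the learning rate.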

scianand commented 4 years ago

Hi @AlexeyAB ,

Can you please explain these parameters and how I can use them to train my model?

track=1
time_steps=3 # for 8GB GPU
augment_speed=3
sequential_subdivisions=4

learning_rate=0.0001
burn_in=1000
max_batches = 10000

policy=sgdr
sgdr_cycle=1000
sgdr_mult=2
steps=4000,6000,8000,9000

scales=1, 1, 0.1, 0.1

seq_scales=0.5, 1, 0.5, 1
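For the policy=sgdr settings, here is a generic sketch of cosine annealing with warm restarts (SGDR), using sgdr_cycle as the first cycle length and sgdr_mult as the cycle-length multiplier. This is not Darknet's actual code: the function name, the zero minimum rate and the restart bookkeeping are assumptions, and burn_in is ignored.

```python
import math


def sgdr_rate(iteration, base_rate, first_cycle, cycle_mult, min_rate=0.0):
    """Cosine-annealed learning rate with warm restarts (generic SGDR sketch)."""
    cycle_start = 0
    cycle_len = first_cycle
    # Advance to the cycle that contains this iteration.
    while cycle_start + cycle_len < iteration:
        cycle_start += cycle_len
        cycle_len *= cycle_mult
    progress = (iteration - cycle_start) / cycle_len
    return min_rate + 0.5 * (base_rate - min_rate) * (1.0 + math.cos(math.pi * progress))


# sgdr_cycle=1000, sgdr_mult=2 -> restarts after iterations 1000, 3000, 7000, ...
for it in (0, 500, 1000, 1001, 3000, 3001, 3900):
    print(it, f"{sgdr_rate(it, base_rate=0.001, first_cycle=1000, cycle_mult=2):.6f}")
```

Under these assumptions and a base rate of 0.001, the rate at iteration 3900 comes out near 0.00088, close to the 0.000881 shown in the training log further down the thread. With this policy the rate repeatedly decays toward the minimum and then jumps back up, so some oscillation in the loss around each restart is expected.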

AlexeyAB commented 4 years ago

learning_rate=0.0001

Why don't you use the default learning rate?

scianand commented 4 years ago

You mean 0.001?

scianand commented 4 years ago

I have tried that too, but it did not converge. I also tried to calculate mAP during training, but this error appears:

(next mAP calculation at 3900 iterations)
3900: 0.160964, 0.146414 avg loss, 0.000881 rate, 0.549139 seconds, 748800 images

calculation mAP (mean average precision)...
4
CUDA Error Prev: an illegal memory access was encountered
CUDA Error Prev: an illegal memory access was encountered: Success
darknet: ./src/utils.c:293: error: Assertion `0' failed.

@AlexeyAB Please can you help me?
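One detail that can be read off the log line above: the image counter is consistent with each iteration consuming batch × time_steps images. This is an inference from the numbers in this thread, not something the thread confirms, and it says nothing about the cause of the CUDA error.

```python
iterations = 3900        # from the log line
images_logged = 748800   # from the log line
batch = 64               # from the cfg
time_steps = 3           # from the cfg

images_per_iteration = images_logged // iterations
print(f"images per iteration: {images_per_iteration}")
print(f"batch * time_steps:   {batch * time_steps}")
```

Both values are 192, which suggests that with time_steps=3 each iteration processes three times as many images as batch=64 alone would imply, and memory use grows accordingly.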