AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Training question: why does the loss suddenly increase? #3764

Open yeyewen opened 5 years ago

yeyewen commented 5 years ago

@AlexeyAB Hi, I trained YOLOv3tiny-pan2.cfg. The loss suddenly increased, as shown in the following chart:

[image]

I then added SPP to YOLOv3tiny-pan2 and trained again, and the same thing happened:

[image]

But when I train other networks, such as yolo_v3_tiny_3l and yolo_v3_tiny, the loss is normal.

Do you know what might be the cause? It would be very helpful to me. Thank you.

AlexeyAB commented 5 years ago

Can you attach your cfg file?

yeyewen commented 5 years ago

Yes, YOLOv3tiny-pan2 is what you posted.

```
[net]
batch=64
subdivisions=4
width=800
height=480
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 30000

policy=sgdr
sgdr_cycle=1000
sgdr_mult=2
steps=20000, 26000
scales=0.1,0.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=1

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

############ SPP
[maxpool]
stride=1
size=5

[route]
layers=-2

[maxpool]
stride=1
size=9

[route]
layers=-4

[maxpool]
stride=1
size=13

[route]
layers=-1,-3,-5,-6
########## End SPP

###########

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

########### to [yolo-3]

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 8

###########

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

########### to [yolo-2]

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 6

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

########### features of different layers

[route]
layers=1

[maxpool]
size=16
stride=16

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[route]
layers=3

[maxpool]
size=8
stride=8

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[route]
layers=5

[maxpool]
size=4
stride=4

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[route]
layers=7

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[route]
layers=9

[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky

[route]
layers=-1, -3, -6, -9, -12, 18

[maxpool]
maxpool_depth=1
out_channels=64
stride=1
size=1

########### [yolo-1]

[upsample]
stride=4

[route]
layers = -1,30

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=45
activation=linear

[yolo]
mask = 0,1,2
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=10
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=0

########### [yolo-2]

[route]
layers = -6

[upsample]
stride=2

[route]
layers = -1,26

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=45
activation=linear

[yolo]
mask = 3,4,5
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=10
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=0

########### [yolo-3]

[route]
layers = -12

[route]
layers = -1,20

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=45
activation=linear

[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
classes=10
num=9
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=0
```

AlexeyAB commented 5 years ago

Try to change

policy=sgdr
sgdr_cycle=1000
sgdr_mult=2
steps=20000, 26000
scales=0.1,0.1

to

policy=steps
steps=20000, 26000
scales=0.1,0.1
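
For reference, here is a minimal Python sketch of what a step schedule with these values would do, assuming `policy=steps` multiplies the learning rate by each entry of `scales` once the iteration reaches the matching entry of `steps`. The function name `steps_rate` is hypothetical, the `burn_in` warm-up is ignored, and this is an illustration rather than Darknet's actual code.

```python
def steps_rate(iteration, base_lr, steps, scales):
    """Piecewise-constant learning rate (illustrative sketch, burn_in ignored)."""
    rate = base_lr
    for step, scale in zip(steps, scales):
        if iteration < step:
            break  # this step and all later ones have not been reached yet
        rate *= scale
    return rate


if __name__ == "__main__":
    # Values taken from the cfg in this thread.
    for it in (0, 19999, 20000, 26000, 29999):
        print(it, steps_rate(it, 0.001, [20000, 26000], [0.1, 0.1]))
```

Under that assumption the rate stays at 0.001 until iteration 20000, drops to roughly 0.0001, and drops again to roughly 0.00001 at iteration 26000, with no periodic restarts in between.
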

yeyewen commented 5 years ago

Thanks a lot. I tried it, but it doesn't seem to help. I also found that the sudden increase no longer happens once the iteration count exceeds 15000. I still don't know why.
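
One way to put the 15000-iteration observation in context is to look at where a generic SGDR schedule (cosine annealing with warm restarts) would restart for `sgdr_cycle=1000`, `sgdr_mult=2`, and `max_batches=30000`. The sketch below assumes that Darknet's `policy=sgdr` follows that generic scheme. This has not been checked against the Darknet source, the function names are hypothetical, and `burn_in` is ignored.

```python
import math


def sgdr_restart_points(cycle, mult, max_batches):
    """Iterations at which a warm restart would occur (illustrative sketch)."""
    points = []
    start = 0
    while start + cycle < max_batches:
        start += cycle
        points.append(start)
        cycle *= mult  # each cycle is `mult` times longer than the previous one
    return points


def sgdr_rate(iteration, base_lr, cycle, mult, min_lr=0.0):
    """Cosine-annealed learning rate within the current cycle (illustrative sketch)."""
    start = 0
    while start + cycle < iteration:
        start += cycle
        cycle *= mult
    progress = (iteration - start) / cycle
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(sgdr_restart_points(1000, 2, 30000))
    print(sgdr_rate(1000, 0.001, 1000, 2), sgdr_rate(1001, 0.001, 1000, 2))
```

Under these assumptions the restarts fall at iterations 1000, 3000, 7000, and 15000, and at each one the learning rate jumps from near zero back to almost 0.001. The last restart lands on the iteration after which the spikes reportedly stop. That may be a coincidence, and the thread cannot settle it, particularly since switching to `policy=steps` is reported not to have helped.
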