Zzh-tju / DIoU-darknet

Distance-IoU Loss into YOLO v3
GNU General Public License v3.0
313 stars 85 forks

Training diverges, loss becomes -nan #11

Closed. WenlongL closed this issue 4 years ago

WenlongL commented 4 years ago

Hello, I am using your code with CIoU loss, initialized from darknet53.conv.74. Every run diverges after training for a while, and the loss becomes -nan.

Zzh-tju commented 4 years ago

What are your learning rate and related settings? Please post your cfg.

WenlongL commented 4 years ago

batch=64
subdivisions=16
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 20000
policy=steps
steps=3000
scales=.1

[yolo]
mask = 0,1,2
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326

classes=20

classes=1

classes=8
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1
iou_loss=ciou
cls_normalizer=1
iou_normalizer=0.5
nms_kind=diounms
beta1=0.6
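To see what learning rate these settings give at the iteration where the loss turns into -nan, the schedule can be sketched in Python. This is a rough approximation, assuming darknet's usual behaviour for policy=steps: a polynomial burn-in whose exponent `power` defaults to 4 (it is not set in the cfg above), followed by a multiplication by each scale once its step is reached. The function name `current_learning_rate` is made up for this illustration and is not part of darknet.

```python
def current_learning_rate(batch_num, learning_rate=0.001, burn_in=1000,
                          steps=(3000,), scales=(0.1,), power=4.0):
    """Approximate the learning rate at a given iteration (policy=steps).

    Defaults mirror the cfg posted above; power=4.0 is an assumed default.
    """
    # During burn-in the rate ramps up polynomially from 0 to learning_rate.
    if batch_num < burn_in:
        return learning_rate * (batch_num / burn_in) ** power
    # After burn-in, every step already reached multiplies the rate by its scale.
    rate = learning_rate
    for step, scale in zip(steps, scales):
        if step > batch_num:
            break
        rate *= scale
    return rate


if __name__ == "__main__":
    # Print the rate at a few iterations around the burn-in and the step.
    for iteration in (100, 500, 1000, 2999, 3000):
        print(iteration, current_learning_rate(iteration))
```

If the divergence shows up shortly after iteration 1000, that would coincide with the end of burn-in, when the rate first reaches its full value of 0.001.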

Zzh-tju commented 4 years ago

What GPU are you using?

WenlongL commented 4 years ago

Just one GPU. Model: Quadro M4000

Zzh-tju commented 4 years ago

So only a single GPU, right?

WenlongL commented 4 years ago

Yes.

WenlongL commented 4 years ago

My server has two GPUs, but I used -gpus during training to select only one of them.

Zzh-tju commented 4 years ago

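Your settings all look correct. You should spot-check that the dataset labels match the images. I would also suggest switching to another loss, such as MSE, and training for a few thousand iterations first to see whether it still blows up.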
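A spot-check of the label files could look something like the sketch below. It assumes the standard darknet label format, one object per line as `class x_center y_center width height`, with coordinates normalized to [0, 1]. The helper name `check_label_lines` and its messages are hypothetical and not part of this repository. Boxes with zero width or height are flagged on purpose: the CIoU aspect-ratio term involves the ratio of width to height, so degenerate boxes may be handled less gracefully by an IoU-based loss than by MSE.

```python
import math


def check_label_lines(lines, num_classes):
    """Return (line_number, message) pairs for suspicious darknet label lines."""
    problems = []
    for n, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) != 5:
            problems.append((n, "expected 5 fields"))
            continue
        try:
            cls = int(fields[0])
            x, y, w, h = (float(v) for v in fields[1:])
        except ValueError:
            problems.append((n, "could not parse fields"))
            continue
        # Report at most one problem per line, most severe first.
        if not all(math.isfinite(v) for v in (x, y, w, h)):
            problems.append((n, "non-finite coordinate"))
        elif not 0 <= cls < num_classes:
            problems.append((n, "class id out of range"))
        elif w <= 0 or h <= 0:
            problems.append((n, "zero or negative box size"))
        elif not (0 <= x <= 1 and 0 <= y <= 1 and w <= 1 and h <= 1):
            problems.append((n, "coordinate outside [0, 1]"))
    return problems


if __name__ == "__main__":
    # Three made-up label lines: one valid, one bad class id, one zero-width box.
    sample = ["0 0.5 0.5 0.2 0.3", "9 0.5 0.5 0.2 0.3", "1 0.5 0.5 0.0 0.3"]
    for problem in check_label_lines(sample, num_classes=8):
        print(problem)
```

In practice, the lines of each .txt label file would be read and passed in with num_classes=8 to match the cfg above. Passing this check does not prove the labels are correct; it only rules out a few malformed cases.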

WenlongL commented 4 years ago

The labels should be fine. I have trained on this dataset many times with the original darknet.

Zzh-tju commented 4 years ago

Try switching the loss and see.