dawn-ech / YOLC

[TITS 2024] You Only Look Clusters for Tiny Object Detection in Aerial Images
https://arxiv.org/abs/2404.06180
Apache License 2.0

Vanishing gradients #14

Open yjyj2000 opened 2 months ago

yjyj2000 commented 2 months ago

2024-09-13 10:32:11,572 - mmdet - INFO - Epoch [4][31250/31563] lr: 2.500e-03, eta: 10 days, 10:41:12, time: 0.403, data_time: 0.004, memory: 8149, loss_center_heatmap: nan, loss_xywh_coarse: nan, loss_xywh_coarse_l1: nan, loss_xywh_refine: nan, loss_xywh_refine_l1: nan, loss: nan, grad_norm: nan
2024-09-13 10:32:31,605 - mmdet - INFO - Epoch [4][31300/31563] lr: 2.500e-03, eta: 10 days, 10:38:22, time: 0.401, data_time: 0.004, memory: 8149, loss_center_heatmap: nan, loss_xywh_coarse: nan, loss_xywh_coarse_l1: nan, loss_xywh_refine: nan, loss_xywh_refine_l1: nan, loss: nan, grad_norm: nan
2024-09-13 10:33:31,678 - mmdet - INFO - Epoch [4][31350/31563] lr: 2.500e-03, eta: 10 days, 10:42:54, time: 1.201, data_time: 0.004, memory: 8149, loss_center_heatmap: nan, loss_xywh_coarse: nan, loss_xywh_coarse_l1: nan, loss_xywh_refine: nan, loss_xywh_refine_l1: nan, loss: nan, grad_norm: nan
2024-09-13 10:34:57,915 - mmdet - INFO - Epoch [4][31400/31563] lr: 2.500e-03, eta: 10 days, 10:52:13, time: 1.725, data_time: 0.004, memory: 8149, loss_center_heatmap: nan, loss_xywh_coarse: nan, loss_xywh_coarse_l1: nan, loss_xywh_refine: nan, loss_xywh_refine_l1: nan, loss: nan, grad_norm: nan

Why do the gradient and loss values turn into NaN partway through training? After that, the results of every epoch are very low. Has the author run into this?

yjyj2000 commented 2 months ago

Hello, author. In addition, training needs about 15 GB of GPU memory. Is that normal?

dawn-ech commented 2 months ago

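The original learning rate corresponds to 4 GPUs × 2 images per GPU = a total batch size of 8, so you need to scale it linearly to match your own batch size. If your total batch size is 2, divide the learning rate by 4. The GPU memory usage should be normal.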
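
As a rough illustration of the linear scaling described above, the arithmetic can be sketched in Python. The helper `scale_lr` is hypothetical and not part of YOLC or MMDetection. The base learning rate of 2.5e-3 is an assumption taken from the `lr` value in the log above and may differ from the value in the repository's config. The 1 GPU × 2 images per GPU setup is the example from the reply.

```python
def scale_lr(base_lr, base_batch_size, num_gpus, samples_per_gpu):
    """Linearly scale a learning rate to a new total batch size.

    Hypothetical helper for illustration only; not part of YOLC or MMDetection.
    """
    total_batch_size = num_gpus * samples_per_gpu
    if base_batch_size <= 0 or total_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    # Linear scaling rule: lr is proportional to the total batch size.
    return base_lr * total_batch_size / base_batch_size


# Assumption: 2.5e-3 is the lr printed in the log above; the value in the
# actual config may differ.
BASE_LR = 2.5e-3
# Reference setup from the reply: 4 GPUs x 2 images per GPU = 8.
BASE_BATCH_SIZE = 4 * 2

# Example from the reply: a total batch size of 2 (here 1 GPU x 2 images).
new_lr = scale_lr(BASE_LR, BASE_BATCH_SIZE, num_gpus=1, samples_per_gpu=2)
print(f"{new_lr:.3e}")  # prints 6.250e-04, i.e. the base lr divided by 4
```

The resulting value would then replace the learning rate in the training config's optimizer settings. Whether this alone removes the NaN losses depends on the rest of the setup.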