Closed hcv1027 closed 2 years ago
Normally the contrastive loss uses a normalized dot product (cosine similarity) with a softmax temperature, and that temperature needs to be tuned.
In this work, we directly use the dot product without normalization in the contrastive loss, together with an auxiliary loss. There are no extra parameters to tune.
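Roughly, the two formulations can be contrasted like this (an illustrative NumPy sketch with made-up function names, not the mmtrack implementation; the auxiliary term here is a simplified L2 constraint on cosine similarity, paraphrasing the paper):

```python
import numpy as np

def softmax_xent(logits, target_idx):
    # Cross-entropy over similarity logits, numerically stabilized.
    logits = logits - logits.max()
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[target_idx]

def contrastive_with_temperature(query, keys, target_idx, tau=0.07):
    # Common formulation: cosine similarity scaled by a tuned temperature tau.
    q = query / np.linalg.norm(query)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    return softmax_xent(q @ k.T / tau, target_idx)

def contrastive_dot(query, keys, target_idx):
    # This work's formulation: raw dot product, no temperature to tune.
    return softmax_xent(query @ keys.T, target_idx)

def aux_cosine_loss(query, keys, targets):
    # Simplified auxiliary term: L2 constraint pulling the cosine
    # similarity of matched pairs toward 1 and unmatched pairs toward 0.
    q = query / np.linalg.norm(query)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    return np.mean((q @ k.T - targets) ** 2)
```

The auxiliary term bounds the embedding geometry even though the main loss uses the unnormalized dot product, which is why no temperature hyperparameter is needed.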
Hi @OceanPang , thanks for your explanation.
I'm trying to change the detector from Faster R-CNN to yolox-tiny, and I'm currently running into a strange gradient explosion problem.
The `loss_track` increases very quickly after training starts and then the gradient explodes, as in the log below, where `loss_track` reaches 173.1202.
2022-07-27 09:56:16,232 - mmtrack - INFO - Epoch [1][50/324] lr: 9.526e-06, eta: 4:37:54, time: 0.172, data_time: 0.061, memory: 3588, loss_cls: 0.9667, loss_bbox: 4.7251, loss_obj: 13.7270, loss_track: 173.1202, loss_track_aux: 0.5976, loss: 193.1366
After analyzing the output of each head, I found that the output range of the embed head grows larger and larger with each iteration, which makes the dot-product similarity very large. This only happens when I use yolox-tiny as the detector; the Faster R-CNN detector works fine.
I also tried disabling the auxiliary loss, and separately changing the dot-product similarity to cosine similarity. Neither change produces the gradient explosion, but `loss_track`
still grows larger and larger during training, as the logs below show: `loss_track`
increases from 0.8004 in the 1st epoch to 1.0551 at the end of the 11th epoch.
2022-07-27 09:57:42,667 - mmtrack - INFO - Epoch [1][50/324] lr: 9.526e-06, eta: 4:39:48, time: 0.173, data_time: 0.059, memory: 3613, loss_cls: 1.0192, loss_bbox: 4.7018, loss_obj: 14.2081, loss_track: 0.8004, loss: 20.7295
2022-07-27 09:57:48,593 - mmtrack - INFO - Epoch [1][100/324] lr: 3.810e-05, eta: 3:55:43, time: 0.119, data_time: 0.008, memory: 3613, loss_cls: 1.1430, loss_bbox: 4.6036, loss_obj: 12.6540, loss_track: 0.7827, loss: 19.1834
2022-07-27 09:57:54,422 - mmtrack - INFO - Epoch [1][150/324] lr: 8.573e-05, eta: 3:39:54, time: 0.117, data_time: 0.008, memory: 3613, loss_cls: 1.1570, loss_bbox: 4.5119, loss_obj: 9.8548, loss_track: 0.7408, loss: 16.2645
2022-07-27 09:58:00,321 - mmtrack - INFO - Epoch [1][200/324] lr: 1.524e-04, eta: 3:32:31, time: 0.118, data_time: 0.008, memory: 3624, loss_cls: 1.1075, loss_bbox: 4.3599, loss_obj: 8.8974, loss_track: 0.7564, loss: 15.1212
2022-07-27 09:58:05,950 - mmtrack - INFO - Epoch [1][250/324] lr: 2.381e-04, eta: 3:26:18, time: 0.113, data_time: 0.008, memory: 3633, loss_cls: 1.0665, loss_bbox: 4.2638, loss_obj: 6.6571, loss_track: 0.7502, loss: 12.7375
2022-07-27 09:58:13,184 - mmtrack - INFO - Epoch [1][300/324] lr: 3.429e-04, eta: 3:30:46, time: 0.145, data_time: 0.008, memory: 3633, loss_cls: 0.9996, loss_bbox: 4.1685, loss_obj: 6.2407, loss_track: 0.7471, loss: 12.1559
2022-07-27 10:06:20,565 - mmtrack - INFO - Epoch [11][50/324] lr: 9.992e-03, eta: 3:10:07, time: 0.178, data_time: 0.060, memory: 4040, loss_cls: 0.8197, loss_bbox: 3.3790, loss_obj: 2.6271, loss_track: 1.0395, loss: 7.8652
2022-07-27 10:06:26,861 - mmtrack - INFO - Epoch [11][100/324] lr: 9.992e-03, eta: 3:10:07, time: 0.126, data_time: 0.008, memory: 4040, loss_cls: 0.8079, loss_bbox: 3.2386, loss_obj: 2.6984, loss_track: 1.0834, loss: 7.8283
2022-07-27 10:06:32,838 - mmtrack - INFO - Epoch [11][150/324] lr: 9.991e-03, eta: 3:09:58, time: 0.120, data_time: 0.008, memory: 4040, loss_cls: 0.8197, loss_bbox: 3.3582, loss_obj: 2.7088, loss_track: 1.0186, loss: 7.9053
2022-07-27 10:06:38,983 - mmtrack - INFO - Epoch [11][200/324] lr: 9.991e-03, eta: 3:09:54, time: 0.123, data_time: 0.009, memory: 4096, loss_cls: 0.7778, loss_bbox: 3.2397, loss_obj: 2.5475, loss_track: 1.0808, loss: 7.6459
2022-07-27 10:06:45,668 - mmtrack - INFO - Epoch [11][250/324] lr: 9.990e-03, eta: 3:10:04, time: 0.134, data_time: 0.008, memory: 4096, loss_cls: 0.7841, loss_bbox: 3.2447, loss_obj: 2.5895, loss_track: 1.0685, loss: 7.6868
2022-07-27 10:06:51,789 - mmtrack - INFO - Epoch [11][300/324] lr: 9.990e-03, eta: 3:09:59, time: 0.122, data_time: 0.008, memory: 4096, loss_cls: 0.8095, loss_bbox: 3.2379, loss_obj: 2.5245, loss_track: 1.0551, loss: 7.6270
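For reference, here is the toy NumPy check I used to convince myself why switching to cosine similarity avoids the explosion (purely illustrative, not the mmtrack code): if the embed head's output magnitude grows by a factor `s`, dot-product logits grow by `s**2` and the softmax saturates, while cosine similarity stays bounded in [-1, 1] regardless of scale.

```python
import numpy as np

rng = np.random.default_rng(0)
q = rng.normal(size=8)        # a query embedding
k = rng.normal(size=(4, 8))   # key embeddings

def cosine_sim(a, b):
    # Normalize both sides before the dot product.
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

base = np.abs(q @ k.T).max()
for s in (1.0, 10.0, 100.0):
    dots = (s * q) @ (s * k).T            # logits grow as s**2
    cos = cosine_sim(s * q, s * k)        # invariant to positive scaling
    assert np.allclose(np.abs(dots).max(), s**2 * base)
    assert np.all(np.abs(cos) <= 1.0 + 1e-9)
```

So with raw dot products, nothing in the loss itself stops the embedding norms from drifting upward, which matches what I observed from the embed head.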
Do you have any idea about this problem? Thanks for sharing.
Hi, thanks for your great work. According to the paper, there is an auxiliary loss, but I don't really understand the intuition behind it.
Could you give me some more explanation of this loss? Thanks.