SysCV / qdtrack

Quasi-Dense Similarity Learning for Multiple Object Tracking, CVPR 2021 (Oral)
Apache License 2.0

I'm confused about the meaning of the auxiliary loss #111

Closed hcv1027 closed 2 years ago

hcv1027 commented 2 years ago

Hi, thanks for your great work. According to the paper, there is an auxiliary loss, but I do not really understand the intuition behind it. [Screenshot (9): the auxiliary loss from the paper]

Can you give me some more explanation of this loss? Thanks.

OceanPang commented 2 years ago

Normally the contrastive loss uses a normalized dot product with a softmax temperature, and that temperature needs to be tuned.

In this work, we directly use the unnormalized dot product in the contrastive loss, together with an auxiliary loss, so there are no extra parameters to tune.
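For concreteness, here is a minimal PyTorch sketch of the two formulations. It is illustrative only, not the repo's exact implementation: `pos_mask` marks matched query-key pairs, and the 1/0 auxiliary targets are a simplification of the paper's formulation.

```python
import torch
import torch.nn.functional as F

def temperature_contrastive_loss(query, keys, pos_mask, tau=0.07):
    # Common formulation: cosine similarity scaled by a softmax
    # temperature tau, which has to be tuned per task.
    sim = F.normalize(query, dim=-1) @ F.normalize(keys, dim=-1).t()
    log_prob = F.log_softmax(sim / tau, dim=-1)
    # average the log-likelihood over each query's positive keys
    return -(log_prob * pos_mask).sum(-1).div(pos_mask.sum(-1).clamp(min=1)).mean()

def qdtrack_style_losses(query, keys, pos_mask):
    # QDTrack-style: raw (unnormalized) dot products in the contrastive
    # term, so there is no temperature to tune; an auxiliary L2 loss on
    # the cosine similarity keeps the embedding magnitudes in check.
    dot = query @ keys.t()
    losses = []
    for i in range(dot.size(0)):
        pos, neg = dot[i][pos_mask[i]], dot[i][~pos_mask[i]]
        # log(1 + sum_{p,q} exp(neg_q - pos_p)), computed stably
        diff = (neg.unsqueeze(0) - pos.unsqueeze(1)).flatten()
        losses.append(torch.logsumexp(torch.cat([diff, diff.new_zeros(1)]), 0))
    loss_track = torch.stack(losses).mean()
    cos = F.normalize(query, dim=-1) @ F.normalize(keys, dim=-1).t()
    loss_aux = (cos - pos_mask.float()).pow(2).mean()  # illustrative targets
    return loss_track, loss_aux
```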

hcv1027 commented 2 years ago

Hi @OceanPang , thanks for your explanation.

I'm trying to change the detector from Faster R-CNN to YOLOX-tiny, and I currently suffer from a strange gradient explosion problem. The loss_track increases very quickly after training starts and then the gradients explode, as in the log below, where loss_track reaches 173.1202.

2022-07-27 09:56:16,232 - mmtrack - INFO - Epoch [1][50/324] lr: 9.526e-06, eta: 4:37:54, time: 0.172, data_time: 0.061, memory: 3588, loss_cls: 0.9667, loss_bbox: 4.7251, loss_obj: 13.7270, loss_track: 173.1202, loss_track_aux: 0.5976, loss: 193.1366

After analyzing the output of each head, I find that the output range of the embed head grows larger and larger with each iteration, which causes the dot-product similarity to become very large. This only happens when I use YOLOX-tiny as the detector; the Faster R-CNN detector works fine.

I also tried disabling the auxiliary loss, and separately changing the dot-product similarity to cosine similarity. Neither change produces the gradient explosion, but loss_track still grows during training, as the logs below show: it increases from 0.8004 in the 1st epoch to 1.0551 at the end of the 11th epoch.

2022-07-27 09:57:42,667 - mmtrack - INFO - Epoch [1][50/324] lr: 9.526e-06, eta: 4:39:48, time: 0.173, data_time: 0.059, memory: 3613, loss_cls: 1.0192, loss_bbox: 4.7018, loss_obj: 14.2081, loss_track: 0.8004, loss: 20.7295
2022-07-27 09:57:48,593 - mmtrack - INFO - Epoch [1][100/324] lr: 3.810e-05, eta: 3:55:43, time: 0.119, data_time: 0.008, memory: 3613, loss_cls: 1.1430, loss_bbox: 4.6036, loss_obj: 12.6540, loss_track: 0.7827, loss: 19.1834
2022-07-27 09:57:54,422 - mmtrack - INFO - Epoch [1][150/324] lr: 8.573e-05, eta: 3:39:54, time: 0.117, data_time: 0.008, memory: 3613, loss_cls: 1.1570, loss_bbox: 4.5119, loss_obj: 9.8548, loss_track: 0.7408, loss: 16.2645
2022-07-27 09:58:00,321 - mmtrack - INFO - Epoch [1][200/324] lr: 1.524e-04, eta: 3:32:31, time: 0.118, data_time: 0.008, memory: 3624, loss_cls: 1.1075, loss_bbox: 4.3599, loss_obj: 8.8974, loss_track: 0.7564, loss: 15.1212
2022-07-27 09:58:05,950 - mmtrack - INFO - Epoch [1][250/324] lr: 2.381e-04, eta: 3:26:18, time: 0.113, data_time: 0.008, memory: 3633, loss_cls: 1.0665, loss_bbox: 4.2638, loss_obj: 6.6571, loss_track: 0.7502, loss: 12.7375
2022-07-27 09:58:13,184 - mmtrack - INFO - Epoch [1][300/324] lr: 3.429e-04, eta: 3:30:46, time: 0.145, data_time: 0.008, memory: 3633, loss_cls: 0.9996, loss_bbox: 4.1685, loss_obj: 6.2407, loss_track: 0.7471, loss: 12.1559

2022-07-27 10:06:20,565 - mmtrack - INFO - Epoch [11][50/324] lr: 9.992e-03, eta: 3:10:07, time: 0.178, data_time: 0.060, memory: 4040, loss_cls: 0.8197, loss_bbox: 3.3790, loss_obj: 2.6271, loss_track: 1.0395, loss: 7.8652
2022-07-27 10:06:26,861 - mmtrack - INFO - Epoch [11][100/324] lr: 9.992e-03, eta: 3:10:07, time: 0.126, data_time: 0.008, memory: 4040, loss_cls: 0.8079, loss_bbox: 3.2386, loss_obj: 2.6984, loss_track: 1.0834, loss: 7.8283
2022-07-27 10:06:32,838 - mmtrack - INFO - Epoch [11][150/324] lr: 9.991e-03, eta: 3:09:58, time: 0.120, data_time: 0.008, memory: 4040, loss_cls: 0.8197, loss_bbox: 3.3582, loss_obj: 2.7088, loss_track: 1.0186, loss: 7.9053
2022-07-27 10:06:38,983 - mmtrack - INFO - Epoch [11][200/324] lr: 9.991e-03, eta: 3:09:54, time: 0.123, data_time: 0.009, memory: 4096, loss_cls: 0.7778, loss_bbox: 3.2397, loss_obj: 2.5475, loss_track: 1.0808, loss: 7.6459
2022-07-27 10:06:45,668 - mmtrack - INFO - Epoch [11][250/324] lr: 9.990e-03, eta: 3:10:04, time: 0.134, data_time: 0.008, memory: 4096, loss_cls: 0.7841, loss_bbox: 3.2447, loss_obj: 2.5895, loss_track: 1.0685, loss: 7.6868
2022-07-27 10:06:51,789 - mmtrack - INFO - Epoch [11][300/324] lr: 9.990e-03, eta: 3:09:59, time: 0.122, data_time: 0.008, memory: 4096, loss_cls: 0.8095, loss_bbox: 3.2379, loss_obj: 2.5245, loss_track: 1.0551, loss: 7.6270
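For reference, the change from dot-product to cosine similarity mentioned above is roughly the following sketch (a hypothetical helper, not the actual embed-head code):

```python
import torch.nn.functional as F

def similarity(key_embeds, ref_embeds, use_cosine=True):
    # L2-normalizing both sides bounds every similarity to [-1, 1], so a
    # growing embed-head output magnitude can no longer blow up the loss.
    if use_cosine:
        key_embeds = F.normalize(key_embeds, dim=-1)
        ref_embeds = F.normalize(ref_embeds, dim=-1)
    return key_embeds @ ref_embeds.t()
```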

Do you have any idea about this problem? Thanks for sharing.

OceanPang commented 2 years ago

Try to use grad_clip?
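In an mmdet/mmtrack-style config, gradient clipping is switched on through `optimizer_config`; a sketch (max_norm=35, norm_type=2 is a common default in the configs, but the value may need tuning for YOLOX-tiny):

```python
# Clip gradients by their global L2 norm before the optimizer step.
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
```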

hcv1027 commented 2 years ago

> Try to use grad_clip?

Hi @OceanPang , it works fine, thanks for your suggestion.