Closed (LNoving closed this issue 3 years ago)
@LNoving I have tried hard negative example mining, and it did not change the result significantly. In my opinion, the performance of trackers is bounded by their ability to reason along the temporal dimension, for example deciding which track (sequence of tracklets) belongs to the target (cf. Siam RCNN). Thus, changing the model architecture or the training strategy brings only limited improvement. Instead, effective ways to exploit temporal cues deserve more attention.
@MARMOTatZJU How did you perform the mining? I tried feeding in a batch of positive pairs, sampling negative pairs from different images in the same batch, then sorting these negative pairs and using the top k to calculate the loss. However, with this approach the model does not converge well and produces some bad outputs at test time. When I replace the sort with random indexing, the result is OK. By the way, focal loss leads to the same problem. So I wonder which method you used for the mining: did you sample negative examples within the mini-batch? I have also sent you an e-mail.
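For readers following along, here is a minimal sketch of the two selection strategies compared above: keeping the top-k negative pairs versus picking k of them at random. It is not the code used in this thread. The function names (`select_negatives`, `batch_loss`), the choice to sort by per-pair loss, and the way the positive and negative terms are averaged and summed are all assumptions made for illustration.

```python
import random


def select_negatives(neg_losses, k, mode="hard", seed=None):
    """Return the indices of the k negative pairs kept for the loss.

    neg_losses: per-pair loss values of the in-batch negative pairs.
    mode: "hard" keeps the k largest losses, "random" picks k uniformly.
    (Hypothetical helper; sorting by per-pair loss is an assumption.)
    """
    k = min(k, len(neg_losses))
    indices = list(range(len(neg_losses)))
    if mode == "hard":
        # Sort by loss, largest first, and keep the top k.
        indices.sort(key=lambda i: neg_losses[i], reverse=True)
        return indices[:k]
    if mode == "random":
        # Random indexing: k distinct negatives, ignoring their loss.
        return random.Random(seed).sample(indices, k)
    raise ValueError(f"unknown mode: {mode!r}")


def batch_loss(pos_losses, neg_losses, k, mode="hard", seed=None):
    """Mean positive loss plus mean loss of the selected negatives."""
    kept = select_negatives(neg_losses, k, mode, seed)
    selected = [neg_losses[i] for i in kept]
    pos_term = sum(pos_losses) / len(pos_losses)
    neg_term = sum(selected) / len(selected) if selected else 0.0
    return pos_term + neg_term
```

Running both modes on the same batch and printing the kept indices is one way to check that the sorted selection is picking the pairs you expect.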
@LNoving Your description "But when I change the sort into random indexing, the result is ok" suggests that your code may have a bug. If I were you, I would start debugging by printing some intermediate results.
Closing this issue as the discussion is no longer active. Feel free to reopen it.
Sampling negative pairs from images in the same batch and applying hard negative example mining is a common trick in image retrieval tasks. However, I have not found anyone using it for Siamese networks; they all construct negative pairs only in the dataloader. I have tried sampling different pairs within a batch as negative examples and then sorting them to get hard negatives, but it does not work very well. So are there specific reasons why this trick is not used in tracking tasks, or has simply no one tried it?
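To make the in-batch construction concrete, here is a rough sketch of how negative pairs could be enumerated from a batch of positive pairs, as opposed to building them in the dataloader. The function name `in_batch_negative_pairs` and the `sequence_ids` argument are hypothetical. Skipping pairs that share a source sequence is an added assumption, meant to avoid labeling two crops of the same target as a negative.

```python
from itertools import permutations


def in_batch_negative_pairs(sequence_ids):
    """Enumerate (template_idx, search_idx) negative pairs within a batch.

    sequence_ids[i] identifies the source of batch item i (hypothetical
    field). Item i's template is paired with item j's search image for
    every i != j, except when both items come from the same source.
    """
    n = len(sequence_ids)
    return [
        (i, j)
        for i, j in permutations(range(n), 2)
        if sequence_ids[i] != sequence_ids[j]
    ]


# Example: items 0 and 2 come from the same sequence, so they are
# never paired with each other as a negative.
pairs = in_batch_negative_pairs(["a", "b", "a"])
```

A batch of n items from distinct sources yields n * (n - 1) candidate negatives, which is why some form of selection (top k or random) is applied before computing the loss.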