megvii-research / video_analyst

A series of basic algorithms that are useful for video understanding, including Single Object Tracking (SOT), Video Object Segmentation (VOS) and so on.
MIT License

Question about the loss calculation in SOT training #97

Closed TCBocean closed 4 years ago

TCBocean commented 4 years ago

I found that the mask of the ctr loss always covers the non-background area, whether the pair is positive or negative. In my opinion, for a neg pair there is no need to calculate the ctr loss over the non-background area. What do you think?

lzx1413 commented 4 years ago

@TCBocean The area that is not inside the box is set to 0 at this line: https://github.com/MegviiDetection/video_analyst/blob/4c5e03a8f8e98aeb917432f18ede706935c6ac47/videoanalyst/data/target/target_impl/utils/make_densebox_target.py#L134

TCBocean commented 4 years ago

@lzx1413 Yes. When the center label is created, the background area (outside the target's bounding box) is set to 0, while the foreground area (inside the target's bounding box) is greater than 0. When the loss is calculated, only the positions where the center label is not equal to 0 are counted. A neg pair also has a bounding box, so its center label also has a nonzero part, and that part is counted in the center loss as well. I think the center label of a neg pair should have no response at the target (it should be all 0s), which is more in line with my understanding. Or is there something in the code that I missed?
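
To make the point above concrete, here is a minimal pure-Python sketch; it is not the repository's code. It assumes an FCOS-style centerness for the foreground values and a binary cross-entropy loss, neither of which is stated in this thread, and the function names `make_ctr_label` and `masked_ctr_loss` are hypothetical. It only illustrates that a mask built from "label != 0" depends on the box alone, so a neg pair that still carries a box contributes to the center loss.

```python
import math

def make_ctr_label(size, box):
    """Build a size x size center label for box = (x0, y0, x1, y1).

    Assumption: FCOS-style centerness strictly inside the box, 0 elsewhere.
    """
    x0, y0, x1, y1 = box
    label = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            l, r, t, b = x - x0, x1 - x, y - y0, y1 - y
            if min(l, r, t, b) > 0:  # strictly inside the bounding box
                label[y][x] = math.sqrt(
                    (min(l, r) / max(l, r)) * (min(t, b) / max(t, b))
                )
    return label

def masked_ctr_loss(pred, label):
    """Mean binary cross-entropy over positions where label != 0.

    Returns (loss, number_of_positions_counted).
    """
    total, count = 0.0, 0
    for pred_row, label_row in zip(pred, label):
        for p, g in zip(pred_row, label_row):
            if g != 0:  # the mask: non-background positions only
                total += -(g * math.log(p) + (1 - g) * math.log(1 - p))
                count += 1
    return total / max(count, 1), count

# The mask has no notion of pos/neg pair: any box yields counted positions.
size = 8
label = make_ctr_label(size, (1, 1, 6, 6))
pred = [[0.5] * size for _ in range(size)]  # dummy predictions in (0, 1)
loss, count = masked_ctr_loss(pred, label)
print(count, loss)
```

With this mask, the only way for a neg pair to contribute nothing is for its label to be all zeros, which is the behaviour argued for above.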

lzx1413 commented 4 years ago

@TCBocean You are right, the ctr should be set to zero for neg pairs.
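
A minimal sketch of that change, assuming the pair type is available as a boolean when the target is built. The names `zero_ctr_for_neg_pair`, `is_negative_pair` and `num_masked_positions` are hypothetical and this is not the repository's actual patch; it only shows that zeroing the center label empties a mask defined as "label != 0".

```python
def zero_ctr_for_neg_pair(ctr_label, is_negative_pair):
    """Return an all-zero center label for a neg pair, else the label unchanged."""
    if is_negative_pair:
        return [[0.0] * len(row) for row in ctr_label]
    return ctr_label

def num_masked_positions(ctr_label):
    """Count positions that a 'label != 0' mask would include in the ctr loss."""
    return sum(1 for row in ctr_label for value in row if value != 0)

# Toy 3x3 center label with a single foreground position.
ctr_label = [
    [0.0, 0.0, 0.0],
    [0.0, 0.8, 0.0],
    [0.0, 0.0, 0.0],
]

pos_count = num_masked_positions(zero_ctr_for_neg_pair(ctr_label, False))
neg_count = num_masked_positions(zero_ctr_for_neg_pair(ctr_label, True))
print(pos_count, neg_count)
```

For the positive pair the foreground position is still counted; for the neg pair no position remains in the mask, so the center loss receives no contribution from it.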