Closed: CeZh closed this issue 2 years ago.
MOTR relies on query self-attention in the Deformable DETR decoder to suppress object queries from detecting tracked objects. The mechanism for duplicate removal is similar to that of the original (Deformable) DETR.
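To make the self-attention point concrete, here is a minimal sketch of how one object query can "see" the track queries. It is not MOTR's code: it assumes plain single-head scaled dot-product attention with no learned projections, random placeholder features, and hypothetical names (`attention_weights`, `track_share`). It only shows that attention runs over the joint set of queries, so information about tracked objects can reach every object query. Whether that lowers an object query's score is something the network learns in training.

```python
import math
import random


def attention_weights(query, keys):
    """Softmax of scaled dot products between one query and all keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
dim = 8  # small toy dimension, not the model's 256

# Placeholder features; in a trained model these would be learned or carried over.
object_queries = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(4)]
track_queries = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(2)]

# Self-attention operates on the joint set, so keys include the track queries.
all_queries = object_queries + track_queries

weights = attention_weights(object_queries[0], all_queries)
track_share = sum(weights[len(object_queries):])
print(f"attention mass on track queries: {track_share:.3f}")
```

The printed value is the share of the first object query's attention that lands on the two track queries. It is nonzero by construction, which is the only property this sketch is meant to illustrate.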
Hello, thank you for your quick reply. Correct me if I'm wrong: Deformable DETR itself distinguishes "tracked objects" from "newborn objects". As I understand it, the QIM module and the extra tracking attention are designed to improve tracking accuracy. Does that mean that with only the post-processing (track management) and Deformable DETR, the algorithm can still track objects, just with poorer performance? Thank you.
Hello, after re-reading the paper and re-inspecting the code, I think I have a better understanding of the track queries and the object queries. I will close this issue. Thank you so much for your explanation.
Hi! When I checked the evaluation code, I found that newborn object queries have very low detection scores, and existing objects are not detected by them. However, visualization shows that many newborn query reference points are actually very close to the reference points of existing track queries. Given how Deformable DETR aggregates features around reference points, I would intuitively expect their aggregated features to be similar, yet the final detection scores differ greatly. I still don't understand why object queries are suppressed from detecting tracked objects. Could you share your understanding?
I found the same phenomenon. The detection score for the same target in the current frame is much lower than in the first frame.
Hello, could you help explain this sentence from the author, or how to understand it? "MOTR relies on query self-attention in Deformable DETR Decoder to suppress object queries from detecting tracked objects. The mechanism for duplicate removal is similar to the original (Deformable) DETR."
Thanks for the impressive work! I think both the code and the paper are very interesting and insightful. However, I have a question about the track query and object query design. As I understand it, when tracking, you concatenate the track queries and the object queries to detect and track both existing and newborn objects. I am a little confused about how you prevent the newborn object queries from detecting the existing objects. To illustrate, here is a small example.
For example, suppose frame_0 (the initial frame) detects 2 objects, and the query features of these 2 objects are concatenated with the newborn queries, so that the frame_1 (next frame) queries have shape 302*256, where 302 is the total number of queries. Since the positions and features of the first 300 object queries are randomly initialized, is there a mechanism or module in MOTR that prevents these 300 object queries from re-detecting the 2 objects detected in the initial frame? Thank you so much!
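The shapes in this example can be sketched as follows. This is only an illustration of the concatenation described in the question, not code from the MOTR repository: the names (`random_queries`, `decoder_queries`, `is_track`) are hypothetical, the features are random placeholders, and the ordering (object queries first, then track queries) follows the question's wording rather than a confirmed implementation detail.

```python
import random

HIDDEN_DIM = 256          # feature size used in the example
NUM_OBJECT_QUERIES = 300  # queries meant for newborn objects
NUM_TRACK_QUERIES = 2     # objects carried over from frame_0


def random_queries(count, dim, rng):
    """Return `count` placeholder query vectors of length `dim`."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
object_queries = random_queries(NUM_OBJECT_QUERIES, HIDDEN_DIM, rng)
track_queries = random_queries(NUM_TRACK_QUERIES, HIDDEN_DIM, rng)

# Assumed ordering: object queries first, track queries appended at the end.
decoder_queries = object_queries + track_queries

# A boolean mask keeps the two groups distinguishable after concatenation.
is_track = [False] * NUM_OBJECT_QUERIES + [True] * NUM_TRACK_QUERIES

print(len(decoder_queries), len(decoder_queries[0]))  # 302 256
```

Because the decoder receives all 302 queries as one set, nothing in the input itself stops an object query from landing on a tracked object; any suppression has to come from what the decoder learns.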