Open tragians opened 1 year ago
We provide the MOTS20 results for the old model because deformable attention seemed to perform worse for segmentation. Multi-frame and multi-scale training were not part of the old model. However, there is no reason why multi-frame attention could not work for segmentation.
Thank you very much for your detailed answer!
I have a follow-up question about the slightly modified Transformer class you introduce. What is the purpose of the track_attention parameter, and was it used during training?
The track_attention parameter is a legacy parameter and was not used in any of the training runs.
Hi @timmeinhardt, thanks so much for this great work!
While trying to reproduce the results for MOTS20, I noticed some differences between your DeformableDETR and the DETR implementations.
Could you explain the use of args.multi_frame_attention in the adjusted DeformableDETR? I'm wondering why it is not used in the DETR-based model for mask tracking.
Is multi-frame attention not necessary to utilise track queries in the model? I read Section 4.2 of the paper, but I'm still a bit confused.