Open · liuqk3 opened this issue 2 years ago
Answer:

The `query_embed` is the encoding that allows the decoder to differentiate the object queries. We add a zero encoding to the track queries because they are already refined and therefore more easily differentiable by the decoder. However, I agree this might not be the ideal solution. We also tried learning fixed track query encodings and adding the query output from the previous frame as the encoding, but none of those gave better results.
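For illustration, here is a minimal sketch of the three track query encodings discussed above. This is not the repository's actual code; the tensor names, shapes, and the `prev_hs` variable are assumptions.

```python
import torch
import torch.nn as nn

hidden_dim = 256
num_track_queries = 10  # assumed number of queries carried over from the previous frame
prev_hs = torch.randn(num_track_queries, hidden_dim)  # assumed previous-frame decoder output

# Option 1 (used in TrackFormer): a zero encoding for the track queries.
prev_query_embed = torch.zeros(num_track_queries, hidden_dim)

# Option 2 (tried, did not improve results): a learned encoding shared by all track queries.
learned_track_embed = nn.Parameter(torch.zeros(1, hidden_dim))
# prev_query_embed = learned_track_embed.expand(num_track_queries, -1)

# Option 3 (tried, did not improve results): reuse the previous frame's decoder output.
# prev_query_embed = prev_hs
```

Under this reading, the zero encoding keeps track queries distinguishable from the detection queries (whose encoding is learned) without introducing extra parameters.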
Original question (liuqk3):

Hi, thanks for your great work!

I found that the `prev_query_embed` of the track queries in `deformable_transformer.py`
https://github.com/timmeinhardt/trackformer/blob/df70fef0539dc6ebe8ed26bf1ce55dd6e8f87968/src/trackformer/models/deformable_transformer.py#L214
is set to zeros. However, the `query_embed` of the detection queries is learned end-to-end and is in fact the positional embedding. Why did you choose this setting? From the commented-out lines (215-220), it seems you tried different settings for `prev_tgt` and `prev_query_embed`. Does the performance differ a lot between these settings?
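For context on the contrast the question draws, here is a minimal sketch of how the detection queries' encoding is typically learned end-to-end in Deformable-DETR-style models; the exact names in the repository may differ, so treat this as illustrative:

```python
import torch
import torch.nn as nn

num_queries, hidden_dim = 300, 256

# One learned embedding per detection query. In Deformable DETR the embedding has
# width 2 * hidden_dim and is split into a positional part and the initial target.
query_embed = nn.Embedding(num_queries, hidden_dim * 2)
query_pos, tgt = torch.split(query_embed.weight, hidden_dim, dim=1)
```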