microsoft / VideoX

VideoX: a collection of video cross-modal models

The Transformer Head of SeqTrack #103

Closed Tchuanm closed 1 year ago

Tchuanm commented 1 year ago

Hi, @chenxin-dlut

I checked the code and found that there is also an option to feed both the search and template patches into the decoder. Have you done an ablation comparing feeding the template and search patches together vs. feeding only the search patches into the decoder? Does the performance drop noticeably for the former?

Thanks for your work.

chenxin-dlut commented 1 year ago

Hi, this method performs similarly to our default method. You can find the results in the ablation study of our paper.

Tchuanm commented 1 year ago

Great, maybe I missed it in the paper.