LeapLabTHU / Agent-Attention

Official repository of Agent Attention (ECCV2024)

How can it be used as the backbone of a Siamese tracking network? #16

Closed by C-C-Y 8 months ago

C-C-Y commented 8 months ago

May I ask how this can be applied as a weight-sharing backbone in a Siamese tracking network, where the two input images have different sizes? I noticed that the agent_num and window parameters are related to the input image size. How can I set them so that the same backbone handles different input image sizes at the same time?

tian-qing001 commented 8 months ago

Hi @C-C-Y, agent_num is a predetermined hyperparameter and stays constant regardless of image resolution. If you wish to support variable-sized image input, consider modifying the relevant code to dynamically interpolate resolution-dependent model parameters, such as pos_embed, so that they match the current image size. This adaptation has already been implemented in our code for detection and segmentation tasks, where you can find a reference implementation.
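To illustrate the interpolation idea, here is a minimal PyTorch sketch, not the repository's actual code. It assumes a learned positional embedding of shape (1, H*W, C) with no class token; the helper name resize_pos_embed, the 14x14 training grid, and the template/search token grids are all hypothetical choices made for the example.

```python
import torch
import torch.nn.functional as F


def resize_pos_embed(pos_embed: torch.Tensor, old_hw, new_hw) -> torch.Tensor:
    """Bilinearly resize a (1, old_h*old_w, C) embedding to (1, new_h*new_w, C).

    Hypothetical helper: assumes the tokens are laid out row-major on a 2D
    grid and that there is no class token in the sequence.
    """
    old_h, old_w = old_hw
    new_h, new_w = new_hw
    _, n, c = pos_embed.shape
    assert n == old_h * old_w, "token count must match the stated grid size"
    if (old_h, old_w) == (new_h, new_w):
        return pos_embed  # nothing to do at the training resolution
    # (1, N, C) -> (1, C, H, W) so F.interpolate treats H and W as spatial dims
    grid = pos_embed.reshape(1, old_h, old_w, c).permute(0, 3, 1, 2)
    grid = F.interpolate(grid, size=(new_h, new_w), mode="bilinear",
                         align_corners=False)
    # (1, C, H', W') -> (1, H'*W', C)
    return grid.permute(0, 2, 3, 1).reshape(1, new_h * new_w, c)


# One shared embedding (assumed trained on a 14x14 token grid) serves both
# branches of a Siamese tracker; only the interpolated copy differs per branch.
shared_pos_embed = torch.randn(1, 14 * 14, 64)
template = resize_pos_embed(shared_pos_embed, (14, 14), (8, 8))     # small crop
search = resize_pos_embed(shared_pos_embed, (14, 14), (16, 16))     # large crop
print(template.shape, search.shape)
```

Because the learned tensor itself is shared and only resized in the forward pass, both branches keep the same weights, and agent_num does not need to change with the input size.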