LeapLabTHU / Agent-Attention

Official repository of Agent Attention (ECCV2024)

application in cross transformer #12

Open XCZhou520 opened 8 months ago

XCZhou520 commented 8 months ago

Hello,

I've been exploring your work on the Cross Transformer, and I'm intrigued by the potential of integrating Agent Attention into this architecture. Agent Attention, as a method for balancing computational efficiency and representational power, seems like it could complement the Cross Transformer's design quite well.

I'm particularly interested in understanding how Agent Attention might be incorporated into the Cross Transformer framework. Specifically, my questions are:

  1. How many agent tokens would be optimal in the context of Cross Transformer, and how should they be initialized?
  2. In the existing architecture, what would be the best way to integrate agent tokens into the attention mechanism: should they replace or complement the current attention queries and keys? (A rough sketch of my current understanding follows this list.)
  3. Are there any specific considerations or potential challenges you foresee in adapting Agent Attention to this context?
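
To make the questions more concrete, here is a rough sketch of how I currently picture agent tokens fitting into a cross-attention block. All names, shapes, and the pooling-based agent initialization are my own assumptions for discussion, not taken from your code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AgentCrossAttention(nn.Module):
    """Hypothetical agent-style cross-attention block (my own sketch,
    not taken from the Agent-Attention codebase)."""

    def __init__(self, dim, num_heads=8, num_agents=49):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.num_agents = num_agents
        self.q_proj = nn.Linear(dim, dim)
        self.kv_proj = nn.Linear(dim, dim * 2)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x, context):
        # x:       (B, N, C) query tokens from one branch
        # context: (B, M, C) key/value tokens from the other branch
        B, N, C = x.shape
        M = context.shape[1]
        H, D, n = self.num_heads, self.head_dim, self.num_agents

        q = self.q_proj(x).reshape(B, N, H, D).transpose(1, 2)  # (B, H, N, D)
        k, v = self.kv_proj(context).reshape(B, M, 2, H, D).permute(2, 0, 3, 1, 4)  # each (B, H, M, D)

        # Agent tokens pooled from the queries (one possible initialization;
        # learnable agent tokens would be another option -- this relates to question 1).
        a = F.adaptive_avg_pool1d(
            q.reshape(B * H, N, D).transpose(1, 2), n
        ).transpose(1, 2).reshape(B, H, n, D)  # (B, H, n, D)

        # Step 1: agents act as queries and aggregate the context, cost O(n*M).
        agent_v = ((a @ k.transpose(-2, -1)) * self.scale).softmax(dim=-1) @ v  # (B, H, n, D)

        # Step 2: agents act as keys/values and broadcast back to the queries, cost O(N*n).
        out = ((q @ a.transpose(-2, -1)) * self.scale).softmax(dim=-1) @ agent_v  # (B, H, N, D)

        # As I understand it, the paper also adds a depthwise-conv term on v and
        # agent/position biases to restore diversity; I omit them here for brevity.
        return self.out_proj(out.transpose(1, 2).reshape(B, N, C))
```

Does something along these lines match what you would recommend, or would learnable agent tokens and a different agent count be preferable in the cross-attention setting?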

Any insights or suggestions you could provide would be greatly appreciated. I believe such an integration could offer a promising direction for further research and application.

Thank you for your time and for the impactful work you've shared with the community.

Best regards,

tian-qing001 commented 8 months ago

Hi @XCZhou520, thank you for your interest in our work. Could you clarify whether by "Cross Transformer" you mean cross attention in general or a specific work named Cross Transformer? Knowing this will let me give you more accurate and helpful information.