WillDreamer / Aurora

[NeurIPS2023] Parameter-efficient Tuning of Large-scale Multimodal Foundation Model
https://arxiv.org/abs/2305.08381

About Gated Query Transformation #2

Closed 123456789asdfjkl closed 1 year ago

123456789asdfjkl commented 1 year ago

Hi! Thanks for your great work! How is the gated mechanism implemented?

xinlong-yang commented 1 year ago

Thanks for your interest. After transforming t to t' through a scale-and-shift operation, we get the pair (f, t'), each of shape (seq_len, hidden_dim). Then g can be computed with code like the following, where `t_prime` stands for t' (a prime is not valid in a Python identifier): `g = torch.softmax(torch.sum(torch.matmul(f, t_prime.T), dim=1), dim=0).unsqueeze(1)`
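For readers who want to check the shapes, here is a minimal sketch of that one-liner unrolled step by step. It is not taken from the repository. The function name `gate_weights`, the per-channel `scale` and `shift` tensors, and the toy sizes are assumptions made for illustration, and the actual scale-and-shift parameterization in Aurora may differ.

```python
import torch


def gate_weights(f: torch.Tensor, t_prime: torch.Tensor) -> torch.Tensor:
    """Compute g from f and t', both of shape (seq_len, hidden_dim).

    Hypothetical helper: it only unrolls the one-liner quoted in this thread.
    """
    scores = torch.matmul(f, t_prime.T)   # (seq_len, seq_len) pairwise dot products
    row_sums = torch.sum(scores, dim=1)   # (seq_len,) one score per position of f
    g = torch.softmax(row_sums, dim=0)    # (seq_len,) normalized over positions
    return g.unsqueeze(1)                 # (seq_len, 1) so it broadcasts over hidden_dim


torch.manual_seed(0)
seq_len, hidden_dim = 4, 8                # toy sizes, chosen arbitrarily
f = torch.randn(seq_len, hidden_dim)
t = torch.randn(seq_len, hidden_dim)

# Assumed form of the scale-and-shift step: per-channel scale and shift.
scale = torch.ones(hidden_dim)
shift = torch.zeros(hidden_dim)
t_prime = t * scale + shift

g = gate_weights(f, t_prime)
print(tuple(g.shape))
```

The result has one weight per sequence position, and the weights sum to 1 because the softmax is taken over `dim=0`.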

123456789asdfjkl commented 1 year ago

I see. Thanks for your answer!

Arsiuuu commented 1 year ago

@123456789asdfjkl Hello, have you managed to reproduce it successfully?