lucidrains / performer-pytorch

An implementation of Performer, a linear attention-based transformer, in PyTorch
MIT License

Way to make two elements invisible? #79

Open 1140310118 opened 2 years ago

1140310118 commented 2 years ago

Hey, thanks for your contributions to the GitHub community. I was wondering: in linear attention, is there a way to make two elements of the sequence invisible to the rest of it (i.e., masked out)?

In standard self-attention this is easy: you just modify the attention matrix (e.g., set the corresponding entries to -inf before the softmax). But in linear attention I can't find a way to do this, because the attention matrix is never materialized. So I would like to ask whether this can be achieved by modifying q, k, and v instead. I've sketched my current understanding below.
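For concreteness, here is a minimal sketch (my own code, not taken from this repo) of how I imagine key masking could work in non-causal linear attention. Here `q_prime` and `k_prime` stand for the feature-mapped queries and keys (phi(q) and phi(k) in the Performer paper), and `mask` is a hypothetical boolean tensor that is `False` at the positions to hide:

```python
import torch

def masked_linear_attention(q_prime, k_prime, v, mask):
    # q_prime, k_prime: feature-mapped queries/keys, (batch, heads, seq_len, nb_features)
    # v: values, (batch, heads, seq_len, dim_head)
    # mask: (batch, seq_len) boolean, False at the positions to make invisible

    # Zero out the feature-mapped keys at masked positions. Since both the
    # numerator and the denominator below are sums over key positions, a
    # zeroed phi(k_j) removes position j from the attention entirely.
    mask = mask[:, None, :, None]               # (batch, 1, seq_len, 1)
    k_prime = k_prime.masked_fill(~mask, 0.)

    # Numerator: phi(q) @ (phi(k)^T v), without materializing the L x L matrix.
    context = torch.einsum('bhnd,bhne->bhde', k_prime, v)
    out = torch.einsum('bhnd,bhde->bhne', q_prime, context)

    # Denominator: phi(q) @ sum_j phi(k_j), the softmax normalizer.
    denom = torch.einsum('bhnd,bhd->bhn', q_prime, k_prime.sum(dim=-2))
    return out / (denom[..., None] + 1e-8)
```

My reasoning is that zeroing the values `v` alone would only drop the masked positions from the numerator, not from the normalizer, so zeroing the feature-mapped keys seems like the safer option. (If I read the code correctly, the attention modules in this repo already accept a `mask` argument in `forward`, but I may be wrong about its exact semantics, hence the question.)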