lucidrains / performer-pytorch

An implementation of Performer, a linear attention-based transformer, in Pytorch
MIT License

Performer Plain #84

Open Rachel66666 opened 2 years ago

Rachel66666 commented 2 years ago

Hello, I am trying to adapt the plain Performer. What do the second and third dimensions represent? Which one of them is the sequence length? Thank you so much!