lucidrains / performer-pytorch

An implementation of Performer, a linear attention-based transformer, in Pytorch
MIT License
1.08k stars 141 forks

How to test the performer architecture for training new models? #83

Open ayan-iiitd opened 2 years ago

ayan-iiitd commented 2 years ago

The toy example shows the copy task, while the relatively smaller example is a compression problem. Could you please describe how to test Performer attention in a seq2seq model? Specifically, how do I pass a pair of texts, say an English sentence and its French translation, and how does the workflow english_sentence -> tokenized -> embed -> encoder -> decoder -> prediction work? Alternatively, how do we pass an embedding to performer_enc?

ayan-iiitd commented 2 years ago

Also, the PerformerLM class contains the line self.token_emb = nn.Embedding(num_tokens, dim), so I think there must be a way to pass text data to this class for seq2seq.
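
For reference, the first half of the workflow asked about above (sentence -> tokens -> integer ids) can be sketched without any model code. This is only a minimal illustration under stated assumptions: the whitespace tokenizer, the special-token ids, and the helper names `build_vocab`, `encode`, and `pad_batch` are all hypothetical and are not part of performer-pytorch. The point is that an `nn.Embedding(num_tokens, dim)` layer such as the one quoted consumes integer token ids rather than raw text, so text has to be turned into padded id sequences (plus a mask) first.

```python
# Hypothetical toy pipeline: text pair -> token ids -> padded batch + mask.
# None of these names come from performer-pytorch; they are illustrative only.

PAD, BOS, EOS, UNK = 0, 1, 2, 3          # assumed special-token ids
SPECIALS = ["<pad>", "<bos>", "<eos>", "<unk>"]


def build_vocab(sentences):
    """Map every whitespace-separated token to an integer id."""
    vocab = {tok: i for i, tok in enumerate(SPECIALS)}
    for sentence in sentences:
        for tok in sentence.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(sentence, vocab):
    """Turn a sentence into [BOS, id, id, ..., EOS]."""
    ids = [vocab.get(tok, UNK) for tok in sentence.lower().split()]
    return [BOS] + ids + [EOS]


def pad_batch(seqs):
    """Right-pad to the longest sequence; return ids and a boolean mask."""
    max_len = max(len(s) for s in seqs)
    ids = [s + [PAD] * (max_len - len(s)) for s in seqs]
    mask = [[True] * len(s) + [False] * (max_len - len(s)) for s in seqs]
    return ids, mask


pairs = [
    ("the cat sleeps", "le chat dort"),
    ("the dog eats quickly", "le chien mange vite"),
]
src_vocab = build_vocab(en for en, _ in pairs)
tgt_vocab = build_vocab(fr for _, fr in pairs)

src_ids, src_mask = pad_batch([encode(en, src_vocab) for en, _ in pairs])
tgt_ids, tgt_mask = pad_batch([encode(fr, tgt_vocab) for _, fr in pairs])

# Teacher forcing: the decoder reads tokens [:-1] and is trained to
# predict tokens [1:], i.e. the same sequence shifted by one position.
dec_in = [row[:-1] for row in tgt_ids]
dec_target = [row[1:] for row in tgt_ids]

print(src_ids[0], src_mask[0])
print(dec_in[0], dec_target[0])
```

The id lists can then be wrapped in integer tensors (for example with `torch.tensor(src_ids)`) and handed to a model whose first layer is an embedding lookup; `len(src_vocab)` and `len(tgt_vocab)` would play the role of `num_tokens` on each side. Some autoregressive wrappers perform the one-position shift internally, so it is worth checking before shifting by hand. A real setup would normally replace the whitespace split with a subword tokenizer.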