Closed: johndpope closed this issue 4 months ago
Hi, we are a little busy these days rewriting https://github.com/lab-ml/app and I'm down with a fever. We'll look into Performers when we are free. It'll take some time since it's not a paper we are familiar with (I only skimmed through it when it came out).
I was having a brief look at Nystromformer https://arxiv.org/pdf/2102.03902.pdf, which also seems interesting and reports better performance than Performers. What do you think of doing that before Performers?
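For context, the core of the Nystromformer idea is to approximate full softmax attention with a small set of landmark rows and a Moore-Penrose pseudoinverse. A minimal numpy sketch of that idea follows (not the paper's implementation; it uses segment-mean landmarks as the paper suggests, but all names and shapes here are illustrative):

```python
import numpy as np

def softmax(x):
    # numerically stable row-wise softmax
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def nystrom_attention(q, k, v, m):
    # q, k: (n, d); v: (n, dv); m landmarks; n must be divisible by m
    n, d = q.shape
    # landmarks via segment-mean pooling over contiguous chunks
    q_land = q.reshape(m, n // m, d).mean(1)   # (m, d)
    k_land = k.reshape(m, n // m, d).mean(1)   # (m, d)
    s = np.sqrt(d)
    f1 = softmax(q @ k_land.T / s)        # (n, m) queries vs landmark keys
    a  = softmax(q_land @ k_land.T / s)   # (m, m) landmark kernel
    f2 = softmax(q_land @ k.T / s)        # (m, n) landmark queries vs keys
    # Nystrom reconstruction: f1 @ pinv(a) @ f2 approximates softmax(qk^T/sqrt(d))
    return f1 @ np.linalg.pinv(a) @ (f2 @ v)
```

With m equal to n the landmarks collapse to the original rows and the pseudoinverse identity F pinv(F) F = F makes the result match exact attention; with m much smaller than n the cost drops from O(n^2) to roughly O(n m).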
I can't comment on Nystromformer, but there are existing PyTorch libraries for Performer models that could perhaps be cherry-picked and commented: https://github.com/search?q=performer+pytorch&type=repositories
I'm very interested in CrissCross attention; any help bridging the linear algebra maths to the code would be amazing. It is GPU memory friendly, computationally efficient, and achieves state-of-the-art performance: https://github.com/speedinghzl/CCNet
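To sketch the linear algebra in question: criss-cross attention lets each position attend only to positions in its own row and column, so the cost per position is H + W keys instead of H x W. A minimal numpy sketch of that idea (a naive loop for clarity, not the CCNet CUDA implementation; the double-counting of the center pixel is left unhandled here, and all names are illustrative):

```python
import numpy as np

def criss_cross_attention(q, k, v):
    # q, k, v: (H, W, C) feature maps; attend along the criss-cross path
    H, W, C = q.shape
    out = np.zeros_like(v)
    for i in range(H):
        for j in range(W):
            # keys/values from the same row and the same column
            keys = np.concatenate([k[i, :, :], k[:, j, :]], axis=0)  # (W + H, C)
            vals = np.concatenate([v[i, :, :], v[:, j, :]], axis=0)
            scores = keys @ q[i, j] / np.sqrt(C)
            # stable softmax over the W + H criss-cross positions
            w = np.exp(scores - scores.max())
            w /= w.sum()
            out[i, j] = w @ vals
    return out
```

Stacking two such passes (as CCNet does) lets information propagate between any two positions, since a row step followed by a column step connects the whole map, at O(HW(H + W)) cost rather than O((HW)^2).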
Thanks, we'll try to do Performers when we get some free time.
https://www.youtube.com/watch?v=xJrKIPwVwGM&t=767s
https://ai.googleblog.com/2020/10/rethinking-attention-with-performers.html
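The mechanism the talk and blog post above describe (FAVOR+) replaces the softmax kernel with positive random features so attention becomes linear in sequence length. A minimal numpy sketch of that core trick, under the assumption of a single head and Gaussian projections (illustrative names, not the paper's code, and without the orthogonal-feature refinement):

```python
import numpy as np

def favor_positive_features(x, proj):
    # x: (n, d); proj: (d, m) i.i.d. Gaussian projections
    # phi(x) = exp(w.x - |x|^2 / 2) / sqrt(m) gives an unbiased,
    # always-positive estimate of the softmax kernel exp(q.k)
    d = x.shape[-1]
    x = x / d ** 0.25                      # fold in the 1/sqrt(d) scaling
    u = x @ proj                           # (n, m)
    return np.exp(u - (x ** 2).sum(-1, keepdims=True) / 2) / proj.shape[1] ** 0.5

def performer_attention(q, k, v, proj):
    qp = favor_positive_features(q, proj)  # (n, m)
    kp = favor_positive_features(k, proj)  # (n, m)
    # associativity: (qp kp^T) v == qp (kp^T v), so we never form the n x n matrix
    kv = kp.T @ v                          # (m, dv)
    z = qp @ kp.sum(0)                     # (n,) row normalizer
    return (qp @ kv) / z[:, None]
```

Because the features are strictly positive, the implicit attention weights are positive and normalized, so each output row is a convex combination of value rows, as in exact softmax attention, but at O(n m d) cost instead of O(n^2 d).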