Rishit-dagli / Nystromformer

An implementation of the Nyströmformer, using the Nyström method to approximate standard self-attention
Apache License 2.0

Feed Forward layers #4

Closed: Rishit-dagli closed this issue 2 years ago

Rishit-dagli commented 2 years ago

https://github.com/Rishit-dagli/Nystromformer/commit/a4f187579241e202bc046fe8eac58311a264e3f4