Hi,

Thanks for the excellent work!

I found some issues in my humble trials (I didn't change anything in the code):
Using softmax attention on Text4k, I got ~63.7 accuracy instead of the 65.02 you reported in your paper.
With linear attention on Text4k, I got ~64 accuracy, which is even higher than the vanilla transformer. Did you get the same result on your side?
The attention types linformer-256 and nystrom-64 don't run; the errors are either dimension mismatches or config key errors. It seems that not all attention types run successfully with the released code. I haven't tried all of the choices, though.
Thank you for your time, I look forward to your reply~

Ziwei
Are you using the code from LRA? This config file is an example. To run LRA with other attention types, you can modify "attn_type" (see the available attention methods in the code) and add the settings specific to that attention.
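As a minimal sketch of what such a modification might look like (only "attn_type" is mentioned above; the other key names here, such as "num_landmarks" and "linformer_k", are illustrative assumptions and should be checked against the actual config file in the repository):

```python
# Hypothetical excerpt of an LRA config entry. Key names other than
# "attn_type" are illustrative assumptions, not taken verbatim from the repo.
attention_config = {
    "attn_type": "nystrom-64",   # e.g. "softmax", "linear", "linformer-256", "nystrom-64"
    # Attention-specific settings must be added alongside the type;
    # missing keys can lead to config key errors or dimension mismatches
    # like the ones reported above.
    "num_landmarks": 64,         # assumed Nystrom-specific setting
    # "linformer_k": 256,        # assumed Linformer-specific setting
}
```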