lucidrains / performer-pytorch

An implementation of Performer, a linear attention-based transformer, in Pytorch
MIT License
1.08k stars 141 forks

`to_out` bias #65

JamesDeAntonis closed this issue 3 years ago

JamesDeAntonis commented 3 years ago

Hi again,

Could you make it `qkvo_bias` instead? We would also like to have no bias on `to_out`.

Thanks!

lucidrains commented 3 years ago

@JamesDeAntonis interesting! usually the output projection does have a bias - can you point me at an implementation where it does not, or is this some in-house architecture?

lucidrains commented 3 years ago

@JamesDeAntonis released in https://github.com/lucidrains/performer-pytorch/releases/tag/1.0.7 - you can now set `attn_out_bias = False`, for you and your team
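
For readers following along, here is a minimal sketch of what a flag like this typically controls. `TinyAttention` is a hypothetical stand-in written for illustration, not the library's own class, and it uses ordinary softmax attention rather than Performer's linear attention. The parameter names `qkv_bias` and `attn_out_bias` are assumed to mirror the ones discussed in this thread; the only real API relied on is the `bias` argument of `torch.nn.Linear`.

```python
import torch
from torch import nn


class TinyAttention(nn.Module):
    """Hypothetical illustration only, not performer-pytorch's Attention."""

    def __init__(self, dim, heads=4, dim_head=16, qkv_bias=False, attn_out_bias=True):
        super().__init__()
        inner_dim = heads * dim_head
        self.heads = heads
        self.scale = dim_head ** -0.5
        # Input projections share one bias switch.
        self.to_q = nn.Linear(dim, inner_dim, bias=qkv_bias)
        self.to_k = nn.Linear(dim, inner_dim, bias=qkv_bias)
        self.to_v = nn.Linear(dim, inner_dim, bias=qkv_bias)
        # Output projection gets its own switch, independent of q/k/v.
        self.to_out = nn.Linear(inner_dim, dim, bias=attn_out_bias)

    def forward(self, x):
        b, n, _ = x.shape
        h = self.heads
        # Split heads: (b, n, h * d) -> (b, h, n, d)
        q, k, v = (
            proj(x).view(b, n, h, -1).transpose(1, 2)
            for proj in (self.to_q, self.to_k, self.to_v)
        )
        # Plain softmax attention (assumption: stands in for linear attention).
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        # Merge heads: (b, h, n, d) -> (b, n, h * d)
        out = (attn @ v).transpose(1, 2).reshape(b, n, -1)
        return self.to_out(out)


attn = TinyAttention(dim=32, attn_out_bias=False)
y = attn(torch.randn(2, 10, 32))
```

With `attn_out_bias=False`, `attn.to_out.bias` is `None`, so the output projection is a pure matrix multiply. Keeping this switch separate from `qkv_bias` lets the input and output projections be configured independently.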

JamesDeAntonis commented 3 years ago

thanks!