Should bias be enabled in transformer?

OpenNMT / OpenNMT-py

Open Source Neural Machine Translation and (Large) Language Models in PyTorch

https://opennmt.net/

MIT License


Should bias be enabled in transformer? #1173

Closed hfxunlp closed 5 years ago

hfxunlp commented 5 years ago

The bias in nn.Linear is enabled by default, and according to Sockeye, it seems that the bias in MultiHeadAttention and also here should be disabled.

But I am not sure, since I am having difficulty finding the related code in tensor2tensor.
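For readers unfamiliar with the flag in question, here is a minimal sketch of how the bias term is switched off in PyTorch. The layer size and variable names are illustrative assumptions and are not taken from the OpenNMT-py code.

```python
import torch.nn as nn

d_model = 512  # illustrative size, an assumption rather than a value from this thread

# nn.Linear creates a learnable bias vector unless bias=False is passed.
default_proj = nn.Linear(d_model, d_model)
no_bias_proj = nn.Linear(d_model, d_model, bias=False)

# With bias=False the attribute still exists but is set to None.
has_default_bias = default_proj.bias is not None
has_disabled_bias = no_bias_proj.bias is not None
print(has_default_bias, has_disabled_bias)
```

Applying the same `bias=False` argument to each projection inside an attention module would be one way to mirror the bias-free setup described above.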

guillaumekln commented 5 years ago

Bias is indeed not used in the paper, but I don't think this makes an interesting difference. For example, I remember that the reference implementation (tensor2tensor) did use a bias at some point.
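To give a rough sense of scale, the sketch below counts how many parameters the bias terms add to the four projections of one attention layer. It assumes square `d_model x d_model` projections and `d_model = 512`; the function name is hypothetical. A parameter count does not say anything directly about translation quality, so this is only a back-of-the-envelope illustration.

```python
def attention_projection_params(d_model: int, use_bias: bool) -> int:
    """Parameters in the query, key, value and output projections of one
    attention layer, assuming each is a square d_model x d_model linear map."""
    num_projections = 4
    weights = num_projections * d_model * d_model
    biases = num_projections * d_model if use_bias else 0
    return weights + biases


d_model = 512  # assumed model size for illustration
with_bias = attention_projection_params(d_model, use_bias=True)
without_bias = attention_projection_params(d_model, use_bias=False)

extra = with_bias - without_bias   # parameters contributed by the biases
bias_share = extra / with_bias     # fraction of the layer's projection parameters
print(extra, f"{bias_share:.2%}")
```

Under these assumptions the biases account for well under one percent of the projection parameters in each attention layer.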

hfxunlp commented 5 years ago

@guillaumekln got it. Thank you for your reply.