Question about the attention score computation

fastnlp / TENER

Codes for "TENER: Adapting Transformer Encoder for Named Entity Recognition"


Open wangyuehu opened 3 years ago

wangyuehu commented 3 years ago

How should I understand `E_` on line 127 of relative_transformer.py? From reading the paper, there does not seem to be such a term in the attention score computation.

yhcc commented 3 years ago

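Right, this term is not in the paper. We found empirically that it makes training more stable, so we added it in the newer version of the code. You can think of it this way: the model needs to know the relative position between the current key and the query in order to decide the bias for that key.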
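
To make the extra term concrete, here is a minimal sketch in plain Python. It is not the repository's code. It assumes the paper's four-term score, Q_t·K_j + Q_t·R_{t-j} + u·K_j + v·R_{t-j}, and adds a fifth term, K_j·R_{t-j}, as one possible reading of the reply above. The names `rel_pos_embedding`, `attention_scores` and `with_key_bias`, the sign convention `t - j`, and the exact form of the added term are all assumptions made for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rel_pos_embedding(offset, dim):
    """Sinusoidal embedding of a signed relative offset (hypothetical helper)."""
    emb = []
    for i in range(dim // 2):
        freq = 1.0 / (10000 ** (2 * i / dim))
        emb.extend([math.sin(offset * freq), math.cos(offset * freq)])
    return emb


def attention_scores(Q, K, u, v, with_key_bias=True):
    """Unnormalized relative attention scores for one head.

    Q, K: lists of query/key vectors, one per position.
    u, v: bias vectors (learned in a real model, fixed here).
    with_key_bias: if True, add the assumed extra key-position term.
    """
    n, dim = len(Q), len(Q[0])
    scores = [[0.0] * n for _ in range(n)]
    for t in range(n):          # query position
        for j in range(n):      # key position
            r = rel_pos_embedding(t - j, dim)
            s = dot(Q[t], K[j])  # content-content
            s += dot(Q[t], r)    # query-position
            s += dot(u, K[j])    # global content bias
            s += dot(v, r)       # global position bias
            if with_key_bias:
                # Assumed form of the extra term: the key's own bias
                # given its position relative to the query.
                s += dot(K[j], r)
            scores[t][j] = s
    return scores


Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[0.5, 0.5], [1.0, -1.0]]
u = [0.1, 0.2]
v = [0.3, 0.4]

paper_scores = attention_scores(Q, K, u, v, with_key_bias=False)
code_scores = attention_scores(Q, K, u, v, with_key_bias=True)
```

The two score matrices differ only by the key-position term, so the addition acts as a per-key bias that depends on where the key sits relative to the query.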

iamqiz commented 2 years ago

So that's how it is. I stared at it for ages; luckily I came to GitHub and found this issue > <