thunlp / GEAR

Source code for ACL 2019 paper "GEAR: Graph-based Evidence Aggregating and Reasoning for Fact Verification"
MIT License
98 stars, 25 forks

The number of MLPs in ERNet #12

Closed iambabao closed 4 years ago

iambabao commented 4 years ago

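The paper says that ERNet uses an MLP to compute attention. As I read the paper, each ERNet layer has two parameters, $W_{0}^{t}$ and $W_{1}^{t}$, for this MLP. In the code, however, it looks like two parameters $W_{0}$ and $W_{1}$ are initialized for every node in every ERNet layer: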

# each SelfAttentionLayer contains two Linear layers
self.attentions = [SelfAttentionLayer(nhid=nhid * 2, nins=nins) for _ in range(nins)]

So the MLP parameters are not shared within a layer?

jayzzhou-thu commented 4 years ago

@iambabao Right, a separate attention is computed for each node here. The description in the paper may have left out this detail; sorry for the ambiguity.
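
For readers comparing the two readings, here is a minimal pure-Python sketch of the difference. It is not the repository's code. It assumes the attention score for a node pair is a two-layer MLP over the concatenated node vectors, i.e. $p_{ij} = W_{1}\,\mathrm{ReLU}(W_{0}[h_i ; h_j])$, followed by a softmax over $j$. The helper names (`make_mlp`, `score`, `ernet_layer`) and the sizes are hypothetical and chosen only for illustration.

```python
import math
import random


def make_mlp(in_dim, hidden_dim, rng):
    """Create one two-layer MLP (hypothetical W0 and W1) as plain lists."""
    w0 = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(hidden_dim)]
    w1 = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
    return w0, w1


def score(mlp, h_i, h_j):
    """Assumed score: p_ij = W1 . ReLU(W0 . [h_i ; h_j])."""
    w0, w1 = mlp
    x = h_i + h_j  # list concatenation of the two node vectors
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w0]
    return sum(w * h for w, h in zip(w1, hidden))


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def ernet_layer(nodes, mlps):
    """Update every node; mlps[i] is the MLP used when node i attends to all nodes."""
    out = []
    for i, h_i in enumerate(nodes):
        alphas = softmax([score(mlps[i], h_i, h_j) for h_j in nodes])
        out.append([sum(a * h_j[d] for a, h_j in zip(alphas, nodes))
                    for d in range(len(h_i))])
    return out


rng = random.Random(0)
nins, nhid, hidden_dim = 5, 4, 8  # hypothetical sizes
nodes = [[rng.uniform(-1, 1) for _ in range(nhid)] for _ in range(nins)]

# Per-node reading (what the reply describes): one independent MLP per node.
per_node_mlps = [make_mlp(2 * nhid, hidden_dim, rng) for _ in range(nins)]

# Shared reading (one W0, W1 per layer): the same MLP reused for every node.
shared_mlp = make_mlp(2 * nhid, hidden_dim, rng)
shared_mlps = [shared_mlp] * nins

per_node_out = ernet_layer(nodes, per_node_mlps)
shared_out = ernet_layer(nodes, shared_mlps)

# Number of distinct parameter sets in each variant.
n_per_node = len({id(m) for m in per_node_mlps})
n_shared = len({id(m) for m in shared_mlps})
```

Both variants produce one updated vector per node with the same dimensionality. The only difference is how many distinct $(W_{0}, W_{1})$ pairs a layer holds: `nins` in the per-node variant and one in the shared variant.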