bys0318 / SQUIRE

EMNLP 22' (Oral): SQUIRE: A Sequence-to-sequence Framework for Multi-hop Knowledge Graph Reasoning

Question about the label smoothing #8

Closed: RongchuanTang closed this issue 9 months ago

RongchuanTang commented 10 months ago

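Hi, in the label smoothing implemented in model.TransformerModel.get_loss, does the product term (1 - self.label_smooth) / (self.ntoken - 1) * lprobs.sum(dim=-1) include the log probability of the target token? That seems inconsistent with the formula in the paper. I would appreciate a clarification, thanks~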

bys0318 commented 10 months ago

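Yes, your observation is correct. Because the coefficient (1 - self.label_smooth)/(self.ntoken - 1) is very small (roughly one ten-thousandth of self.label_smooth), we did not bother to subtract the target token's log probability from lprobs.sum(dim=-1). The effect on the results is negligible, so you can read the code as implementing the formula in the paper.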
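
For readers who want to check the size of the discrepancy, here is a minimal pure-Python sketch (not the repository's code) comparing the two variants for a single position. The vocabulary size, the label_smooth value, the logits, and the helper names log_softmax and smoothed_loss are all hypothetical choices for illustration. The two losses differ by exactly (1 - label_smooth) / (ntoken - 1) times the negative log probability of the target token.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a plain list of floats.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def smoothed_loss(lprobs, target, label_smooth, exclude_target):
    # label_smooth weights the target token; the remaining mass is
    # spread with coefficient (1 - label_smooth) / (ntoken - 1).
    ntoken = len(lprobs)
    coeff = (1 - label_smooth) / (ntoken - 1)
    total = sum(lprobs)
    if exclude_target:
        # Variant described as matching the paper: sum over non-target tokens only.
        total -= lprobs[target]
    return -(label_smooth * lprobs[target] + coeff * total)

# Hypothetical toy values, not taken from the repository.
ntoken = 10000
label_smooth = 0.75
target = 3
logits = [0.0] * ntoken
logits[target] = 5.0

lprobs = log_softmax(logits)
approx = smoothed_loss(lprobs, target, label_smooth, exclude_target=False)
exact = smoothed_loss(lprobs, target, label_smooth, exclude_target=True)

coeff = (1 - label_smooth) / (ntoken - 1)
gap = approx - exact  # equals -coeff * lprobs[target]
print(f"approx={approx:.6f} exact={exact:.6f} gap={gap:.2e}")
```

With a vocabulary in the thousands the coefficient is tiny, so the gap is orders of magnitude smaller than the loss itself, which is consistent with the explanation above.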

RongchuanTang commented 10 months ago

Wow, got it, thanks 😜