shmsw25 / AmbigQA

An original implementation of the EMNLP 2020 paper "AmbigQA: Answering Ambiguous Open-domain Questions"
https://arxiv.org/abs/2004.10645

Questions about the `_take_mml` function #19

Closed by WenzhengZhang 3 years ago

WenzhengZhang commented 3 years ago

Hi, thanks for sharing your work! I'm a little confused about the `_take_mml` function. Could you tell me why you add `- 1e10 * (loss_tensor==0).float()` here? Thanks!

shmsw25 commented 3 years ago

Thanks for asking. MML is supposed to calculate `-log \sum_y P(y)` and compute the cross-entropy loss based on it. In the implementation, `loss_tensor` contains the cross-entropy loss for every candidate answer y, which is equivalent to `-log P(y)`. So we negate it and apply `exp` to recover `P(y)`, sum over the answer candidates, and then apply `log` and negate again to convert the result back into a valid cross-entropy loss.

This doesn't explain `- 1e10 * (loss_tensor==0).float()` yet. That term is there to remove the impact of dummy answers, whose losses are 0. (See this line for how they are set to zero.) If we simply applied `exp`, those zeros would be converted to ones, so each dummy would contribute probability 1 to the sum. Instead we add a large negative value to those entries before applying `exp`, so that they become zeros once `exp` is applied.
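For concreteness, here is a minimal sketch of the idea in PyTorch (not the repository's exact code). It assumes `loss_tensor` has shape `(batch, num_candidates)` and that dummy candidates carry a loss of exactly 0:

```python
import torch

def take_mml_sketch(loss_tensor):
    # loss_tensor[i, j] = cross-entropy loss -log P(y_j) for candidate j of
    # example i; dummy (padding) candidates were set to exactly 0 upstream.
    # exp(-loss) recovers P(y). Without the mask, exp(-0) = 1 would count
    # each dummy candidate as probability 1; subtracting 1e10 for entries
    # whose loss is 0 makes exp return 0 for them instead.
    marginal_likelihood = torch.sum(
        torch.exp(-loss_tensor - 1e10 * (loss_tensor == 0).float()), dim=1)
    # Apply log and negate to turn the marginal probability back into a loss.
    # (A real implementation should also guard against log(0) for examples
    # where every candidate is a dummy.)
    return -torch.log(marginal_likelihood).mean()

# One real candidate with loss 2.0, one with loss 3.0, one dummy (loss 0).
losses = torch.tensor([[2.0, 3.0, 0.0]])
# Unmasked sum: exp(-2) + exp(-3) + exp(0) ≈ 1.185 (the dummy dominates).
# Masked sum:   exp(-2) + exp(-3)          ≈ 0.185 (dummy contributes 0).
print(take_mml_sketch(losses))  # ≈ 1.687, i.e. -log(0.185)
```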

WenzhengZhang commented 3 years ago

Thanks a lot for your reply! It's very helpful.