Closed by Jexxie 3 years ago
Thank you for your interest in our work! Below is the code snippet for the ranking loss:
This function indeed computes the loss described in our paper.
To see this, focus on the score of a single candidate summary (e.g. the first one, which has the best ROUGE score) and work out which scores it is compared with (line 21) as the loop (line 14) runs. By the time the loop completes, the score of the first candidate has been compared with those of all the other candidates, and the margin in each comparison corresponds to `j - i` (in this special case, `i = 0`).
Thanks for your reply!! But there are still some questions.
I'll answer the second question first as I think it would be helpful for you to understand the first question.
Indeed, some scores are chosen twice, but they play a different role each time: a score is compared either with a score higher than it or with a score lower than it. In line 21, for every pair of corresponding items in `pos_score` and `neg_score`, the item of `pos_score` is always greater than the item of `neg_score`.
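The pairing described above can be traced with a small sketch. It assumes the candidates are ordered best first and that, for each offset, the code pairs the scores with the last `offset` entries dropped against the scores with the first `offset` entries dropped. The helper `shifted_pairs` is hypothetical and only lists index pairs; it is not taken from the repository.

```python
def shifted_pairs(n):
    """Hypothetical helper: for each offset, list the (pos_index, neg_index)
    pairs obtained by aligning indices[:-offset] with indices[offset:]."""
    return {
        offset: [(k, k + offset) for k in range(n - offset)]
        for offset in range(1, n)
    }


pairs = shifted_pairs(4)
for offset, index_pairs in pairs.items():
    # The first index in every pair is the better-ranked candidate.
    print(offset, index_pairs)
```

With four candidates, index 1 appears as the second element of `(0, 1)` and as the first element of `(1, 2)`, which is the "chosen twice, in different roles" situation.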
As for the first question, I hope this hint helps you come up with the answer yourself: in line 20, `i` serves the same purpose as `j - i` in the equation in the paper.
Another hint may also help: the loop (line 14) in the code does not correspond to the outer summation in the equation. Rather, the summation step for a specific value of `i` in the equation is only finished once the loop in the code has completed.
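The two hints can be checked numerically with a minimal sketch. It assumes a hinge-style margin ranking loss with sum reduction and plain Python lists in place of tensors; the actual code may normalize differently. The function names, the offset variable `d` (playing the role of the code's loop variable), and the sample scores are all illustrative, not taken from the repository.

```python
def loss_by_offset(scores, lam):
    """Loop over offsets d: pair scores[:-d] with scores[d:], margin d * lam."""
    total = 0.0
    for d in range(1, len(scores)):
        pos, neg = scores[:-d], scores[d:]
        total += sum(max(0.0, q - p + d * lam) for p, q in zip(pos, neg))
    return total


def loss_by_pairs(scores, lam):
    """Double summation over i < j with margin (j - i) * lam."""
    n = len(scores)
    return sum(
        max(0.0, scores[j] - scores[i] + (j - i) * lam)
        for i in range(n)
        for j in range(i + 1, n)
    )


scores = [0.9, 0.4, 0.6, 0.1]  # illustrative scores, candidates ordered best first
by_offset = loss_by_offset(scores, 0.25)
by_pairs = loss_by_pairs(scores, 0.25)
print(abs(by_offset - by_pairs) < 1e-9)
```

Both functions visit the same set of pairs, only in a different order: a fixed offset `d` collects every pair with `j - i = d`, so the margin `d * lam` equals `(j - i) * lam` for each of them.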
Please let me know if you have more questions! :)
Thanks for your excellent work. I have a question about the loss computation. Is there any difference between the loss function in the paper and the one in the code? The loss function in the paper:
But the code part:
It seems the code just computes `+ i * lambda` instead of `+ (j - i) * lambda`. Did I miss something?