Closed — FortuneSeeker closed this issue 3 years ago
Hi, it is because we use different ways to compute ROUGE-L in the two papers (ROUGE-Lsum vs. ROUGE-Lsent). You will also find that the ROUGE-L of the Ext_summ_LG model is much higher than in the original paper. Specifically, the difference is whether the sentences in the ground-truth and generated summaries are joined with '\n' or ' '. In the former case, ROUGE-L is computed at the sentence level, which yields the ROUGE-L reported in this paper; in the latter case, ROUGE-L is computed at the summary level, yielding the ROUGE-L reported in the original ext_summ_lg paper.
To answer in one sentence: we use exactly the same oracle as in the paper 'Extractive Summarization of Long Documents by Combining Global and Local Context', but with a different way to compute ROUGE-L.
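To make the distinction concrete, here is a small illustrative sketch (not the authors' exact evaluation code, and simpler than the union-LCS used by the `rouge-score` package, but similar in spirit): joining sentences with ' ' gives one LCS over the whole summary, while keeping sentences separate (as if joined with '\n') scores each reference sentence independently. The example sentences below are made up for illustration.

```python
# Illustrative sketch: summary-level ROUGE-L (' '-joined) vs. a simple
# sentence-level ROUGE-L ('\n'-separated). Hypothetical data, simplified
# sentence-level aggregation (max match per reference sentence, averaged).

def lcs_len(a, b):
    # classic dynamic-programming longest common subsequence length
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]

def rouge_l_f1(ref_tokens, hyp_tokens):
    lcs = lcs_len(ref_tokens, hyp_tokens)
    if lcs == 0:
        return 0.0
    p, r = lcs / len(hyp_tokens), lcs / len(ref_tokens)
    return 2 * p * r / (p + r)

def summary_level(ref_sents, hyp_sents):
    # sentences joined with ' ': one LCS over the whole summary
    return rouge_l_f1(' '.join(ref_sents).split(), ' '.join(hyp_sents).split())

def sentence_level(ref_sents, hyp_sents):
    # sentences kept separate: best-matching LCS per reference sentence, averaged
    scores = [max(rouge_l_f1(r.split(), h.split()) for h in hyp_sents)
              for r in ref_sents]
    return sum(scores) / len(scores)

ref = ["the model improves rouge", "results hold on arxiv"]
hyp = ["results hold on arxiv", "the model improves rouge"]
# Reordering the sentences hurts the single summary-level LCS but not the
# per-sentence matching, so the sentence-level score comes out higher.
print(summary_level(ref, hyp))   # -> 0.5
print(sentence_level(ref, hyp))  # -> 1.0
```

This is why the same oracle extracts can receive noticeably different ROUGE-L numbers depending only on the sentence delimiter passed to the scorer.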
oh, I got it! Thank you for the detailed explanation!
Hi, thanks for sharing the code. I have noticed that the ORACLE ROUGE-L scores on the two datasets (PubMed & arXiv) in this paper are quite different from those in your other paper, "Extractive Summarization of Long Documents by Combining Global and Local Context". Does this difference come from the way the ORACLE is generated? Could you please explain? Thanks!