kavgan / ROUGE-2.0

ROUGE automatic summarization evaluation toolkit. Support for ROUGE-[N, L, S, SU], stemming and stopwords in different languages, unicode text evaluation, CSV output.
https://kavgan.github.io/ROUGE-2.0
Apache License 2.0

What is the difference between ROUGE 2 and ROUGE 1.5.5? #18

Closed: yogesh-iitj closed this issue 4 years ago

yogesh-iitj commented 5 years ago

We are getting different results from ROUGE 2.0 and ROUGE 1.5.5 (mainly for ROUGE-L) when evaluating the same summaries with the same property settings in both versions.

kavgan commented 5 years ago

Do you mean the Perl vs. the Java version? Can you paste the differences?

I know that ROUGE 1.5.5 adds a token before or after the end of a sentence, which can change the scores a little. Are your conclusions about your models still the same, though?
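
As a toy illustration of how such an extra boundary token can move the numbers (this is not the code of either toolkit; the `</s>` marker, the example sentences, and the tokenization are assumptions made only for this sketch), here is a minimal Java example that computes ROUGE-2 recall for one candidate/reference pair with and without an end-of-sentence token appended to the reference:

```java
import java.util.*;

// Toy illustration (not from either toolkit) of how an extra sentence-boundary
// token can nudge n-gram overlap counts, and therefore ROUGE-N, slightly.
public class BoundaryTokenDemo {

    // Collect bigram counts for a token sequence.
    static Map<String, Integer> bigrams(List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            counts.merge(tokens.get(i) + " " + tokens.get(i + 1), 1, Integer::sum);
        }
        return counts;
    }

    // ROUGE-2 recall = overlapping bigrams / bigrams in the reference.
    static double rouge2Recall(List<String> system, List<String> reference) {
        Map<String, Integer> sys = bigrams(system);
        int overlap = 0, total = 0;
        for (Map.Entry<String, Integer> e : bigrams(reference).entrySet()) {
            total += e.getValue();
            overlap += Math.min(e.getValue(), sys.getOrDefault(e.getKey(), 0));
        }
        return total == 0 ? 0.0 : (double) overlap / total;
    }

    public static void main(String[] args) {
        List<String> sys = Arrays.asList("the", "cat", "sat", "on", "the", "mat");
        List<String> ref = Arrays.asList("the", "cat", "was", "on", "the", "mat");

        System.out.printf("plain tokens     : %.4f%n", rouge2Recall(sys, ref));

        // Hypothetical sentence-boundary marker appended to the reference only,
        // mimicking a tokenization difference between two implementations.
        List<String> refWithEos = new ArrayList<>(ref);
        refWithEos.add("</s>");
        System.out.printf("reference + </s> : %.4f%n", rouge2Recall(sys, refWithEos));
    }
}
```

Appending a single token adds one bigram to the reference, so recall drops from 3/5 to 3/6 even though the summary itself has not changed; small differences of this kind accumulate into small score shifts across a corpus.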

yogesh-iitj commented 5 years ago

Ma'am, we are using ROUGE as a black box for evaluating summaries. We used the DUC 2002 dataset (533 documents) to generate summaries with our algorithm. These are the results we are getting from ROUGE 2.0 and ROUGE 1.5.5.

From ROUGE 2.0:

| Metric | Avg_Recall | Avg_Precision | Avg_F-Score |
|---------|-------------|---------------|-------------|
| ROUGE-L | 0.280386454 | 0.252365816 | 0.264839193 |
| ROUGE-1 | 0.458032477 | 0.378717017 | 0.413726829 |
| ROUGE-2 | 0.196267073 | 0.159891932 | 0.175813246 |

From ROUGE 1.5.5:

| Metric | Avg_Recall | Avg_Precision | Avg_F-Score |
|---------|-------------|---------------|-------------|
| ROUGE-L | 0.436332474 | 0.359877839 | 0.393466 |
| ROUGE-1 | 0.478281 | 0.394834 | 0.431531 |
| ROUGE-2 | 0.200555 | 0.164635 | 0.180384 |

We see the same pattern on other datasets too, so either ROUGE 1.5.5 is overestimating the scores or ROUGE 2.0 is underestimating them. We are not sure which version to use, since we have to compare our algorithm against existing ones and we don't know which version of ROUGE other authors used.
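
For context, ROUGE-L scores a candidate against a reference via the longest common subsequence (LCS), and implementations can differ in details such as tokenization, sentence splitting, and how per-document scores are aggregated, which is one plausible source of a gap like the one between the two ROUGE-L rows above. Below is a minimal, self-contained sketch of the basic single-pair computation only (the example sentences are assumptions; this is not the code of either toolkit):

```java
import java.util.*;

// Minimal sketch of LCS-based ROUGE-L for one candidate/reference pair, using the
// standard definition: recall = LCS/|reference|, precision = LCS/|candidate|.
// Neither toolkit's sentence handling, stemming, or averaging is reproduced here.
public class RougeLSketch {

    // Classic dynamic-programming LCS length.
    static int lcs(List<String> a, List<String> b) {
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                dp[i][j] = a.get(i - 1).equals(b.get(j - 1))
                        ? dp[i - 1][j - 1] + 1
                        : Math.max(dp[i - 1][j], dp[i][j - 1]);
            }
        }
        return dp[a.size()][b.size()];
    }

    public static void main(String[] args) {
        List<String> candidate = Arrays.asList("police", "killed", "the", "gunman");
        List<String> reference = Arrays.asList("police", "kill", "the", "gunman");

        double lcsLen = lcs(candidate, reference);
        double recall = lcsLen / reference.size();
        double precision = lcsLen / candidate.size();
        double f = (recall + precision == 0) ? 0.0
                : 2 * recall * precision / (recall + precision);

        System.out.printf("ROUGE-L  R=%.4f  P=%.4f  F=%.4f%n", recall, precision, f);
    }
}
```

Here the LCS ("police", "the", "gunman") has length 3, giving R = P = F = 0.75 for this pair.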

kavgan commented 5 years ago

Could you send me a small subset of your output and reference summaries, along with the results on that subset from both ROUGE 2.0 and ROUGE 1.5.5? I'm going to look into it.

If you are looking at previous results on DUC, they were most probably generated with 1.5.5 (the Perl version). You would have to confirm with the authors of the papers you are comparing against/referencing.

yogesh-iitj commented 5 years ago

Ma'am, please find the attached zip file.
