kavgan / ROUGE-2.0

ROUGE automatic summarization evaluation toolkit. Support for ROUGE-[N, L, S, SU], stemming and stopwords in different languages, unicode text evaluation, CSV output.
https://kavgan.github.io/ROUGE-2.0
Apache License 2.0

Incorrect LCS implementation #22

Closed FilipStefaniuk closed 4 years ago

FilipStefaniuk commented 4 years ago

The LCS algorithm is implemented incorrectly (hence the large ROUGE-L differences in #18). The function at https://github.com/kavgan/ROUGE-2.0/blob/26092bd65f2cbf5e7ffbe2f23740bb95f819b063/src/com/rxnlp/tools/rouge/ROUGECalculator.java#L907 should not search for the longest common subsequence greedily. In the following example the system summary is identical to the reference (diff prints nothing), so ROUGE-L should equal 1.0:

f.stefaniuk@AMDC3754:~/Documents/other/rouge2.0$ cat ./projects/test-lcs/reference/task1_ref1.txt 
token0 xxxx token1 token2 token3 token4 xxxx token5
f.stefaniuk@AMDC3754:~/Documents/other/rouge2.0$ diff ./projects/test-lcs/reference/task1_ref1.txt ./projects/test-lcs/system/task1_system1.txt 
f.stefaniuk@AMDC3754:~/Documents/other/rouge2.0$ java -jar ./target/rouge-calculator-1.2.1-shaded.jar 

========Results Summary=======

ROUGE-L+StopWordRemoval TASK1   SYSTEM1.TXT Average_R:0.66667   Average_P:0.66667   Average_F:0.66667   Num Reference Summaries:1

======Results Summary End======

The correct results (using ROUGE-1.5.5 with pyrouge):

pyrouge_evaluate_plain_text_files -s ./projects/test-lcs/system/ -sfp "task(\d+)_system1.txt" -m ./projects/test-lcs/reference/ -mfp task#ID#_ref1.txt
2019-11-27 19:38:56,486 [MainThread  ] [INFO ]  Writing summaries.
2019-11-27 19:38:56,488 [MainThread  ] [INFO ]  Processing summaries. Saving system files to /tmp/tmpmalj10/system and model files to /tmp/tmpmalj10/model.
2019-11-27 19:38:56,488 [MainThread  ] [INFO ]  Processing files in ./projects/test-lcs/system/.
2019-11-27 19:38:56,488 [MainThread  ] [INFO ]  Processing task1_system1.txt.
2019-11-27 19:38:56,488 [MainThread  ] [INFO ]  Saved processed files to /tmp/tmpmalj10/system.
2019-11-27 19:38:56,488 [MainThread  ] [INFO ]  Processing files in ./projects/test-lcs/reference/.
2019-11-27 19:38:56,488 [MainThread  ] [INFO ]  Processing task1_ref1.txt.
2019-11-27 19:38:56,488 [MainThread  ] [INFO ]  Saved processed files to /tmp/tmpmalj10/model.
2019-11-27 19:38:56,489 [MainThread  ] [INFO ]  Written ROUGE configuration to /tmp/tmp7pU4tY/rouge_conf.xml
2019-11-27 19:38:56,489 [MainThread  ] [INFO ]  Running ROUGE with command /usr/local/lib/RELEASE-1.5.5/ROUGE-1.5.5.pl -e /usr/local/lib/RELEASE-1.5.5/data -c 95 -2 -1 -U -r 1000 -n 4 -w 1.2 -a -m /tmp/tmp7pU4tY/rouge_conf.xml
---------------------------------------------
1 ROUGE-1 Average_R: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-1 Average_P: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-1 Average_F: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
---------------------------------------------
1 ROUGE-2 Average_R: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-2 Average_P: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-2 Average_F: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
---------------------------------------------
1 ROUGE-3 Average_R: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-3 Average_P: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-3 Average_F: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
---------------------------------------------
1 ROUGE-4 Average_R: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-4 Average_P: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-4 Average_F: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
---------------------------------------------
1 ROUGE-L Average_R: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-L Average_P: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-L Average_F: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
---------------------------------------------
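For comparison, a non-greedy LCS can be computed with the standard dynamic-programming table. The following is only a minimal sketch, assuming plain whitespace tokenization and no stemming or stop-word removal; the class and method names (LcsSketch, lcsLength, tokenize, recall, precision) are hypothetical and are not taken from ROUGECalculator.java or from the fix in #23.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not the code in ROUGECalculator.java.
final class LcsSketch {

    // Length of the longest common subsequence of two token lists,
    // filled in with the standard O(n*m) dynamic-programming table.
    static int lcsLength(List<String> a, List<String> b) {
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                if (a.get(i - 1).equals(b.get(j - 1))) {
                    // Tokens match: extend the best subsequence of both prefixes.
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    // No match: keep the better result of dropping one token.
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[a.size()][b.size()];
    }

    // Assumption: tokens are separated by whitespace, nothing else is normalized.
    static List<String> tokenize(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // LCS-based recall: LCS length divided by the reference length.
    static double recall(List<String> reference, List<String> system) {
        return reference.isEmpty() ? 0.0
                : (double) lcsLength(reference, system) / reference.size();
    }

    // LCS-based precision: LCS length divided by the system summary length.
    static double precision(List<String> reference, List<String> system) {
        return system.isEmpty() ? 0.0
                : (double) lcsLength(reference, system) / system.size();
    }

    public static void main(String[] args) {
        // Same text as task1_ref1.txt and task1_system1.txt in the report above.
        String text = "token0 xxxx token1 token2 token3 token4 xxxx token5";
        List<String> reference = tokenize(text);
        List<String> system = tokenize(text);
        System.out.println("LCS length: " + lcsLength(reference, system));
        System.out.println("Recall:     " + recall(reference, system));
        System.out.println("Precision:  " + precision(reference, system));
    }
}
```

For two identical token sequences the table ends with an LCS length equal to the sequence length, so recall and precision both come out as 1.0, in line with the ROUGE-1.5.5 output above.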
kavgan commented 4 years ago

Please see the fixed LCS implementation in #23.