bheinzerling / pyrouge

A Python wrapper for the ROUGE summarization evaluation package
MIT License

Pyrouge takes so much time to give the results #26

Open sajastu opened 5 years ago

sajastu commented 5 years ago

Hey, I ran the package following the instructions on GitHub and the website, but it takes a very long time to produce the ROUGE scores (roughly 20 to 25 minutes!).

This is my code, which is essentially copied from the documentation:

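from pyrouge import Rouge155

r = Rouge155()
r.system_dir = 'rad_decoder_sys/'  # directory with the generated summaries
r.model_dir = 'rad_decoder_ref/'   # directory with the reference summaries
# Raw string so that \d reaches the regex engine unchanged.
r.system_filename_pattern = r'rademb-preds.(\d+).txt'
r.model_filename_pattern = 'rademb-gold.#ID#.txt'

output = r.convert_and_evaluate()
print(output)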

When I run the script above, I get these logs:

2018-12-07 17:48:23,520 [MainThread ] [INFO ] Writing summaries.
2018-12-07 17:48:23,522 [MainThread ] [INFO ] Processing summaries. Saving system files to /var/folders/gz/4hxz5p9d235bxgbmb4phkk_00000gp/T/tmpre5wi1ih/system and model files to /var/folders/gz/4hxz5p9d235bxgbmb4phkk_00000gp/T/tmpre5wi1ih/model.
2018-12-07 17:48:23,522 [MainThread ] [INFO ] Processing files in rad_decoder_sys/.
2018-12-07 17:48:23,527 [MainThread ] [INFO ] Processing rademb-p.3614.txt.
2018-12-07 17:48:23,528 [MainThread ] [INFO ] Processing rademb-p.1003.txt.
...
2018-12-07 17:48:29,831 [MainThread ] [INFO ] Saved processed files to /var/folders/gz/4hxz5p9d235bxgbmb4phkk_00000gp/T/tmpre5wi1ih/model.
2018-12-07 17:48:57,667 [MainThread ] [INFO ] Written ROUGE configuration to /var/folders/gz/4hxz5p9d235bxgbmb4phkk_00000gp/T/tmp4z2m9wl7/rouge_conf.xml
2018-12-07 17:48:57,667 [MainThread ] [INFO ] Running ROUGE with command /Users/ss4164/PycharmProjects/rouge_calculator/official_rouge/RELEASE-1.5.5/ROUGE-1.5.5.pl -e /Users/ss4164/PycharmProjects/rouge_calculator/official_rouge/RELEASE-1.5.5/data -c 95 -2 -1 -U -r 1000 -n 4 -w 1.2 -a -m /var/folders/gz/4hxz5p9d235bxgbmb4phkk_00000gp/T/tmp4z2m9wl7/rouge_conf.xml
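
To see where the time goes, the timestamps in these logs can be subtracted from one another. The following is only a minimal sketch using the Python standard library. The helper names `parse_timestamp` and `stage_durations` are made up for illustration and are not part of pyrouge, and the log messages are shortened copies of three of the entries above.

```python
from datetime import datetime

# Three entries copied from the log above; long paths are cut off with "...".
LOG_LINES = [
    "2018-12-07 17:48:23,520 [MainThread ] [INFO ] Writing summaries.",
    "2018-12-07 17:48:29,831 [MainThread ] [INFO ] Saved processed files to ...",
    "2018-12-07 17:48:57,667 [MainThread ] [INFO ] Written ROUGE configuration to ...",
]


def parse_timestamp(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' stamp (first 23 characters)."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def stage_durations(lines):
    """Return the seconds elapsed between each pair of consecutive log lines."""
    stamps = [parse_timestamp(line) for line in lines]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(stamps, stamps[1:])]


durations = stage_durations(LOG_LINES)
for line, seconds in zip(LOG_LINES[1:], durations):
    # line[24:] skips the timestamp and the space that follows it.
    print(f"{seconds:7.3f} s before: {line[24:]}")
```

By this reading, the logged steps account for roughly 34 seconds in total. The last log entry is the "Running ROUGE with command" line, so the remaining minutes are presumably spent inside the ROUGE-1.5.5.pl call itself, although the logs alone do not confirm that.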

I tried running it from both PyCharm and the terminal, but that made no difference.

Is anyone else having the same issue?

Thanks!

ecly commented 5 years ago

Not an issue, afaik. For about 12k instances, I see similar evaluation times. If you want something faster (although the results aren't 100% identical), I recommend py-rouge.