bheinzerling / pyrouge

A Python wrapper for the ROUGE summarization evaluation package
MIT License
250 stars 71 forks

Changed input summary format to be SEE (Summary Evaluation Environment). #1

Closed (fbarrios closed this 9 years ago)

fbarrios commented 9 years ago

Hi! I've been working with pyrouge and found that the input format type in the configuration XML is always SPL. After a few days of work, I realized it should be SEE. The ROUGE package isn't well documented, but its verify.xml example file uses the SEE input type, and the files it references are formatted like pyrouge's output. The verify-spl.xml file, on the other hand, uses the SPL type and references plain text files with one sentence per line.
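
To make the difference between the two input types concrete, here is a minimal sketch. It assumes the SEE layout is the numbered-anchor HTML that pyrouge writes for each summary, and that the config declares the type through an `INPUT-FORMAT` element as in ROUGE's sample configs. The helper names (`to_see_html`, `to_spl_text`, `input_format_element`) are hypothetical and not part of pyrouge or ROUGE.

```python
from html import escape


def to_see_html(sentences, title="summary"):
    """Wrap pre-split sentences in SEE-style HTML (assumed layout)."""
    lines = [
        '<a name="{0}">[{0}]</a> <a href="#{0}" id={0}>{1}</a>'.format(i, escape(s))
        for i, s in enumerate(sentences, start=1)
    ]
    return (
        "<html>\n<head>\n<title>{}</title>\n</head>\n"
        '<body bgcolor="white">\n{}\n</body>\n</html>'
    ).format(escape(title), "\n".join(lines))


def to_spl_text(sentences):
    """SPL input: plain text with one sentence per line."""
    return "\n".join(sentences)


def input_format_element(format_type):
    """Build the INPUT-FORMAT element for a ROUGE config file.

    Only the two types discussed in this thread are accepted here.
    """
    if format_type not in ("SEE", "SPL"):
        raise ValueError("unsupported input format: {!r}".format(format_type))
    return '<INPUT-FORMAT TYPE="{}">\n</INPUT-FORMAT>'.format(format_type)


if __name__ == "__main__":
    sentences = ["The cat sat on the mat.", "It then fell asleep."]
    print(to_see_html(sentences, title="demo"))
    print(input_format_element("SEE"))
```

The point is that the declared type has to match the files on disk: HTML files like the first function's output go with `TYPE="SEE"`, while plain sentence-per-line files go with `TYPE="SPL"`.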

As a result, the evaluation score is much higher than it should be. A similar issue is described here: http://metaoptimize.com/qa/questions/9725/rouge-evaluation-settings-for-document-summarization (currently offline, but here is a Google cache link: http://webcache.googleusercontent.com/search?q=cache:zKvxjALPipAJ:metaoptimize.com/qa/questions/9725/rouge-evaluation-settings-for-document-summarization+&cd=1&hl=es&ct=clnk&gl=ar).

What do you think? Regards from Argentina.

bheinzerling commented 9 years ago

Thanks for the PR, looks good.