❓ Questions & Help

Just ran the `run_summarization.py` script with the parameters specified here, and the ROUGE scores are far off from what is reported in the related paper.
The ROUGE scores reported in the PreSumm paper (R1, R2, RL):

| Model | R1 | R2 | RL |
| --- | --- | --- | --- |
| BertSumExtAbs | 42.13 | 19.60 | 39.18 |
The ROUGE scores after running the HF script:

| Metric | F1 | Precision | Recall |
| --- | --- | --- | --- |
| ROUGE-1 | 0.275 | 0.299 | 0.260 |
| ROUGE-2 | 0.161 | 0.184 | 0.149 |
| ROUGE-L | 0.305 | 0.326 | 0.290 |
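Note that the script reports fractions in [0, 1] while the paper reports the same quantities multiplied by 100, so the R1 F1 above reads as 27.5 against the paper's 42.13, which is still far off. For reference, a minimal sketch of how F1/precision/recall on this scale are produced, assuming the `rouge_score` package; this is not necessarily the implementation the script uses internally, and the texts below are made up:

```python
# Minimal sketch, assuming the `rouge_score` package (pip install rouge-score).
# Not necessarily what run_summarization.py uses internally; it only shows how
# ROUGE F1/precision/recall on a 0-1 scale are computed.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)

# Hypothetical gold and generated summaries, for illustration only.
reference = "the cat sat on the mat"
candidate = "a cat was sitting on the mat"

for name, score in scorer.score(reference, candidate).items():
    # fmeasure/precision/recall are fractions in [0, 1]; the PreSumm paper
    # reports the same numbers multiplied by 100 (e.g. an F1 of .275 reads as 27.5).
    print(f"{name}: F1={score.fmeasure:.3f} P={score.precision:.3f} R={score.recall:.3f}")
```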
The README file seems to suggest that running the script as is, with all the stories in a single directory, will give ROUGE scores similar to those reported in the paper. That doesn't seem to be the case.
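For what it's worth, a quick sanity check on that single stories directory; the path below is hypothetical and the expected count assumes the full CNN/DailyMail story set:

```python
# Quick check that the single stories directory the README mentions is what
# the script will see. The path is hypothetical; adjust it to your setup.
from pathlib import Path

DATA_PATH = Path("data/cnn_dm_stories")  # hypothetical location of *.story files
stories = sorted(DATA_PATH.glob("*.story"))
print(f"{len(stories)} .story files found")  # the full CNN+DailyMail set is ~312k files
```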
Any ideas why? Or what I may be doing wrong here?
Thanks much!
FYI ... ran the script as in the README.