neulab / BARTScore

BARTScore: Evaluating Generated Text as Text Generation

Compute src hypo with rouge and bert #19

Closed · JohnGiorgi closed this 2 years ago

JohnGiorgi commented 2 years ago

This PR updates the score.py script of the SUM module so that both bert_score and rouge also produce scores when the source documents (src) are given as the targets and the machine-generated summaries as the predictions. The new scores are tagged as *_src_hyp and *_hypo_ref, following the conventions in the original BARTScore repo.
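For reference, here is a minimal sketch of the extra computation, assuming the bert_score and rouge packages; the example inputs and the score keys printed below (e.g. bert_score_f_src_hypo) are illustrative, not the actual names used in score.py:

```python
# Sketch of scoring hypotheses against the *source* documents.
# Assumes the `bert_score` and `rouge` packages; names are illustrative.
from bert_score import score as bert_score
from rouge import Rouge

srcs = ["The cat sat on the mat near the door."]
hypos = ["A cat was sitting on the mat."]

# BERTScore with the sources as the targets (references) and the
# generated summaries as the predictions (candidates).
P, R, F = bert_score(hypos, srcs, lang="en")
print({"bert_score_f_src_hypo": F.mean().item()})

# ROUGE in the same direction: hypotheses scored against the sources.
scores = Rouge().get_scores(hypos, srcs, avg=True)
print({"rouge1_f_src_hypo": scores["rouge-1"]["f"]})
```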

One interesting note: bert_score produces the same value regardless of the order of its inputs (i.e., (src, hypo) and (hypo, src) yield the same score, since swapping the inputs swaps its precision and recall and leaves F1 unchanged), but rouge does not.
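A quick, self-contained way to check this observation, under the same package assumptions as the sketch above:

```python
# Sketch of the symmetry check; assumes the bert_score and rouge packages.
from bert_score import score as bert_score
from rouge import Rouge

srcs = ["The cat sat on the mat near the door."]
hypos = ["A cat was sitting on the mat."]

# BERTScore: swapping candidates and references swaps precision and
# recall, so the F1 score is (numerically) unchanged.
_, _, f_ab = bert_score(hypos, srcs, lang="en")
_, _, f_ba = bert_score(srcs, hypos, lang="en")
assert abs(f_ab.mean().item() - f_ba.mean().item()) < 1e-6

# ROUGE: the per-direction scores depend on which side is treated as
# the hypothesis, so the two orderings generally differ.
rouge = Rouge()
print(rouge.get_scores(hypos, srcs, avg=True)["rouge-1"]["r"])
print(rouge.get_scores(srcs, hypos, avg=True)["rouge-1"]["r"])
```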