huggingface / datasets

🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
https://huggingface.co/docs/datasets
Apache License 2.0

❓ How to get ROUGE-2 with the ROUGE metric? #216

Closed astariul closed 4 years ago

astariul commented 4 years ago

I'm trying to use the ROUGE metric, but I don't know how to get the ROUGE-2 score.


I compute the scores with:

import nlp

rouge = nlp.load_metric('rouge')
with open("pred.txt") as p, open("ref.txt") as g:
    for lp, lg in zip(p, g):
        rouge.add([lp], [lg])
score = rouge.compute()

Then (printing only the F-score for readability):

for k, s in score.items():
    print(k, s.mid.fmeasure)

It gives :

rouge1 0.7915168355671788
rougeL 0.7915168355671788


How can I get the ROUGE-2 score?

Also, it seems weird that the ROUGE-1 and ROUGE-L scores are the same. Did I make a mistake?

@lhoestq

lhoestq commented 4 years ago

ROUGE-1 and ROUGE-L shouldn't return the same thing. This is weird.

lhoestq commented 4 years ago

For the rouge2 metric you can do:

import nlp

rouge = nlp.load_metric('rouge')
with open("pred.txt") as p, open("ref.txt") as g:
    for lp, lg in zip(p, g):
        rouge.add(lp, lg)
score = rouge.compute(rouge_types=["rouge2"])

Note that I just did a PR to have both .add and .add_batch for metrics; that's why this is now rouge.add(lp, lg) and not rouge.add([lp], [lg]).
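
Putting both snippets together, here is a minimal sketch, assuming the same pred.txt / ref.txt files and the new .add(prediction, reference) signature, that requests ROUGE-1, ROUGE-2 and ROUGE-L in a single compute call:

import nlp

# Load the ROUGE metric
rouge = nlp.load_metric('rouge')

# Feed one prediction/reference pair per line
with open("pred.txt") as p, open("ref.txt") as g:
    for lp, lg in zip(p, g):
        rouge.add(lp, lg)

# Ask for all three variants explicitly and print the mid F-measure,
# as in the snippet above
score = rouge.compute(rouge_types=["rouge1", "rouge2", "rougeL"])
for k, s in score.items():
    print(k, s.mid.fmeasure)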

lhoestq commented 4 years ago

Well, I just tested with the official script, and both rouge1 and rougeL return exactly the same thing for the input you gave, so this is actually fine ^^
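
To see why that can happen: ROUGE-L scores the longest common subsequence, so when the overlapping words already appear in the same order in the prediction and the reference, the LCS is exactly the set of matching unigrams and rouge1 and rougeL end up with the same F-measure. A small illustration using Google's rouge_score package directly (assuming it is installed; the sentences are made up):

from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rougeL"], use_stemmer=False)

# The overlapping words occur in the same order in both sentences,
# so the longest common subsequence is exactly the matching unigrams.
reference = "the cat sat on the mat"
prediction = "the cat lay on the mat"
scores = scorer.score(reference, prediction)

print(scores["rouge1"].fmeasure)  # ≈ 0.833
print(scores["rougeL"].fmeasure)  # same value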

I hope it helped :)