Luohh5 / Chain-of-Exemplar


Question about the metrics #1

Open · Gary-code opened 5 days ago

Gary-code commented 5 days ago

Congratulations on having your paper accepted to ACL 2024! I have a question regarding the evaluation metrics in your work. I noticed that the BLEU-4 scores reported for all models are quite high. I was curious to know which script or tool you used for evaluation.

Luohh5 commented 3 days ago

`scoring.py` has the details you want about the evaluation metrics 😁

The script for BLEU-4 evaluation:

```python
import evaluate

# hyp: a single predicted string; ref: the corresponding reference string
bleu = evaluate.load('evaluate/metrics/bleu')
bleu4 = bleu.compute(predictions=[hyp], references=[ref])['bleu']
```
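
For anyone else landing here, a minimal self-contained sketch of the same call. It loads the metric by name from the Hugging Face Hub rather than the local path used above (which appears to be a local copy of the same metric), and the `hyp`/`ref` strings are made-up examples, not from the paper:

```python
import evaluate

# Hypothetical prediction/reference pair, for illustration only.
hyp = "a man is playing a guitar on stage"
ref = "a man plays the guitar on a stage"

# Load the BLEU metric by name from the Hugging Face Hub.
bleu = evaluate.load('bleu')

# compute() defaults to max_order=4, so the 'bleu' key is the BLEU-4 score
# on a 0-1 scale; multiply by 100 if scores are reported as percentages.
result = bleu.compute(predictions=[hyp], references=[[ref]])
print(result['bleu'])
```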