Y-IAB/lm-evaluation-harness
A framework for few-shot evaluation of language models.
https://www.eleuther.ai
MIT License
Add Bartscore metric for summarization and fix LLM eval
#8
Closed
myeongho-jeong-yanolja closed 6 months ago
myeongho-jeong-yanolja commented 6 months ago
Add a BARTScore metric. This metric computes the generation probability and uses (-1 x loss) as the score.
For the summarization scenario, I calculate the score between the source text and the summary text, so the metric is named BARTScore-src.
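
The (-1 x loss) scoring can be illustrated without loading a model. The snippet below is only a minimal sketch, not the code in this PR: the function name `neg_loss_score` and the input probabilities are hypothetical, and it assumes the loss is the cross-entropy averaged over the summary tokens, which the description does not state explicitly.

```python
import math
from typing import Sequence


def neg_loss_score(token_probs: Sequence[float]) -> float:
    """Return -1 x (mean token-level cross-entropy loss).

    token_probs[i] is a hypothetical probability that a seq2seq model assigns
    to the i-th summary token given the source text and the preceding summary
    tokens. Averaging over tokens is an assumption of this sketch.
    """
    if not token_probs:
        raise ValueError("token_probs must be non-empty")
    # Cross-entropy loss averaged over the summary tokens.
    loss = -sum(math.log(p) for p in token_probs) / len(token_probs)
    # The score is the negated loss, so it is always <= 0.
    return -loss


# Made-up probabilities: a summary the model finds likely vs. unlikely.
likely = neg_loss_score([0.9, 0.8, 0.95])
unlikely = neg_loss_score([0.2, 0.1, 0.3])
print(likely > unlikely)
```

Because the score is a negated loss, values closer to 0 mean the summary is more probable given the source, and scores are only comparable when computed with the same model and tokenizer.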
For uptrain>0.5.0, there is an issue with evaluating Korean text, so I rolled back to 0.5.0 and refined the score calculation to match its output.