A question about the evaluation of CrowS-Pairs

RUCAIBox / LLMSurvey

The official GitHub page for the survey paper "A Survey of Large Language Models".

9.64k stars 745 forks source link

Hello! I am a fresh man in the field of LLMs. I am reading your code and I have a question about the evaluation of CrowS-Pairs. In https://github.com/RUCAIBox/LLMSurvey/blob/4c324d19683901f0fc2c5eb46468baba390f1787/Experiments/HumanAlignment/metric/cal_crows_res.py#L18 why it is '<' instead of '>'? I think the model prefers a sentence with a smaller perplexity. The smaller is the perplexity, the more tendency have the model to output the sentence. So I think it's correct that acc = 1 when sent_more_ppl_score > sent_less_ppl_score. I don't know if I‘m right .Could you explain it to me? Thank you very much!

By the way, I am a prospective graduate student of RUC and I am going to enter Gaoling next year!

RUCAIBox / LLMSurvey

A question about the evaluation of CrowS-Pairs #67