tianyi-lab / Cherry_LLM

[NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other models

About the Direct Answer Score sθ(A) #17

Closed DryPilgrim closed 9 months ago

DryPilgrim commented 9 months ago

I'd like to ask the following question. Thank you very much in advance for your answer :)

Why does a higher DAS mean the sample is more challenging for the model? Doesn't a higher DAS indicate that the model assigns a higher probability to the answer, i.e., that it has already mastered it? (from paper: A higher direct answer score may suggest that the answer is inherently more challenging or intricate for the model to generate.)

My understanding of the autoregressive computation of DAS:

对于数据:{"instruction": "what do you like to eat?", "answer": "I like eating apples."}
DAS is meant to measure how difficult the answer itself is for the model to generate; its autoregressive computation conditions on the growing answer prefix:
I
I like
I like eating
I like eating apples.
MingLiiii commented 9 months ago

Thank you so much for your interest in our work! Sorry for the confusion: there should be minus signs in the DAS and CAS equations. With the minus sign, the logic is the same as loss or perplexity, so a higher score means a higher loss on the answer. Sorry for the typos; we cannot modify our manuscript yet due to the anonymity period 😂😂
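
Concretely, with the minus sign restored both scores are just the average token-level cross-entropy (negative log-likelihood) over the answer tokens $w_1^A, \ldots, w_N^A$:

$$
s_\theta(A) = -\frac{1}{N}\sum_{i=1}^{N}\log P\!\left(w_i^{A}\mid w_1^{A},\ldots,w_{i-1}^{A};\theta\right)
$$

$$
s_\theta(A\mid Q) = -\frac{1}{N}\sum_{i=1}^{N}\log P\!\left(w_i^{A}\mid Q,\, w_1^{A},\ldots,w_{i-1}^{A};\theta\right)
$$

A higher $s_\theta(A)$ therefore means a higher loss, i.e., the answer is harder for the model to generate on its own, which is what the sentence you quoted from the paper refers to.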

Please refer to #7 and #4.
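
As a rough sketch only (not necessarily the exact code in this repository; the model name is a placeholder and the two helper functions below are illustrative), both scores can be computed with Hugging Face transformers by taking the mean answer-token loss:

```python
# Sketch: DAS / CAS as mean per-token cross-entropy of the answer.
# "gpt2" is just a placeholder model; any causal LM works the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def direct_answer_score(answer: str) -> float:
    """s_theta(A): loss on the answer tokens given only the preceding answer tokens."""
    ids = tokenizer(answer, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)  # labels=ids -> mean token-level cross-entropy
    return out.loss.item()

def conditioned_answer_score(question: str, answer: str) -> float:
    """s_theta(A|Q): loss on the answer tokens given the question and preceding answer tokens."""
    q_len = tokenizer(question, return_tensors="pt").input_ids.shape[1]
    ids = tokenizer(question + answer, return_tensors="pt").input_ids
    labels = ids.clone()
    labels[:, :q_len] = -100  # ignore the question tokens in the loss
    # (the question/answer token boundary is approximate in this sketch)
    with torch.no_grad():
        out = model(ids, labels=labels)
    return out.loss.item()

das = direct_answer_score("I like eating apples.")
cas = conditioned_answer_score("What do you like to eat? ", "I like eating apples.")
print(das, cas, cas / das)  # the ratio CAS / DAS is the IFD score
```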

DryPilgrim commented 9 months ago

Thanks for your reply :-)