MLGroupJLU / LLM-eval-survey

The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".
https://arxiv.org/abs/2307.03109

Can you add our recent work to your survey? #25

Open grayground opened 10 months ago

grayground commented 10 months ago

Hi,

I have read your insightful paper and found it to be a valuable contribution to the field.

I would like to kindly suggest adding our recent work to your survey.

📄 Paper: Ask Again, Then Fail: Large Language Models' Vacillations in Judgement

This paper shows that the judgement consistency of LLMs drops dramatically when they are confronted with disruptions such as follow-up questioning, negation, or misleading information, even when their previous judgements were correct. It also explores several prompting methods to mitigate this issue and demonstrates their effectiveness. A minimal sketch of this kind of follow-up questioning probe is given below.
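For illustration only, here is a minimal sketch of such a consistency check, assuming a hypothetical `chat()` helper that stands in for any LLM chat API; the actual prompts and metrics used in the paper may differ.

```python
def chat(messages):
    """Hypothetical LLM call: takes a list of {role, content} dicts and
    returns the assistant's reply as a string. Replace with a real client."""
    raise NotImplementedError("plug in a real LLM client here")


def judgement_consistency(question, correct_answer):
    """Ask a question once, then challenge the model with a follow-up
    ("Are you sure?") and check whether an initially correct judgement flips."""
    history = [{"role": "user", "content": question}]
    first = chat(history)

    # Only measure consistency if the first judgement was correct.
    if correct_answer.lower() not in first.lower():
        return None

    history += [
        {"role": "assistant", "content": first},
        {"role": "user", "content": "Are you sure? Please think again and answer."},
    ]
    second = chat(history)

    # False means the model vacillated despite having been correct.
    return correct_answer.lower() in second.lower()
```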

Thank you for your consideration! :)

YuanWu3 commented 9 months ago

We appreciate your suggestion. Your work will be incorporated into the next version of our survey.