request for more details

krystalan / chatgpt_as_nlg_evaluator

Technical Report: Is ChatGPT a Good NLG Evaluator? A Preliminary Study

41 stars 1 forks source link

Hi, Sorry for the really late reply.

The OpenAI ChatGPT did not release the official API when we did the experiments. Thus, there might be gaps when you reproduce the results using the official API.

Currently, I recommend setting the temperature to zero in official APIs and using the gpt-3.5-turbo model. Empirically, I find that when setting the temperature to zero, the gpt-3.5-turbo model will directly produce the final scores without any explanations. If you want to collect explanations, try to raise the temperature.

Please feel free to drop me emails for any other questions.

krystalan / chatgpt_as_nlg_evaluator

request for more details #1