nlpyang / geval

Code for paper "G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment"
MIT License
194 stars, 22 forks

How is the "Auto CoT" prompt defined? #8

Open calvdee opened 2 days ago

calvdee commented 2 days ago

G-Eval includes "Auto Chain-of-Thoughts for NLG Evaluation" as a component, where the CoT steps used to carry out the evaluation are produced by an LLM. However, neither the paper nor this repo includes the prompt definition. It would be convenient to have it available, both for reproducibility and for extending G-Eval to other criteria.

calvdee commented 2 days ago

On second read, it looks like the evaluation steps are produced via the completions API, which OpenAI now considers legacy. Even with the completions API, I am still unable to reproduce the coherence evaluation steps outlined in this prompt using either of the GPT-3.5 models:

[attached image]
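For anyone else trying to reproduce this, here is a minimal sketch of how an Auto CoT prompt could be assembled. It is a guess, not the authors' prompt: it assumes the prompt is just the task introduction plus the evaluation criteria, ending with a bare "Evaluation Steps:" header so that a completion-style model continues by writing the steps. The function name `build_auto_cot_prompt` and the example task and criteria wording are placeholders of mine, not text taken from the repo.

```python
def build_auto_cot_prompt(task_introduction: str, criteria: str) -> str:
    """Assemble a hypothetical Auto CoT prompt.

    Assumption: the prompt ends with an open "Evaluation Steps:" header,
    leaving the model to generate the steps as its completion.
    """
    return (
        f"{task_introduction.strip()}\n\n"
        f"Evaluation Criteria:\n\n{criteria.strip()}\n\n"
        "Evaluation Steps:"
    )


# Placeholder wording for illustration only; not copied from the repo.
task = (
    "You will be given one summary written for a news article. "
    "Your task is to rate the summary on one metric."
)
criteria = "Coherence (1-5): the collective quality of all sentences in the summary."

prompt = build_auto_cot_prompt(task, criteria)
print(prompt)
```

The resulting string would be sent as the `prompt` of a completions-style request. Sampling settings such as temperature are not specified here, so even with the right prompt the generated steps may differ between runs and between models.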