Instruction-Tuning-with-GPT-4 / GPT-4-LLM

Instruction Tuning with GPT-4
https://instruction-tuning-with-gpt-4.github.io/
Apache License 2.0

Question about the Reward Model #8

Closed liuyeah closed 1 year ago

liuyeah commented 1 year ago

In the paper, you say: "For each instance of the comparison data involving one prompt x and K responses, GPT-4 assigns a score s ∈ [1, 10] for each response."

However, I cannot find in this repository the prompt that asks GPT-4 to do the scoring. Could you release this prompt? It is important for researchers who want to follow your work.

pengbaolin commented 1 year ago

The prompt was omitted from the paper. We will release more materials later. In the meantime, here is the prompt we used:

Instruction

{instruction}

Response

{Response}

End of Response

We would like to request your feedback on the performance of AI assistant in response to the user question displayed above. Please rate the helpfulness, relevance, accuracy, level of details of their responses. Each assistant receives an overall score on a scale of 1 to 10, where a higher score indicates better overall performance. Please first output a single line containing value indicating the scores. In the subsequent line, please provide a comprehensive explanation of your evaluation, avoiding any potential bias.
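For anyone who wants to try this, here is a minimal sketch of how the template above could be filled in and how the score could be read back from the reply. It is not the authors' code. The function names `build_scoring_prompt` and `parse_score` are hypothetical, the exact whitespace between the template sections is an assumption, and the call to GPT-4 itself is left out. The parser assumes the model follows the instruction to put the score on the first line.

```python
import re
from typing import Optional


def build_scoring_prompt(instruction: str, response: str, rubric: str) -> str:
    """Fill the scoring template (hypothetical helper, not from the repo).

    `rubric` is the final paragraph of the prompt quoted above, passed in
    verbatim. The blank-line layout between sections is an assumption.
    """
    return (
        f"Instruction\n\n{instruction}\n\n"
        f"Response\n\n{response}\n\n"
        f"End of Response\n\n{rubric}"
    )


def parse_score(reply: str) -> Optional[float]:
    """Read the score from the first non-empty line of the model reply.

    Returns None if that line has no number or the number is outside
    the 1 to 10 scale, so bad replies can be filtered or retried.
    """
    for line in reply.splitlines():
        if line.strip():
            match = re.search(r"\d+(?:\.\d+)?", line)
            if match is None:
                return None
            score = float(match.group())
            return score if 1 <= score <= 10 else None
    return None
```

Returning None instead of raising makes it easy to drop or re-query replies that ignore the output format, which seems likely to happen occasionally with free-form model output.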