Evaluation prompt: a prompt that grades model output on an ABCDE scale

Taken from https://www.youtube.com/watch?v=fJ1PemlSVtY&list=PLiuLMb-dLdWKjX8ib9PhlCIx1jKMNxMpy&index=10

OpenAI's eval prompt: https://github.com/openai/evals/blob/main/evals/registry/modelgraded/fact.yaml

Screenshot

system_message = """\
You are an assistant that evaluates how well the custom answers a user question by comparing the response to th
Output a single letter and nothing else.
"""

user_message = f"""
You are comparing a submitted answer to an expert answer on

[BEGIN DATA]
*****
[Question]: {cust_msg}
[Expert]: {ideal}
*****
[Submission]: {completion}
*****
[END DATA]

Compare the factual content of the submitted answer with th
The submitted answer may either be a subset or superset
(A) The submitted answer is a subset of the expert answ
(B) The submitted answer is a superset of the expert an
(C) The submitted answer contains all the same details
(D) There is a disagreement between the submitted answe
(E) The answers differ, but these differences don't mat
choice_strings: ABCDE
"""
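The prompt above asks the grader for a single letter, so the calling code has to build the data section and then check that the reply really is one of the allowed choices. Below is a minimal sketch of both steps. The helper names `build_data_block` and `parse_grade` and the constant `VALID_CHOICES` are hypothetical and do not come from the video or from the OpenAI evals repository. The separator layout copies the screenshot as transcribed here, and no model call is made.

```python
from typing import Optional

# Assumption: mirrors `choice_strings: ABCDE` from the prompt above.
VALID_CHOICES = "ABCDE"


def build_data_block(cust_msg: str, ideal: str, completion: str) -> str:
    """Format the [BEGIN DATA] ... [END DATA] section of the user message.

    Hypothetical helper; the layout follows the transcribed screenshot.
    """
    lines = [
        "[BEGIN DATA]",
        "*****",
        f"[Question]: {cust_msg}",
        f"[Expert]: {ideal}",
        "*****",
        f"[Submission]: {completion}",
        "*****",
        "[END DATA]",
    ]
    return "\n".join(lines)


def parse_grade(response: str) -> Optional[str]:
    """Return the grade letter if the reply is a single letter A-E, else None.

    Tolerates surrounding whitespace, lower case, and wrappers such as "(B)"
    or "B.", but rejects anything longer than one letter.
    """
    letter = response.strip().strip("().").upper()
    # The length check matters: the empty string is "in" every string.
    if len(letter) == 1 and letter in VALID_CHOICES:
        return letter
    return None


if __name__ == "__main__":
    print(build_data_block("What TVs do you sell?", "Expert text", "Model text"))
    print(parse_grade(" b\n"))
```

Returning `None` for an unparseable reply, rather than guessing a letter, lets the caller decide whether to retry the grading call or count the sample as ungraded.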
