To ensure that the response "I don't know" does not receive a high relevance score when the prompt clearly states the favorite color, you need to adjust the evaluation criteria in the PromptResponseRelevance class. Specifically, modify the guideline that currently states "Answers that intentionally do not answer the question, such as 'I don't know', should also be counted as the most RELEVANT."
Here is the modified PromptResponseRelevance class:
class PromptResponseRelevance(Relevance, WithPrompt):
    system_prompt: ClassVar[PromptTemplate] = PromptTemplate.from_template(
        """You are a RELEVANCE grader; providing the relevance of the given RESPONSE to the given PROMPT.
Respond only as a number from 0 to 10 where 0 is the least relevant and 10 is the most relevant.
A few additional scoring guidelines:
- Long RESPONSES should score equally well as short RESPONSES.
- Answers that intentionally do not answer the question, such as 'I don't know' and model refusals, should be counted as the least RELEVANT.
- RESPONSE must be relevant to the entire PROMPT to get a score of 10.
- RELEVANCE score should increase as the RESPONSE provides RELEVANT context to more parts of the PROMPT.
- RESPONSE that is RELEVANT to none of the PROMPT should get a score of 0.
- RESPONSE that is RELEVANT to some of the PROMPT should get a score of 2, 3, or 4. Higher score indicates more RELEVANCE.
- RESPONSE that is RELEVANT to most of the PROMPT should get a score of 5, 6, 7, or 8. Higher score indicates more RELEVANCE.
- RESPONSE that is RELEVANT to the entire PROMPT should get a score of 9 or 10.
- RESPONSE that is RELEVANT and answers the entire PROMPT completely should get a score of 10.
- RESPONSE that is confidently FALSE should get a score of 0.
- RESPONSE that is only seemingly RELEVANT should get a score of 0.
- Never elaborate.
"""
    )

    user_prompt: ClassVar[PromptTemplate] = PromptTemplate.from_template(
        """PROMPT: {prompt}
RESPONSE: {response}
RELEVANCE: """
    )
This change ensures that responses like "I don't know" will receive a low relevance score when they do not address the prompt appropriately [1][2].
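If you would rather not modify the class in the library source, one minimal alternative sketch (assuming your trulens_eval version exposes the same guideline text via prompts.ANSWER_RELEVANCE_SYSTEM, as the custom-provider example later in this thread does, and that providers read it at call time) is to patch the shipped prompt string before building feedbacks:

from trulens_eval.feedback import prompts

# Sketch, not an official API: rewrite the shipped guideline in place so that
# refusals like "I don't know" are graded as least rather than most relevant.
# Verify the substring against your version's prompts.ANSWER_RELEVANCE_SYSTEM.
prompts.ANSWER_RELEVANCE_SYSTEM = prompts.ANSWER_RELEVANCE_SYSTEM.replace(
    "counted as the most RELEVANT", "counted as the least RELEVANT"
)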
This is my interpretation: as it stands today, relevance_with_cot_reasons mainly assesses how relevant an answer is to the user's query, so a model answering that it doesn't know the user's favorite color can (or maybe even should) be considered perfectly relevant, hence a 1.0 score.
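For illustration, here is what the stock behavior looks like (a minimal sketch; the exact score depends on the underlying model):

from trulens_eval.feedback.provider.openai import OpenAI

provider = OpenAI()

# Under the shipped guideline, a refusal can still be graded fully relevant.
score, reasons = provider.relevance_with_cot_reasons(
    prompt="What is my favorite color?",
    response="I don't know what your favorite color is.",
)
print(score)  # often 1.0 with the default prompt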
Let me know if this makes sense.
@loulblemo please check out custom feedback functions if you'd like different handling for your case.
For example, you could create a custom feedback function as follows, changing the handling of "I don't know" responses so they are scored as least relevant.
from typing import Dict, Tuple

from trulens_eval.feedback import prompts
from trulens_eval.feedback.provider.openai import OpenAI


class Custom_OpenAI(OpenAI):
    def relevance_with_cot_reasons_extreme(
        self, prompt: str, response: str
    ) -> Tuple[float, Dict]:
        """
        Tweaked version of answer relevance, extending the OpenAI provider.
        Completes a template to check the relevance of the response to the
        prompt. Updated to score "I don't know" and similar responses as
        having low relevance. Also uses chain-of-thought methodology and
        emits the reasons.

        Args:
            prompt (str): A prompt.
            response (str): A response to the prompt.

        Returns:
            float: A value between 0 and 1. 0 being "not relevant" and 1
                being "relevant".
        """
        # Update the handling of "I don't know": swap the shipped guideline
        # so refusals are graded as least rather than most relevant.
        system_prompt = prompts.ANSWER_RELEVANCE_SYSTEM.replace(
            "Answers that intentionally do not answer the question, such as 'I don't know', should also be counted as the most RELEVANT.",
            "Answers that intentionally do not answer the question, such as 'I don't know', should also be counted as the least RELEVANT."
        )
        user_prompt = str.format(
            prompts.ANSWER_RELEVANCE_USER, prompt=prompt, response=response
        )
        # Append the chain-of-thought template so the grader returns reasons.
        user_prompt = user_prompt.replace("RELEVANCE:", prompts.COT_REASONS_TEMPLATE)
        return self.generate_score_and_reasons(system_prompt, user_prompt)
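You can then register the custom method as a feedback function just like a built-in one. A minimal sketch (the Feedback wiring is standard trulens_eval usage; the variable names are illustrative):

from trulens_eval import Feedback

provider = Custom_OpenAI()

# Evaluate the tweaked relevance on the app's main input and output.
f_relevance = Feedback(
    provider.relevance_with_cot_reasons_extreme
).on_input_output()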
Discussed in https://github.com/truera/trulens/discussions/1182