Closed sadaf0714 closed 3 months ago
To resolve the `ValueError` and "Selector check failed" issues with TruLens and models from Hugging Face, follow these targeted steps:

- **Ensure Selector Targets Exist:** Confirm that the data or method you're trying to select with a selector is present and accessible in the app or record.
- **Use Correct Selector Syntax:** Follow the TruLens documentation for the correct syntax. Use predefined selectors like `on_input_output()` for simplicity.
- **Handle Empty Selectors Gracefully:** Use the `if_exists` parameter in your feedback definition to prevent errors when a selector fails to find its target.
- **Leverage Selector Shortcuts:** Simplify feedback definitions with shortcuts such as `on_input()` and `on_output()`, ensuring precise selection.
- **Adopt Model-Specific Selectors:** For specific frameworks, use the provided selectors that accurately extract parts of the model output.
- **Specify Behavior for Missing Parameters:** Use the `if_missing` parameter in feedback definitions to control behavior when a selector does not find its target.
- **Debug with Logging:** Increase the logging level for detailed output on selector operations and potential failures.
- **Utilize the Huggingface Provider:** For feedback functions calling Hugging Face APIs, use the `Huggingface` class for methods like `context_relevance`.
- **Validate Inputs:** Ensure inputs match expected types and constraints to avoid a `ValueError`.
For detailed guidance, refer to TruLens documentation on feedback schema, selector shortcuts, and the Huggingface provider. These resources provide comprehensive examples and best practices for effectively managing selectors and feedback functions with TruLens and Hugging Face models.
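As a rough illustration of what the error means (plain Python, not the TruLens API): a selector is essentially a path into the nested app/record structure, and the `ValueError` arises when some step of that path has nothing to point at. The `resolve` helper and the record layout below are made up for the sketch:

```python
# Toy illustration of how a JSON-path-style selector resolves against a
# nested record. This is NOT TruLens code -- just a sketch of the idea
# behind "Some selectors do not exist in the app or record."

def resolve(record: dict, path: str):
    """Walk a dotted path like 'retrieve.rets' through nested dicts."""
    node = record
    for step in path.split("."):
        if not isinstance(node, dict) or step not in node:
            raise ValueError(
                f"Selector step {step!r} does not exist in the record"
            )
        node = node[step]
    return node

record = {
    "input": "What is RAG?",
    "retrieve": {"args": {"query": "What is RAG?"}, "rets": ["chunk A", "chunk B"]},
    "output": "RAG is ...",
}

print(resolve(record, "retrieve.rets"))   # ['chunk A', 'chunk B']
try:
    resolve(record, "retriever.rets")     # wrong method name -> fails
except ValueError as e:
    print(e)
```

This is why answer relevance can succeed (it only needs `input`/`output`, which always exist) while context relevance fails: the latter's selector must match the exact name and call structure of the retrieval method in the instrumented app.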
I have checked and all of this is fine. Can you please suggest why it is not extracting the context relevance and groundedness metrics, while answer relevance comes through fine?
Any updates on this?
Hey @sadaf0714 - the issue here is that the context in your application's call structure is not in the place where you're defining it. Can you share more of your code so I can reproduce and help?
Sure, here's the code:

```python
tru = Tru()
tru.reset_database()
os.environ["HUGGINGFACE_API_KEY"] = hf_token

provider = LiteLLM(model_engine="huggingface/mistralai/Mistral-7B-Instruct-v0.2")

f_qa_relevance = Feedback(
    provider.relevance_with_cot_reasons, name="Answer Relevance"
).on_input_output()

f_context_relevance = (
    Feedback(
        provider.qs_relevance_with_cot_reasons,
        name="Context Relevance",
    )
    .on(Select.RecordCalls.retrieve.args.query)
    .on(Select.RecordCalls.retrieve.rets.collect())
    .aggregate(np.mean)
)

grounded = Groundedness(groundedness_provider=provider)
f_groundedness = (
    Feedback(
        grounded.groundedness_measure_with_cot_reasons,
        name="Groundedness",
    )
    .on(Select.RecordCalls.retrieve.rets.collect())
    .on_output()
    .aggregate(grounded.grounded_statements_aggregator)
)

tru_recorder = TruChain(
    qa,
    app_id="App_1",
    feedbacks=[f_qa_relevance, f_context_relevance, f_groundedness],
)

eval_ques = evaluationQuestions(ques_path)
for question in eval_ques:
    with tru_recorder as recording:
        qa.run(question)

records, feedback = tru.get_records_and_feedback(app_ids=[])
metrices = records[["input", "output"] + feedback]
```
Please have a look. @joshreini1
@sadaf0714 please share the RAG setup as well so I can reproduce.
please find the full notebook attached. https://github.com/sadaf0714/trulens/blob/main/TruLens_langchain.ipynb
@sadaf0714 can you give me access to your repo?
https://github.com/sadaf0714/trulens @joshreini1 please try with this
Hi @sadaf0714 - thanks for sharing your notebook.
Please try updating your selectors as follows:
```python
query = Select.Record.app.retriever._get_relevant_documents.args.query
context = Select.Record.app.retriever.get_relevant_documents.rets[:].page_content

f_context_relevance = (
    Feedback(
        provider.qs_relevance_with_cot_reasons,
        name="Context Relevance",
    )
    .on(query)
    .on(context)
    .aggregate(np.mean)
)

grounded = Groundedness(groundedness_provider=provider)

# Define a groundedness feedback function
f_groundedness = (
    Feedback(
        grounded.groundedness_measure_with_cot_reasons,
        name="Groundedness",
    )
    .on(context.collect())
    .on_output()
    .aggregate(grounded.grounded_statements_aggregator)
)
```
You can see where these selectors come from in the TruLens UI - see the screenshot below and let me know if you have questions on this. Thanks!
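For readers wondering why groundedness uses `context.collect()` while context relevance does not: conceptually, a `[:]` selector maps the feedback function over each retrieved chunk (one call per chunk, scores aggregated afterwards), while `.collect()` passes the whole list in a single call. The sketch below is plain Python with made-up scorer functions, not TruLens code:

```python
# Plain-Python sketch (hypothetical scorers) of `[:]`-style vs
# `.collect()`-style selection over retrieved context chunks.

def relevance(query: str, chunk: str) -> float:
    # Stand-in scorer: fraction of query words found in the chunk.
    words = query.lower().split()
    return sum(w in chunk.lower() for w in words) / len(words)

query = "what is retrieval augmented generation"
chunks = ["Retrieval augmented generation (RAG) ...", "Unrelated text"]

# `[:]`-style: one scorer call per chunk, then aggregate (np.mean in the
# real code; a plain average here to stay dependency-free).
per_chunk = [relevance(query, c) for c in chunks]
context_relevance = sum(per_chunk) / len(per_chunk)

# `.collect()`-style: one call over the whole list, as groundedness needs
# all of the context at once to check the answer against it.
def groundedness(all_chunks: list, answer: str) -> float:
    joined = " ".join(all_chunks).lower()
    words = answer.lower().split()
    return sum(w in joined for w in words) / len(words)

score = groundedness(chunks, "retrieval augmented generation")
```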
I am facing the same error. This is my code:
```python
from trulens_eval import Feedback, Select
from trulens_eval.feedback.provider.openai import AzureOpenAI
import numpy as np

provider = AzureOpenAI(
    deployment_name=azure_openai_chatgpt_deployment,
    api_version=azure_openai_api_version,
    azure_endpoint=azure_openai_endpoint,
    api_key=azure_openai_key,
)

f_groundedness = (
    Feedback(provider.groundedness_measure_with_cot_reasons, name="Groundedness")
    .on(Select.RecordCalls.retrieve.rets.collect())
    .on_output()
)

f_answer_relevance = (
    Feedback(provider.relevance_with_cot_reasons, name="Answer Relevance")
    .on_input()
    .on_output()
)

f_context_relevance = (
    Feedback(provider.context_relevance_with_cot_reasons, name="Context Relevance")
    .on_input()
    .on(Select.RecordCalls.retrieve.rets[:])
    .aggregate(np.mean)  # choose a different aggregation method if you wish
)

from trulens_eval import TruCustomApp
tru_rag = TruCustomApp(
    query_engine,
    app_id="RAG v1",
    feedbacks=[f_groundedness, f_answer_relevance, f_context_relevance],
)
```
I am using the official documentation for reference which is this: https://www.trulens.org/trulens_eval/getting_started/quickstarts/quickstart/#set-up-feedback-functions
I would like to raise an issue I am encountering while fetching the context relevance and groundedness feedback metrics with TruLens. I am evaluating Hugging Face's "meta-llama/Llama-2-7b-chat-hf" model (4-bit quantized) for RAG, with `LiteLLM(model_engine="huggingface/mistralai/Mistral-7B-Instruct-v0.1")` as the TruLens provider.
Below is the error I am getting while running this piece of code:

```python
tru_recorder = TruChain(
    qa,
    app_id="App_1",
    feedbacks=[f_qa_relevance, f_context_relevance, f_groundedness],
)
```

Error:

```
ValueError: Some selectors do not exist in the app or record.
```