Bug Description
I'm using Hugging Face as the provider to generate feedback for a RAG app recorded with TruLlama. Even though I'm calling `record.wait_for_feedback_results()`, I'm not seeing any feedback results for my RAG model's responses. I'm following the same code structure that worked for a plain LLM response, except in that case I used LLMChain instead of TruLlama.
To Reproduce
Here is my code:
```python
# Imports assume trulens >= 1.0; for the older trulens_eval package, adjust module paths.
import csv
from datetime import datetime

from trulens.core import Feedback, TruSession
from trulens.apps.llamaindex import TruLlama
from trulens.providers.huggingface import Huggingface

hugs = Huggingface()

# Start a TruSession
session = TruSession()
session.reset_database()

# Define evaluation metrics
f_pii_detection_input = Feedback(hugs.pii_detection).on_input()
f_pii_detection_output = Feedback(hugs.pii_detection).on_output()
f_toxicity = Feedback(hugs.toxic).on_input()
f_positive_sentiment = Feedback(hugs.positive_sentiment).on_output()

# query_engine is the LlamaIndex query engine built elsewhere
tru_query_engine_recorder = TruLlama(
    query_engine,
    app_name="LlamaIndex_App",
    app_version="base",
    feedbacks=[
        f_pii_detection_input,
        f_pii_detection_output,
        f_positive_sentiment,
        f_toxicity,
    ],
)

def interact_with_model(prompt_input):
    with tru_query_engine_recorder as recording:
        current_timestamp = datetime.now()
        llm_response = query_engine.query(prompt_input)

    # Get the record & extract feedback results
    record = recording.get()
    feedback_results_list = []
    for feedback, result in record.wait_for_feedback_results().items():
        feedback_results_list.append((feedback.name, result.result))
        print(feedback.name, result.result)

    # Feedback results, in the order the feedbacks were registered
    pii_input_detected = feedback_results_list[0][1]
    pii_output_detected = feedback_results_list[1][1]
    positive_sentiment = feedback_results_list[2][1]
    toxicity = feedback_results_list[3][1]

    with open('llm_responses_eval.csv', mode='a', newline='') as file:
        writer = csv.writer(file)
        writer.writerow([
            current_timestamp,
            prompt_input,
            llm_response,
            pii_input_detected,
            pii_output_detected,
            positive_sentiment,
            toxicity,
        ])

    return llm_response
```
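Independent of the missing-results bug, extracting feedback values by list position is fragile: it silently breaks if the evaluation order ever differs from registration order. A minimal stdlib sketch of a name-keyed lookup instead (the `(name, value)` pairs are hypothetical stand-ins for what the loop above prints; note that in the actual logs both PII feedbacks share the name `pii_detection`, so they would need distinct names before keying by name):

```python
# Hypothetical (name, value) pairs standing in for the feedback loop's output;
# None represents a feedback function that returned no result.
feedback_results_list = [
    ('pii_detection_input', None),
    ('pii_detection_output', 0.0),
    ('positive_sentiment', 0.9),
    ('toxic', 0.1),
]

# Key by feedback name so downstream CSV columns no longer depend on list order.
results_by_name = dict(feedback_results_list)

pii_input_detected = results_by_name.get('pii_detection_input')
positive_sentiment = results_by_name.get('positive_sentiment')
```

With this, adding or reordering feedback functions does not shift the meaning of any CSV column.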
Relevant Logs/Tracebacks
```
[('pii_detection', None), ('pii_detection', None), ('positive_sentiment', None), ('toxic', None)]
```
Environment:
Additional context
With the current setup, the same evaluation metrics work as expected on the plain LLM app.