truera / trulens

Evaluation and Tracking for LLM Experiments
https://www.trulens.org/
MIT License

[ISSUE] record.wait_for_feedback_results() with TruLlama not recording results #1638

Open paul-gleeson opened 1 week ago

paul-gleeson commented 1 week ago

Bug Description I'm using Hugging Face as the provider to generate feedback for a RAG model, with TruLlama as the feedback recorder. Even though I'm calling record.wait_for_feedback_results(), I'm not seeing any feedback results for my RAG model's responses. I'm following the same code structure that worked for a plain LLM response, except there I used LLMChain instead of TruLlama.

To Reproduce Here is my code:

# Imports (assuming the trulens >= 1.0 package layout)
import csv
from datetime import datetime

from trulens.core import Feedback, TruSession
from trulens.apps.llamaindex import TruLlama
from trulens.providers.huggingface import Huggingface

hugs = Huggingface()
# query_engine is a LlamaIndex query engine built elsewhere

# start a TruSession
session = TruSession()
session.reset_database()

# Define evaluation metrics
f_pii_detection_input = Feedback(hugs.pii_detection).on_input()
f_pii_detection_output = Feedback(hugs.pii_detection).on_output()
f_toxicity = Feedback(hugs.toxic).on_input()
f_positive_sentiment = Feedback(hugs.positive_sentiment).on_output()

tru_query_engine_recorder = TruLlama(
    query_engine,
    app_name="LlamaIndex_App",
    app_version="base",
    feedbacks=[f_pii_detection_input, f_pii_detection_output, f_positive_sentiment, f_toxicity],
)

def interact_with_model(prompt_input):
    with tru_query_engine_recorder as recording:
        current_timestamp = datetime.now()
        llm_response = query_engine.query(prompt_input)

        # Get the record &  extract feedback results
        record = recording.get()

        feedback_results_list = [] 

        for feedback, result in record.wait_for_feedback_results().items():
            feedback_results_list.append((feedback.name, result.result))
            print(feedback.name, result.result)

        # Extract feedback results positionally (assumes the order matches
        # the feedbacks list passed to TruLlama)
        pii_input_detected = feedback_results_list[0][1]
        pii_output_detected = feedback_results_list[1][1]
        positive_sentiment = feedback_results_list[2][1]
        toxicity = feedback_results_list[3][1]

        with open('llm_responses_eval.csv', mode='a', newline='') as file:
            writer = csv.writer(file)
            writer.writerow([current_timestamp, prompt_input, llm_response, pii_input_detected, pii_output_detected, positive_sentiment, toxicity])

    return llm_response
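As an aside on the extraction step above: indexing `feedback_results_list` by position is fragile, and in this setup both PII feedbacks share the same name (`pii_detection`, as the log output below shows), so even a name-keyed lookup would collide unless the feedbacks are given distinct names. A minimal sketch of a name-keyed extraction, using `SimpleNamespace` stand-ins for the TruLens feedback/result objects (the stand-ins and the distinct names are assumptions, not TruLens APIs):

```python
from types import SimpleNamespace

def feedback_results_to_row(pairs):
    """Map (feedback, result) pairs to a name-keyed dict.

    `pairs` mimics what record.wait_for_feedback_results().items()
    yields: each feedback has a .name and each result has a .result
    (None when the feedback never produced a score).
    """
    return {feedback.name: result.result for feedback, result in pairs}

# Stand-ins for TruLens Feedback/FeedbackResult objects (assumptions):
pairs = [
    (SimpleNamespace(name="pii_detection_input"), SimpleNamespace(result=0.1)),
    (SimpleNamespace(name="toxic"), SimpleNamespace(result=None)),
]
row = feedback_results_to_row(pairs)
# A missing or failed feedback stays visible as None instead of
# silently shifting every later column, as positional indexing would.
```

With distinct feedback names, the CSV row can then be assembled by key lookup rather than list position.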

Relevant Logs/Tracebacks

[('pii_detection', None), ('pii_detection', None), ('positive_sentiment', None), ('toxic', None)]

Environment:

paul-gleeson commented 2 days ago

Hi @sfc-gh-pdharmana, any idea on the above? Thank you!