truera / trulens

Evaluation and Tracking for LLM Experiments
https://www.trulens.org/
MIT License

[QUESTION][NEED HELP] trulens with chain.astream() #1487

Open tomdgr opened 2 hours ago

tomdgr commented 2 hours ago

Bug Description

When using the Answer Relevance COT feedback function (Groundedness and Context Relevance also fail) in a streaming setup with TruLens and LangChain's chain.astream(), the feedback evaluation crashes with KeyError: 'invoke'. This seems to occur because feedback_cot_function is not compatible with streamed, chunked responses: it expects a full input and output, but streaming delivers the response in chunks, so the feedback evaluation fails.

NB! I have made this work by using chain.invoke(); however, for my production environment I need to use astream(). I saw an old issue which solved the problem with acall, but that function seems to have been replaced by ainvoke. Link to the mentioned issue. Does anyone here have experience with trulens + langchain + the astream function?
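For concreteness, a minimal contrast of the two call styles under the recorder (same chain, recorder, and inputs as in the reproduction below):

# Works: a single invoke() call is recorded, so selectors resolving to
# app.middle[0].invoke.rets find their key in the record.
with tru_recorder as recording:
    result = chain.invoke({"conversation": conversation, "history": history})

# Fails at feedback time: the record of an astream() run has no `invoke`
# entry for those selectors to resolve, hence KeyError: 'invoke'.
with tru_recorder as recording:
    async for chunk in chain.astream({"conversation": conversation, "history": history}):
        ...  # chunks consumed here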

The error I get when using astream:

   raise KeyError(
KeyError: 'Key not in dictionary: invoke'

To Reproduce

import logging
import asyncio
import os
from operator import itemgetter
from typing import AsyncIterator, Dict, List

import numpy as np
from langchain.memory import ConversationBufferMemory
from langchain_core.messages.ai import AIMessageChunk
from langchain_core.runnables import Runnable, RunnableLambda, RunnableParallel

from trulens.providers.openai import AzureOpenAI
from trulens.apps.langchain import TruChain
from trulens.core import Select, Feedback, TruSession

# Conversation and Message are the pydantic models from app.datastructures
# (included at the end of this snippet); AZURE_OPENAI_VERSION comes from the
# app config.

conversation = Conversation(
    title="Ny Chat",
    id="conversation-1234",
    filename="sample_file.txt",
    indexName="myindex",
    promptType="expert",
    messages=[
        Message(role="user", content="What is the legal precedent for this case?")
    ]
)
# Your existing AzureOpenAI provider setup
openai_provider = AzureOpenAI(
    deployment_name="gpt4o",
    endpoint=os.environ['AZURE_OPENAI_ENDPOINT'],
    api_version=AZURE_OPENAI_VERSION
)

tru = TruSession()
tru.reset_database()
tru.start_dashboard()

# ------------------------------------------------------------------
# Answer Relevance
# ------------------------------------------------------------------

def feedback_cot_function(input_data, output_data):
    cot_relevance = openai_provider.relevance_with_cot_reasons(input_data, output_data)
    return cot_relevance

f_feedback_cot = Feedback(feedback_cot_function, name="Answer Relevance COT").on_input().on_output()

# ------------------------------------------------------------------
# Groundedness
# ------------------------------------------------------------------

def feedback_groundedness_function(context, response):
    groundedness = openai_provider.groundedness_measure_with_cot_reasons(context, response)
    return groundedness

f_groundedness = (
    Feedback(feedback_groundedness_function, name="Groundedness")
    .on(Select.Record.app.middle[0].rets)
    .on_output()
    .aggregate(np.mean)
)

# ------------------------------------------------------------------
# Context Relevance
# ------------------------------------------------------------------
f_context_relevance = (
    Feedback(openai_provider.context_relevance_with_cot_reasons, name="Context Relevance")
    .on_input()
    .on(Select.Record.app.middle[0].rets)
    .aggregate(np.mean)
)
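Note: hand-writing the selector path (Select.Record.app.middle[0].rets) ties the feedback to one particular record layout. TruLens also has a helper that derives the retriever selector from the chain itself; a minimal sketch, assuming TruLens can locate a retriever in this chain (untested with astream):

# Let TruLens derive the selector for the retriever's recorded output
# instead of hard-coding app.middle[0].rets.
# (Note: chain must already be constructed at this point.)
context = TruChain.select_context(chain)

f_groundedness_alt = (
    Feedback(feedback_groundedness_function, name="Groundedness (derived selector)")
    .on(context.collect())  # collect() gathers all retrieved chunks into one argument
    .on_output()
)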

# ------------------------------------------------------------------
# Main trigger
# ------------------------------------------------------------------

chain = await get_chain(conversation=conversation, prompt_type="RESPONSE")

tru_recorder = TruChain(
    app_name="testapp",
    app=chain,
    provider=openai_provider,
    feedbacks=[f_groundedness,
               f_feedback_cot,
               f_context_relevance]
)

memory = ConversationBufferMemory(memory_key="history", return_messages=False)

for message in conversation.messages:
    if message.role == "user":
        memory.chat_memory.add_user_message(message.content)
    elif message.role == "assistant":
        memory.chat_memory.add_ai_message(message.content)

history = memory.load_memory_variables({})["history"]

input_data = {
    "conversation": str(conversation.messages[-1].content) if conversation.messages else "",  
    "history": history
}

# ------------------------------------------------------------------
# Handling Streaming with astream
# ------------------------------------------------------------------
bot_reply=""
with tru_recorder as recording:
    # Stream the response as it's generated
    async for chunk in chain.astream({"conversation": conversation, "history": history}):
        # Check if the chunk contains a response
        if "response" in chunk:
            content = chunk["response"]

            # Ensure content is an AIMessageChunk or compatible structure
            if isinstance(content, AIMessageChunk):
                # Collect the full bot reply as chunks come in
                bot_reply += content.content

                # Yield or return the chunk content (for example, for Server-Sent Events)
            else:
                # Raise an error if the chunk format is not as expected
                raise Exception("Unexpected response format, please check the chain or start a new conversation.")

async def get_chain(conversation: Conversation, prompt_type: str = "RESPONSE") -> Runnable:
    """
    Asynchronously builds the language model chain used to stream generated responses.

    Args:
        conversation (Conversation): The current conversation context containing messages and other details.
        prompt_type (str): The type of prompt to use, either "RESPONSE" or another supported prompt type. Defaults to "RESPONSE".

    Returns:
        Runnable: The final chain object that can be used for streaming responses.
    """
    hyper_params = initialize_hyperparameters()
    retriever = initialize_retriever(
        hyper_params=hyper_params,
        conversation=conversation
    )
    prompt = construct_prompt(
        conversation=conversation,
        prompt_type=prompt_type,
        hyper_params=hyper_params
    )
    llm = initialize_llm(hyper_params=hyper_params)

    # context chain
    context_chain = RunnableLambda(lambda x: retriever.get_relevant_documents(x["conversation"]))

    # query chain
    query_chain = RunnableLambda(lambda x: create_query(x["conversation"]))

    # main chain
    final_chain = (
        RunnableParallel(
            {
                "context": context_chain,
                "query": query_chain,
                "history": itemgetter("history"),
            }
        )
        .assign(response=prompt | llm)
        .pick(["response", "context"])
    )

    return final_chain
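Because the chain ends with .pick(["response", "context"]), astream() yields partial dict chunks, one key at a time, roughly like this (illustrative shapes, not captured output):

# {"context": [Document(...), ...]}             # retriever branch, emitted once
# {"response": AIMessageChunk(content="The")}   # then one chunk per generated token

This is why the consuming loop above checks if "response" in chunk before accumulating.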

# retriever, llm, prompt and hyperparams are pretty standard and can be chosen arbitrarily.
# following is app.datastructures: 

from pydantic import BaseModel, Field
from typing import List, Optional

class FeedbackData(BaseModel):
    username: str
    feedback: str

class Message(BaseModel):
    role: str
    content: str

class Conversation(BaseModel):
    title: str
    id: str
    filename: str
    indexName: Optional[str] = None
    promptType: Optional[str] = None
    messages: List[Message]

Expected behavior

Feedback evaluation should complete successfully when the chain is consumed with astream(), just as it does with invoke().

Relevant Logs/Tracebacks

Trace from dashboard when looking at evaluations: Minified React error #31; visit https://reactjs.org/docs/error-decoder.html?invariant=31&args[]=object%20with%20keys%20%7B__tru_non_serialized_object%7D for the full message or use the non-minified dev environment for full errors and additional helpful warnings. (screenshot attached)

Starting dashboard ...
Config file already exists. Skipping writing process.
Credentials file already exists. Skipping writing process.
Dashboard already running at path:   Network URL: http://192.168.31.4:57998/

✅ In Answer Relevance COT, input input_data will be set to __record__.main_input or `Select.RecordInput` .
✅ In Answer Relevance COT, input output_data will be set to __record__.main_output or `Select.RecordOutput` .
✅ In Groundedness, input context will be set to __record__.app.middle[0].invoke.rets .
✅ In Groundedness, input response will be set to __record__.main_output or `Select.RecordOutput` .
✅ In Context Relevance, input question will be set to __record__.main_input or `Select.RecordInput` .
✅ In Context Relevance, input context will be set to __record__.app.middle[0].invoke.rets .
Exception in thread Thread-28 (_future_target_wrapper):
Traceback (most recent call last):
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\trulens\core\feedback\feedback.py", line 1134, in _extract_selection
    arg_vals[k] = list(result)
                  ^^^^^^^^^^^^
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\trulens\core\utils\serial.py", line 1046, in get
    for start_selection in start_items:
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\trulens\core\utils\serial.py", line 1047, in get
    yield from last_step.get(start_selection)
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\trulens\core\utils\serial.py", line 355, in get
    raise KeyError(
KeyError: 'Key not in dictionary: invoke'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "C:\Users\user\AppData\Roaming\Python\Python311\site-packages\ipykernel\ipkernel.py", line 761, in run_closure
    _threading_Thread_run(self)
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\trulens\core\utils\python.py", line 561, in _future_target_wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\trulens\core\app.py", line 632, in _manage_pending_feedback_results
    record.wait_for_feedback_results()
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\trulens\core\schema\record.py", line 205, in wait_for_feedback_results
    feedback_result = future_result.result(timeout=feedback_timeout)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\trulens\core\utils\python.py", line 561, in _future_target_wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\trulens\core\utils\threading.py", line 185, in _run_with_timeout
    raise e
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\trulens\core\utils\threading.py", line 170, in _run_with_timeout
    res: T = fut.result(timeout=timeout)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 456, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\trulens\core\schema\app.py", line 362, in run_and_call_callback
    temp = ffunc.run(app=app, record=record)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\trulens\core\feedback\feedback.py", line 852, in run
    raise e
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\trulens\core\feedback\feedback.py", line 833, in run
    input_combinations = list(
                         ^^^^^
  File "c:\Users\user\AppData\Local\Programs\Python\Python311\Lib\site-packages\trulens\core\feedback\feedback.py", line 1136, in _extract_selection
    raise InvalidSelector(
trulens.core.feedback.feedback.InvalidSelector: Selector __record__.app.middle[0].invoke.rets does not exist in source data.
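
When a selector such as __record__.app.middle[0].invoke.rets is reported as missing, TruLens can show what was actually recorded. A short sketch using two inspection helpers from the TruLens docs (the exact record layout under astream is the open question here):

# Print every component and method TruLens instrumented on the chain:
tru_recorder.print_instrumented()

# After one recorded run, print the record's call layout to see which
# method name (invoke / stream / astream ...) actually appears in the path:
record = recording.get()
print(record.layout_calls_as_app())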

Environment:

langchain 0.2.16
langchain-cli 0.0.31
langchain-community 0.2.17
langchain-core 0.2.40
langchain-openai 0.1.25
langchain-text-splitters 0.2.4

Additional context: Here is the production environment code, running in FastAPI:


"""
app/routes/chatbot.py
Maps to \AI-POC\src\components\Chatbot.vue

Has API calls which is relevant for communication with chatbot
"""

import azure.functions as func
from azurefunctions.extensions.http.fastapi import JSONResponse, StreamingResponse, Request
from fastapi import APIRouter, HTTPException, Query, Body
#from fastapi.responses import StreamingResponse
from ..datastructures import Conversation
from ..utils.azure import upload_message_to_blob
from ..utils.langchain_funcs import get_chain, generate_conversation_name_from_messages_no_streaming
from tenacity import (retry, retry_if_exception_type, stop_after_attempt,
                      wait_fixed)
from azure.core.exceptions import ResourceModifiedError
from langchain.memory import ConversationBufferMemory
from ..config import AZURE_OPENAI_ENDPOINT,AZURE_OPENAI_VERSION
import numpy as np
import datetime
import json
from langchain_core.messages.ai import AIMessageChunk
import asyncio

from trulens.providers.openai import AzureOpenAI
from trulens.apps.langchain import TruChain
from trulens.core import Select, Feedback, TruSession

chatbot_bp = func.Blueprint()

prefix = "/chatbot"
openai_provider = AzureOpenAI(
    deployment_name="gpt4o",
    endpoint=AZURE_OPENAI_ENDPOINT,
    api_version=AZURE_OPENAI_VERSION
)

tru_session = TruSession()

def generate_trulens_app_id():
    current_time = datetime.datetime.now()
    return f"RAG_Chain_pytest_{current_time.strftime('%Y%m%d_%H%M%S')}"

async def yield_chain_response_with_metrics(conversation, history, chain, tru_recorder):
    """
    Wrapper for streaming response with structured output and separate feedback metrics.
    """
    loop = asyncio.get_running_loop()
    original_task_factory = loop.get_task_factory()
    loop.set_task_factory(None)
    latest_message = conversation.messages[-1].content if conversation.messages else ""
    input_data = {
        "conversation": latest_message,
        "history": history
    }
    try:
        bot_reply = ""
        # Begin recording with Trulens
        with tru_recorder as recording:
            # Stream the response as it's generated
            async for chunk in chain.astream({"conversation": conversation, "history": history}):
                if "response" in chunk:
                    content = chunk["response"]

                    # Ensure content is an AIMessageChunk or compatible structure
                    if isinstance(content, AIMessageChunk):
                        bot_reply += content.content  # Collect full bot reply
                        yield content.content  # SSE formatted response chunk
                    else:
                        raise Exception("Something went wrong in yield chunk in chain response, try opening a new chat")

            # Finish recording and gather metrics after streaming is done
            df, feedback_columns_list = tru_session.get_records_and_feedback()

            # Extract the metrics you need (adjust column names as per your DataFrame)
            metrics_columns = ['Answer Relevance COT', 'Groundedness', 'Context Relevance']
            metrics_dict = df[metrics_columns].iloc[0].to_dict()

            # Send the metrics to the frontend
            metrics_json = json.dumps({'metrics': metrics_dict})
            print(metrics_json)
            yield f"{metrics_json}\n\n"

    except Exception as e:
        print("An Unexpected error happened: ", e)
    finally:
        loop.set_task_factory(original_task_factory)

# Frontendmapping: SendMessage()
@chatbot_bp.route(route=f"{prefix}/query_openai_stream", methods=[func.HttpMethod.POST])
async def query_openai_stream_endpoint(req: Request) -> StreamingResponse:
    try:
        # ----------------------------------------------------------
        # Feedbackfunctions used in TruLens
        # ----------------------------------------------------------

        def feedback_cot_function(input_data, output_data):
            cot_relevance = openai_provider.relevance_with_cot_reasons(input_data, output_data)
            return cot_relevance

        f_feedback_cot = Feedback(feedback_cot_function, name="Answer Relevance COT").on_input().on_output()

        def feedback_groundedness_function(context, response):
            groundedness = openai_provider.groundedness_measure_with_cot_reasons(context, response)
            return groundedness

        f_groundedness = (
            Feedback(feedback_groundedness_function, name="Groundedness")
            .on(Select.Record.app.middle[0].invoke.rets)
            .on_output()
            .aggregate(np.mean)
        )

        def feedback_context_relevance_function(question, context):
            context_relevance = openai_provider.context_relevance_with_cot_reasons(question, context)
            return context_relevance

        f_context_relevance = (
            Feedback(feedback_context_relevance_function, name="Context Relevance")
            .on_input()
            .on(Select.Record.app.middle[0].invoke.rets)
            .aggregate(np.mean)
        )

        # Parse the JSON body into a dictionary
        conversation_data = await req.json()
        # Create a Conversation instance from the parsed data
        conversation = Conversation(**conversation_data)

        # Temporary working history:
        memory = ConversationBufferMemory(memory_key="history", return_messages=False)
        for message in conversation.messages: # Populate memory with past conversation messages
            if message.role == "user":
                memory.chat_memory.add_user_message(message.content)
            elif message.role == "assistant":
                memory.chat_memory.add_ai_message(message.content)

        history = memory.load_memory_variables({})["history"]
        app_id_generated = generate_trulens_app_id()
        chain = await get_chain(conversation=conversation, prompt_type="RESPONSE")
        tru_recorder = TruChain(
            app=chain,
            provider=openai_provider,
            app_id=app_id_generated,
            feedbacks=[f_groundedness, f_feedback_cot, f_context_relevance]
        )
        # Wrap the chain and its inputs in a StreamingResponse for the frontend
        return StreamingResponse(yield_chain_response_with_metrics(conversation, history, chain, tru_recorder), media_type="text/event-stream")
    except Exception as e:
        print(e,"ERROR IN QOSE")
        raise HTTPException(status_code=500, detail=str(e))

The part involving df and the feedbacks in yield_chain_response_with_metrics is what I want to solve:

df, feedback_columns_list = tru_session.get_records_and_feedback()

# Extract the metrics you need (adjust column names as per your DataFrame)
metrics_columns = ['Answer Relevance COT', 'Groundedness', 'Context Relevance']
metrics_dict = df[metrics_columns].iloc[0].to_dict()

# Send the metrics to the frontend
metrics_json = json.dumps({'metrics': metrics_dict})
print(metrics_json)
yield f"{metrics_json}\n\n"
dosubot[bot] commented 2 hours ago

To resolve the KeyError: 'invoke' when using the Answer Relevance COT feedback function in a streaming setup with TruLens and Langchain's chain.astream() function, you need to ensure that the invoke method is correctly referenced and used. Here is a corrected version of your code snippet that should help resolve the issue:

import asyncio
import datetime
import json
import logging
import os
from operator import itemgetter
from typing import AsyncIterator, Dict, List, Optional

import numpy as np
from langchain.memory import ConversationBufferMemory
from langchain_core.messages.ai import AIMessageChunk
from langchain_core.runnables import Runnable, RunnableLambda, RunnableParallel
from pydantic import BaseModel, Field

from trulens.providers.openai import AzureOpenAI
from trulens.apps.langchain import TruChain
from trulens.core import Select, Feedback, TruSession

# The FastAPI / Azure Functions names used further down (func, Request,
# StreamingResponse, HTTPException, chatbot_bp, prefix, AZURE_OPENAI_VERSION,
# generate_trulens_app_id) come from the route module shown above.

class FeedbackData(BaseModel):
    username: str
    feedback: str

class Message(BaseModel):
    role: str
    content: str

class Conversation(BaseModel):
    title: str
    id: str
    filename: str
    indexName: Optional[str] = None
    promptType: Optional[str] = None
    messages: List[Message]

conversation = Conversation(
    title="New Conversation",
    id="conversation-1234",
    filename="sample_file.txt",
    indexName="livsvitenskap",
    promptType="legal expert",
    messages=[
        Message(role="user", content="What is the legal precedent for this case?")
    ]
)

openai_provider = AzureOpenAI(
    deployment_name="gpt4o",
    endpoint=os.environ['AZURE_OPENAI_ENDPOINT'],
    api_version=AZURE_OPENAI_VERSION
)

tru = TruSession()
tru.reset_database()
tru.start_dashboard()

def feedback_cot_function(input_data, output_data):
    cot_relevance = openai_provider.relevance_with_cot_reasons(input_data, output_data)
    return cot_relevance

f_feedback_cot = Feedback(feedback_cot_function, name="Answer Relevance COT").on_input().on_output()

def feedback_groundedness_function(context, response):
    groundedness = openai_provider.groundedness_measure_with_cot_reasons(context, response)
    return groundedness

f_groundedness = (
    Feedback(feedback_groundedness_function, name="Groundedness")
    .on(Select.Record.app.middle[0].rets)
    .on_output()
    .aggregate(np.mean)
)

f_context_relevance = (
    Feedback(openai_provider.context_relevance_with_cot_reasons, name="Context Relevance")
    .on_input()
    .on(Select.Record.app.middle[0].rets)
    .aggregate(np.mean)
)

async def get_chain(conversation: Conversation, prompt_type: str = "RESPONSE") -> Runnable:
    hyper_params = initialize_hyperparameters()
    retriever = initialize_retriever(hyper_params=hyper_params, conversation=conversation)
    prompt = construct_prompt(conversation=conversation, prompt_type=prompt_type, hyper_params=hyper_params)
    llm = initialize_llm(hyper_params=hyper_params)

    context_chain = RunnableLambda(lambda x: retriever.get_relevant_documents(x["conversation"]))
    query_chain = RunnableLambda(lambda x: create_query(x["conversation"]))

    final_chain = (
        RunnableParallel(
            {
                "context": context_chain,
                "query": query_chain,
                "history": itemgetter("history"),
            }
        )
        .assign(response=prompt | llm)
        .pick(["response", "context"])
    )

    return final_chain

chain = await get_chain(conversation=conversation, prompt_type="RESPONSE")

tru_recorder = TruChain(
    app_name="testapp",
    app=chain,
    provider=openai_provider,
    feedbacks=[f_groundedness, f_feedback_cot, f_context_relevance]
)

memory = ConversationBufferMemory(memory_key="history", return_messages=False)

for message in conversation.messages:
    if message.role == "user":
        memory.chat_memory.add_user_message(message.content)
    elif message.role == "assistant":
        memory.chat_memory.add_ai_message(message.content)

history = memory.load_memory_variables({})["history"]

input_data = {
    "conversation": str(conversation.messages[-1].content) if conversation.messages else "",  
    "history": history
}

bot_reply = ""
with tru_recorder as recording:
    async for chunk in chain.astream({"conversation": conversation, "history": history}):
        if "response" in chunk:
            content = chunk["response"]
            if isinstance(content, AIMessageChunk):
                bot_reply += content.content
            else:
                raise Exception("Unexpected response format, please check the chain or start a new conversation.")

async def yield_chain_response_with_metrics(conversation, history, chain, tru_recorder):
    loop = asyncio.get_running_loop()
    original_task_factory = loop.get_task_factory()
    loop.set_task_factory(None)
    latest_message = conversation.messages[-1].content if conversation.messages else ""
    input_data = {
        "conversation": latest_message,
        "history": history
    }
    try:
        bot_reply = ""
        with tru_recorder as recording:
            async for chunk in chain.astream({"conversation": conversation, "history": history}):
                if "response" in chunk:
                    content = chunk["response"]
                    if isinstance(content, AIMessageChunk):
                        bot_reply += content.content
                        yield content.content
                    else:
                        raise Exception("Something went wrong in yield chunk in chain response, try opening a new chat")

            df, feedback_columns_list = tru.get_records_and_feedback(app_ids=["testapp"])
            metrics_columns = ['Answer Relevance COT', 'Groundedness', 'Context Relevance']
            metrics_dict = df[metrics_columns].iloc[0].to_dict()
            metrics_json = json.dumps({'metrics': metrics_dict})
            print(metrics_json)
            yield f"{metrics_json}\n\n"
    except Exception as e:
        print("An Unexpected error happened: ", e)
    finally:
        loop.set_task_factory(original_task_factory)

@chatbot_bp.route(route=f"{prefix}/query_openai_stream", methods=[func.HttpMethod.POST])
async def query_openai_stream_endpoint(req: Request) -> StreamingResponse:
    try:
        conversation_data = await req.json()
        conversation = Conversation(**conversation_data)

        memory = ConversationBufferMemory(memory_key="history", return_messages=False)
        for message in conversation.messages:
            if message.role == "user":
                memory.chat_memory.add_user_message(message.content)
            elif message.role == "assistant":
                memory.chat_memory.add_ai_message(message.content)

        history = memory.load_memory_variables({})["history"]
        app_id_generated = generate_trulens_app_id()
        chain = await get_chain(conversation=conversation, prompt_type="RESPONSE")
        tru_recorder = TruChain(
            app=chain,
            provider=openai_provider,
            app_id=app_id_generated,
            feedbacks=[f_groundedness, f_feedback_cot, f_context_relevance]
        )
        return StreamingResponse(yield_chain_response_with_metrics(conversation, history, chain, tru_recorder), media_type="text/event-stream")
    except Exception as e:
        print(e, "ERROR IN QOSE")
        raise HTTPException(status_code=500, detail=str(e))

This code uses chain.astream() for the streaming setup and ensures that the invoke method is correctly referenced within the streaming context. It also uses the correct relevance feedback function for answer relevance, as shown in the LangChain Quickstart example [1][2][3].

To continue talking to Dosu, mention @dosu.

tomdgr commented 1 hour ago

Dosu's suggestion has been tried before, but it does not work either; I would appreciate human help.

sfc-gh-jreini commented 1 hour ago

Hi @tomdgr - thanks for the question. We have fixes coming for async and streaming in this PR.

Feel free to install it from the branch if you'd like to try it now; otherwise, we'll have it in the next release.