Chainlit / chainlit


Streaming nested LLMChain instantly instead of waiting for final answer #377

Closed: kay-es closed this 1 month ago

kay-es commented 1 year ago

Hey folks,

I'm currently implementing with LangChain and Chainlit. It's an agent whose tools are set with return_direct=True.

The streaming feature is activated and seems to work:

[screenshot]

My problem is that I don't want the agent to stream its decision about which tool to choose, but rather the output being generated by the tool itself, which sits inside nested LLM chains:

[screenshot]

Is it possible to stream the output of the tools directly instead of waiting for it to be passed back to the agent? The "return_direct" attribute is set to true anyway.

My current implementation of on_message looks like this:

import chainlit as cl

@cl.on_message
async def main(message: str):
    global agent
    global chain
    # Chainlit callback handler that streams the final answer token by token
    cb = cl.AsyncLangchainCallbackHandler(
        stream_final_answer=True,
        stream_prefix=False,
    )
    # Start streaming immediately instead of waiting for the final answer prefix
    cb.answer_reached = True
    res = await agent.arun(message, callbacks=[cb])
    # res = await chain.arun(message, callbacks=[cb])
    await cl.Message(content=res).send()

Calling the chain directly gives me the results the way I need them for the tool/chain, but not for the agent, as shown in the screenshots. Any thoughts or ideas? 🙂

willydouhard commented 1 year ago

This one is a bit tricky. LangChain final-answer streaming is very brittle, as LangChain (in my understanding) was not built around that concept to begin with.

What you could do is override cl.AsyncLangchainCallbackHandler with custom logic that triggers the final-answer streaming flag once you reach the specific tool you know has return_direct=True.
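
A minimal sketch of that idea, assuming the handler passes through LangChain's on_tool_start hook and that setting answer_reached is what enables final-answer streaming (as in the snippet above); the class name DirectToolCallbackHandler and the tool name "generate_answer" are placeholders for your own:

import chainlit as cl

class DirectToolCallbackHandler(cl.AsyncLangchainCallbackHandler):
    async def on_tool_start(self, serialized, input_str, **kwargs):
        # Once the return_direct tool starts, flip the flag so the tokens
        # produced by its nested LLM chain are streamed as the final answer.
        if serialized.get("name") == "generate_answer":
            self.answer_reached = True
        await super().on_tool_start(serialized, input_str, **kwargs)

You would then pass DirectToolCallbackHandler(stream_final_answer=True, stream_prefix=False) in callbacks instead of the stock handler, and drop the unconditional cb.answer_reached = True so streaming only kicks in when that tool runs.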