langchain-ai / chat-langchain

https://chat.langchain.com
MIT License

Can't display LangChain streaming response in Streamlit (Python app framework) #39

Open kittipatkampa opened 1 year ago

kittipatkampa commented 1 year ago

I'm trying to display the streaming output from the ChatGPT API in Streamlit (a Python web app framework). The goal is to make it feel like the bot is typing back to us. However, langchain.chat_models.ChatOpenAI does not return a generator that we can consume in Streamlit. Is there any way I can achieve this effect in Streamlit using LangChain?

This is what I tried with LangChain. The call llm_ChatOpenAI(messages) in the snippet below does stream in the background to the Python console (via the stdout callback), but not to the Streamlit UI.

import streamlit as st
from langchain.chat_models import ChatOpenAI
from langchain.callbacks.base import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.schema import HumanMessage

OPENAI_API_KEY = 'XXX'
model_name = "gpt-4-0314"
user_text = "Tell me about Seattle in 10 words."

llm_ChatOpenAI = ChatOpenAI(
    streaming=True,
    verbose=True,
    temperature=0.0,
    model=model_name,
    openai_api_key=OPENAI_API_KEY,
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
)

messages = [
    HumanMessage(content=user_text)
]

# [Doesn't work] Looping over the response for "streaming effect"
for resp in llm_ChatOpenAI(messages):
    st.write(resp) 
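A hedged note on why this loop can't stream (behavior as of the LangChain versions of that era): calling the model directly blocks until the full reply arrives and returns a single AIMessage, not a token generator, so there is nothing incremental to iterate over:

# Sketch, assuming a ~0.0.x-era LangChain where __call__ returns an AIMessage
response = llm_ChatOpenAI(messages)  # blocks until the entire completion is back
print(type(response))                # <class 'langchain.schema.AIMessage'>
print(response.content)              # the whole text at once -- no tokens to loop over

(Separately, langchain.llms.OpenAI wraps the legacy completions endpoint, which gpt-4 models do not expose; that is likely why its stream() fails for gpt-4, as noted in the full script further below.)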

Since the code above does not work, I have to fall back to the plain vanilla openai client, whose streaming call returns a generator, without the LangChain wrapper. The code below does display the streaming effect in Streamlit:

import openai
openai.api_key = OPENAI_API_KEY
llm_direct = openai.ChatCompletion.create(
    model=model_name,
    messages=[{"role": "user", "content": user_text}],
    temperature=0.0,
    max_tokens=50,
    stream=True,
)

tokens = []
# Works well -- looping over the response for a "streaming effect"
for resp in llm_direct:
    if resp.get("choices") and resp["choices"][0].get("delta") and resp["choices"][0]["delta"].get("content"):
        tokens.append(resp["choices"][0]["delta"]["content"])
        result = "".join(tokens)
        st.write(result)
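One Streamlit detail worth noting here: st.write(result) inside the loop adds a new element on every iteration, so the partial results stack up down the page. To overwrite a single spot in place, write into an st.empty() placeholder instead (as the full script below does). A minimal sketch:

import time
import streamlit as st

res_box = st.empty()  # a single placeholder; each write replaces its content
text = ""
for chunk in ["Seattle", " is", " a", " coastal", " city..."]:
    text += chunk
    res_box.write(text)  # overwrites in place, giving the "typing" effect
    time.sleep(0.1)      # simulated token delay for the demo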

I've attached the full Streamlit script below so you can run it and experiment:

OPENAI_API_KEY = 'YOUR API KEY'
user_text = "Tell me about Seattle in 10 words."

def render():
    import streamlit as st
    st.set_page_config(layout="wide")
    st.write("Welcome to GPT Applications!")

    model_name = st.radio("Choose a model", ["text-davinci-003", "gpt-4-0314", "gpt-3.5-turbo"])
    api_choice = st.radio("Choose an API", ["openai.ChatCompletion", "langchain.llms.OpenAI", "langchain.chat_models.ChatOpenAI"])
    res_box = st.empty()

    if st.button("Run", type='primary'):
        if api_choice == "openai.ChatCompletion":
            import openai
            openai.api_key = OPENAI_API_KEY
            llm_direct = openai.ChatCompletion.create(
                model=model_name,
                messages=[{"role": "user", "content": user_text}],
                temperature=0.0,
                max_tokens=50,
                stream=True,
            )

            tokens = []
            # Works well -- looping over the response for a "streaming effect"
            for resp in llm_direct:
                if resp.get("choices") and resp["choices"][0].get("delta") and resp["choices"][0]["delta"].get("content"):
                    tokens.append(resp["choices"][0]["delta"]["content"])
                    result = "".join(tokens)
                    res_box.write(result)

        elif api_choice == "langchain.llms.OpenAI":
            from langchain.llms import OpenAI

            llm_OpenAI = OpenAI(
                streaming=True,
                verbose=True,
                temperature=0.0,
                model_name=model_name,
                openai_api_key=OPENAI_API_KEY,
            )

            # Returns a generator; however, it doesn't work with model_name='gpt-4'
            response = llm_OpenAI.stream(user_text)
            tokens = []
            for resp in response:
                tokens.append(resp["choices"][0]["text"])
                result = "".join(tokens)
                res_box.write(result)

        elif api_choice == "langchain.chat_models.ChatOpenAI":
            from langchain.chat_models import ChatOpenAI
            from langchain.callbacks.base import CallbackManager
            from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
            from langchain.schema import HumanMessage

            llm_ChatOpenAI = ChatOpenAI(
                streaming=True,
                verbose=True,
                temperature=0.0,
                model=model_name,
                openai_api_key=OPENAI_API_KEY,
                callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
            )

            messages = [
                HumanMessage(content=user_text)
            ]

            # [Doesn't work] Looping over the response for a "streaming effect"
            for resp in llm_ChatOpenAI(messages):
                res_box.write(resp)

if __name__ == '__main__':
    render()

goldengrape commented 1 year ago

I solved it!

from langchain.callbacks.base import BaseCallbackHandler
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage
import streamlit as st

class StreamHandler(BaseCallbackHandler):
    def __init__(self, container, initial_text=""):
        self.container = container
        self.text = initial_text

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        # "/" is just a marker to make the token boundaries visible;
        # you don't need it
        self.text += token + "/"
        self.container.markdown(self.text)

query = st.text_input("input your query", value="Tell me a joke")
ask_button = st.button("ask")

st.markdown("### streaming box")
# here is the key: set up an empty container first
chat_box = st.empty()
stream_handler = StreamHandler(chat_box)
chat = ChatOpenAI(max_tokens=25, streaming=True, callbacks=[stream_handler])

st.markdown("### together box")  

if query and ask_button: 
    response = chat([HumanMessage(content=query)])    
    llm_response = response.content  
    st.markdown(llm_response)

Make a custom handler, pass a Streamlit container to it, and write markdown inside that container.

schwarbf commented 1 year ago

Could you make this work in streamlit-chat?

goldengrape commented 1 year ago

@schwarbf

This is still a bit difficult for me; I am still learning. I am more used to this display layout:

  1. human input box
  2. AI current response box
  3. All chat logs in chronological order

You can see the demo: https://advisors-alliance.streamlit.app/ (It is still only in Chinese; I will add support for other languages later. After all, it is not difficult to support multiple languages with the help of ChatGPT.)

BTW, streamed text-to-speech is also supported now: https://gist.github.com/goldengrape/84ce3624fd5be8bc14f9117c3e6ef81a It's a real pain to have to automate the voice on Streamlit Cloud.

sfc-gh-jcarroll commented 1 year ago

This came up on Twitter; here's an example using the new Streamlit native chat UI with the custom handler provided above:

https://langchain-streaming-example.streamlit.app/

https://github.com/langchain-ai/streamlit-agent/blob/main/streamlit_agent/basic_streaming.py

We just shipped a native integration for LangChain agents; it won't yet support this basic use case, but we will look at how to add it. Hope that helps! https://python.langchain.com/docs/modules/callbacks/integrations/streamlit
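For readers who don't want to click through, here is a condensed sketch of the pattern in the linked basic_streaming.py (not verbatim; it assumes Streamlit >= 1.24 for st.chat_input/st.chat_message, and reuses the StreamHandler idea from above):

import streamlit as st
from langchain.callbacks.base import BaseCallbackHandler
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

class StreamHandler(BaseCallbackHandler):
    def __init__(self, container):
        self.container = container
        self.text = ""

    def on_llm_new_token(self, token: str, **kwargs) -> None:
        self.text += token
        self.container.markdown(self.text)

if prompt := st.chat_input("Ask a question"):
    with st.chat_message("user"):
        st.write(prompt)
    with st.chat_message("assistant"):
        # the handler streams tokens into a placeholder inside the chat bubble
        handler = StreamHandler(st.empty())
        llm = ChatOpenAI(streaming=True, callbacks=[handler])
        llm([HumanMessage(content=prompt)])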

jmtatsch commented 1 year ago

@sfc-gh-jcarroll Great additions. It would be awesome if RetrievalQA and ConversationalRetrievalChain became compatible as well.

sfc-gh-jcarroll commented 11 months ago

Additional examples have been added to https://github.com/langchain-ai/streamlit-agent, including a retrieval chain (document upload), a SQL agent, and a Pandas agent. Enjoy!
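For the RetrievalQA case specifically, a hedged sketch of the general pattern (it assumes you already have a retriever, e.g. from vectorstore.as_retriever(), and reuses the StreamHandler defined earlier in this thread; the linked repo has the full, working version):

import streamlit as st
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

# `retriever` and `StreamHandler` are assumed to be defined as discussed above
chat_box = st.empty()
llm = ChatOpenAI(streaming=True, callbacks=[StreamHandler(chat_box)])
qa = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)
qa.run("What does the uploaded document say about pricing?")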

hikmet-demir commented 11 months ago

@sfc-gh-jcarroll God bless, guys! I had been cracking my head for over two hours trying to stream over Streamlit; I used the retrieval chain example and it solved the issue!

sfc-gh-jesmith commented 4 months ago

Hey all,

Wanted to share that there’s a new command, st.write_stream, out in the latest 1.31.0 release to conveniently handle generators and streamed responses for your chat apps. Check it out in the docs. 🤩
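A minimal sketch of the new command: st.write_stream accepts any generator of strings, renders the chunks as they arrive, and returns the full concatenated text. (To adapt a LangChain chat model, one assumed approach is to feed it a generator that yields chunk.content from model.stream(...).)

import time
import streamlit as st

def token_gen():
    # stand-in for a real LLM stream
    for word in ["Streaming ", "is ", "now ", "one ", "line!"]:
        yield word
        time.sleep(0.05)

# renders the tokens as they arrive and returns the joined string
full_text = st.write_stream(token_gen())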

Thanks again for sharing your input to help shape the roadmap and make Streamlit better! Feel free to let others know about this new update and if you build something cool with it, let us know on the Forum!