dataprofessor / llama2

This chatbot app is built using the Llama 2 open source LLM from Meta.
https://llama2.streamlit.app

Don't have to simulate typewriter effect in streamlit_app.py #8

Open jliu015 opened 4 months ago

jliu015 commented 4 months ago

In streamlit_app.py, the typewriter effect is simulated by the following code:

```python
response = generate_llama2_response(prompt)
placeholder = st.empty()
full_response = ''
for item in response:
    full_response += item
    placeholder.markdown(full_response)
placeholder.markdown(full_response)
```

That's not a neat implementation, because you can't display the partial response until the full response has been received, which is especially noticeable for long responses. A good example is available on the Replicate website: https://replicate.com/docs/get-started/python

Some models stream output as the model is running. They will return an iterator, and you can iterate over that output:

```python
import replicate

iterator = replicate.run(
    "mistralai/mixtral-8x7b-instruct-v0.1",
    input={"prompt": "Who was Dolly the sheep?"},
)
for text in iterator:
    print(text)
```

Hopefully llama2 can return an iterator.
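For reference, here is a minimal sketch of how the loop in streamlit_app.py could consume such an iterator directly, updating the placeholder as each token arrives instead of waiting for the full response. This assumes that replicate.run() yields tokens incrementally for the Llama 2 model being called; the model identifier below is illustrative, not necessarily the one the app actually uses.

```python
import replicate
import streamlit as st

prompt = "Who was Dolly the sheep?"  # example prompt

# Assumption: replicate.run() returns an iterator that yields tokens as they
# are generated. The model name is illustrative, not the app's actual model.
iterator = replicate.run(
    "meta/llama-2-7b-chat",
    input={"prompt": prompt},
)

placeholder = st.empty()
full_response = ""
for token in iterator:
    full_response += token
    placeholder.markdown(full_response)  # show the partial response as it streams in
placeholder.markdown(full_response)      # final render once streaming completes
```

As a side note, recent Streamlit releases also provide st.write_stream(), which can render an iterator incrementally and return the concatenated text, so the manual placeholder bookkeeping could potentially be dropped altogether.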