dataprofessor / llama2

This chatbot app is built using the Llama 2 open source LLM from Meta.
301 stars 559 forks

Don't have to simulate typewriter effect in #8

Open jliu015 opened 4 months ago

jliu015 commented 4 months ago

The typewriter effect is simulated by the following code:

```python
response = generate_llama2_response(prompt)
placeholder = st.empty()
full_response = ''
# Append each item to the running text and redraw the placeholder
for item in response:
    full_response += item
    placeholder.markdown(full_response)
placeholder.markdown(full_response)
```

That's NOT a neat implementation, because the partial response can't be displayed until the full response has been received, which is especially noticeable for long responses. A good example is available on the Replicate website.

Some models stream output as the model is running. They will return an iterator, and you can iterate over that output

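```python
import replicate

# replicate.run returns an iterator for models that stream their output
iterator = replicate.run(
    "mistralai/mixtral-8x7b-instruct-v0.1",
    input={"prompt": "Who was Dolly the sheep?"},
)

for text in iterator:
    print(text)
```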

Hopefully llama2 can return an iterator.
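
To illustrate what I mean, here is a minimal sketch of rendering a response chunk by chunk as it arrives. It assumes the model call hands back a lazy iterator. `fake_token_stream` and `render_stream` are hypothetical names made up for this example and are not part of the app or of the Replicate client. The fake stream only stands in for a real streaming call so the snippet runs on its own.

```python
import time
from typing import Callable, Iterable, Iterator


def fake_token_stream(text: str, delay: float = 0.0) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model call: yields one chunk at a time."""
    for word in text.split(" "):
        time.sleep(delay)  # simulates per-chunk model latency
        yield word + " "


def render_stream(chunks: Iterable[str], render: Callable[[str], None]) -> str:
    """Call render() with the accumulated text each time a new chunk arrives."""
    full_response = ""
    for chunk in chunks:
        full_response += chunk
        render(full_response)  # partial text is shown before the stream ends
    return full_response


# Collect every intermediate "frame" in a list instead of drawing it on screen
frames = []
final = render_stream(fake_token_stream("Dolly was a cloned sheep"), frames.append)
```

In a Streamlit app, the `render` argument could presumably be the `markdown` method of an `st.empty()` placeholder, so each chunk would update the page as soon as it is yielded, with no simulated delay needed.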