Motivation and Context
When responses are written back piece by piece over Server-Sent Events (SSE), the gateway either buffered the entire body and sent it all at once, or dropped some of the data. This change adds custom handling for the text/event-stream content type so that chunks are forwarded as they arrive.
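The gateway-side decision can be sketched framework-agnostically: detect the `text/event-stream` content type and relay chunks as they arrive, instead of buffering the whole body. This is an illustrative sketch only, not the actual gateway implementation; the helper names `should_stream` and `relay` are made up for the example.

```python
from typing import Iterable, Iterator

def should_stream(content_type: str) -> bool:
    # text/event-stream responses must be relayed chunk-by-chunk;
    # anything else can safely be buffered and sent whole.
    return content_type.split(";")[0].strip().lower() == "text/event-stream"

def relay(chunks: Iterable[bytes], content_type: str) -> Iterator[bytes]:
    if should_stream(content_type):
        # Forward each chunk as soon as it arrives.
        for chunk in chunks:
            if chunk:
                yield chunk
    else:
        # Previous behaviour: buffer the whole body and emit it once.
        yield b"".join(chunks)
```

In a real proxy, `chunks` would come from the upstream HTTP response read in streaming mode, and the generator would be handed to the web framework's streaming response type.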
How Has This Been Tested?
```python
from os import environ

from flask import Response
from langchain_community.chat_models import ChatOpenAI

# The gateway behind openai_api_base performs the real authentication,
# so a placeholder key is enough to satisfy the client library.
environ["OPENAI_API_KEY"] = "Bearer foo"

chat_model = ChatOpenAI(
    model="gpt-3.5-turbo",
    openai_api_base="https://openai.inlets.dev/v1",
)

def handle(app, req):
    def stream():
        for chunk in chat_model.stream("You are a helpful AI assistant, try your best to help, respond with truthful answers, but if you don't know the correct answer, say so. Answer this question: What are the top 5 countries by GDP?"):
            print(chunk.content, flush=True)
            # Each chunk becomes one SSE "data:" frame.
            yield f"data: {chunk.content}\n\n"

    return Response(stream(), mimetype="text/event-stream")
```
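On the consuming side, each event arrives as a `data: <payload>` line terminated by a blank line. A minimal parser to check the streamed output (illustrative only; `sse_data` is not part of this change) could look like:

```python
from typing import Iterable, Iterator

def sse_data(lines: Iterable[str]) -> Iterator[str]:
    # Yield the payload of each "data:" field; other fields
    # (event:, id:, retry:) and blank separators are skipped.
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("data:"):
            yield line[len("data:"):].lstrip(" ")
```

With the `requests` library, this could be fed `resp.iter_lines(decode_unicode=True)` from a request made with `stream=True`, confirming that chunks arrive incrementally rather than in one buffered response.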
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
Description
Support streaming responses from functions