Closed: thomasgauvin closed this issue 6 months ago
For folks who are running into this issue, you can achieve better streaming by using server-sent events as your streaming mechanism. This can be toggled by setting the 'Content-Type' header to 'text/event-stream'. Here's the updated code snippet:
[...]
app.http('streamPoem', {
  methods: ['GET', 'POST'],
  authLevel: 'anonymous',
  handler: async (request, context) => {
    context.log(`Http function processed request for url "${request.url}"`);

    const shortPoem = `
Roses are red,
Violets are blue,
Sugar is sweet,
And so are you.
`;
    const poem = shortPoem.repeat(20);
    const delayedStream = ReadableStream.from(stringToDelayedStream(poem, 100));

    return {
      body: delayedStream,
      headers: {
        'Content-Type': 'text/event-stream'
      }
    };
  }
});
[...]
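The stringToDelayedStream helper is part of the elided code above. As a rough, hypothetical sketch (assuming it simply yields the string in small chunks with a pause between them so there is something to stream over time), it could look like this:

async function* stringToDelayedStream(text, delayMs, chunkSize = 10) {
  // Hypothetical helper: yields the string in small chunks, pausing between
  // chunks so the response streams gradually rather than all at once.
  for (let i = 0; i < text.length; i += chunkSize) {
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    yield text.slice(i, i + chunkSize);
  }
}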
Resulting in:
Hope this helps!
I'm experiencing the same, despite using Server-Sent Events. @thomasgauvin, which OS and SKU is your function app using? I'm seeing this on Windows Y1 (consumption plan).
I'm also using this for an OpenAI chat completions use case, and needed to adapt its streaming API into a ReadableStream:
import { app } from "@azure/functions";
import { OpenAIClient, AzureKeyCredential } from "@azure/openai";

// The endpoint and key are assumed to come from app settings.
const endpoint = process.env.AZURE_OPENAI_ENDPOINT;
const credential = new AzureKeyCredential(process.env.AZURE_OPENAI_API_KEY);

app.setup({ enableHttpStream: true });

app.http("chatbot", {
  methods: ["POST"],
  authLevel: "function",
  handler: async (request) => {
    const body = await request.json();
    const question = body?.question;
    if (!question) {
      return {
        status: 400,
      };
    }

    const openai = new OpenAIClient(endpoint, credential);
    const eventStream = await openai.streamChatCompletions("gpt-4", [
      { role: "system", content: "System prompt..." },
      { role: "user", content: question },
    ]);

    // Adapt the SDK's async iterable into a ReadableStream so chunks are
    // forwarded to the Functions host as they arrive.
    const stream = new ReadableStream({
      async start(controller) {
        for await (const event of eventStream) {
          controller.enqueue(event?.choices?.[0]?.delta?.content ?? "");
        }
        controller.close();
      },
    });

    return {
      body: stream,
      headers: {
        "Content-Type": "text/event-stream",
      },
    };
  },
});
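As a side note, a minimal client-side sketch like the following (the function URL and payload are placeholders) can help confirm whether chunks actually arrive incrementally or are buffered and delivered all at once:

async function readStream() {
  // Placeholder URL and payload: substitute your own function endpoint.
  const response = await fetch("https://<your-function-app>.azurewebsites.net/api/chatbot", {
    method: "POST",
    body: JSON.stringify({ question: "Write me a short poem." }),
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    // Log each chunk as it arrives so buffering delays are easy to spot.
    console.log(decoder.decode(value, { stream: true }));
  }
}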
For OpenAI workloads, note that content filtering is enabled by default on Azure OpenAI, whereas it's a separate API on OpenAI.
After applying for modified content filtering and enabling asynchronous mode (or disabling content filtering), streaming performance improved significantly.
@tlvince and @thomasgauvin, it sounds like you both found a solution in your own code, so I suspect there isn't a bug inherent to the HTTP stream feature. Let me know if that's not the case and I can re-open this.
I'm using Node.js v4 programming model functions to stream OpenAI chat completion responses. I followed the instructions from the announcement blog post and the docs, but my streamed responses hang for about 5 seconds before they start to stream. Here is a GIF of what is happening:
Is there any way to ensure that chunks start streaming as soon as they are emitted to the Functions host, instead of waiting for 5 seconds?
Here is the code: