raphaelrk closed this issue 1 year ago
kudos to @ponytojas for the regex
const delta = decoder.decode(value).match(/"delta":\s*({.*?"content":\s*".*?"})/)?.[1]
Related to @ponytojas's solution: I've found that, most of the time, the first response contains several token blocks, so when the regular expression is applied it only takes the first one and discards the others.
I've added this function for decoding:
const utf8Decoder = new TextDecoder('utf-8')
const decodeResponse = (response?: Uint8Array) => {
if (!response) {
return ''
}
const pattern = /"content"\s*:\s*"([^"]*)"/g
const decodedText = utf8Decoder.decode(response)
const matches: string[] = []
let match
while ((match = pattern.exec(decodedText)) !== null) {
matches.push(match[1])
}
return matches.join('')
}
And used it in the `read()` function of that approach, also removing the `JSON.parse()` call:
...
async function read() {
const { value, done } = await reader.read()
if (done) return onText(fullText)
const delta = decodeResponse(value)
if (delta) {
fullText += delta
// Detect punctuation; if found, fire onText at most once per 0.5 s
if (/[\p{P}\p{S}]/u.test(delta)) {
const now = Date.now()
if (now - lastFire > 500) {
lastFire = now
onText(fullText)
}
}
}
await read()
}
...
Now I'm getting all the tokens:
I hope you find it helpful
@fracergu great solution, but the regex doesn't work when the content has a double quote in it (an escaped character). For `"content": "\""` it just shows `\`, which is wrong. It should show `"`.
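To make the failure concrete, here is a small, self-contained sketch of the problem (the chunk string is a made-up example, not captured from the API):

```typescript
// The naive pattern stops at the first quote it sees, so an escaped
// quote inside the content is truncated to just the backslash.
const naivePattern = /"content"\s*:\s*"([^"]*)"/;

// Made-up fragment of a stream chunk whose content is a single `"`.
const chunk = '"delta":{"content":"\\""}';

const naive = naivePattern.exec(chunk)?.[1];
// naive === '\\' (a lone backslash) -- the escaped quote is lost

// Capturing the whole delta object and letting JSON.parse decode the
// escapes recovers the real content.
const robustPattern = /"delta":\s*({.*?"content":\s*".*?"})/;
const robust = JSON.parse(robustPattern.exec(chunk)![1]).content;
// robust === '"' -- correct
```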
@tsenguunchik Yes, I made the publication too fast and I'm just now struggling with it. I didn't see the problem until I started requesting code and it started failing to receive double quotes. I will update here when I get a solution.
EDIT: Problem solved, I recovered the original regex and parsing and now it works pretty well
const decodeResponse = (response?: Uint8Array) => {
if (!response) {
return ''
}
const pattern = /"delta":\s*({.*?"content":\s*".*?"})/g
const decodedText = utf8Decoder.decode(response)
const matches: string[] = []
let match
while ((match = pattern.exec(decodedText)) !== null) {
matches.push(JSON.parse(match[1]).content)
}
return matches.join('')
}
It also no longer loses the first few tokens that come "in a pack".
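For reference, a minimal sketch (with a fabricated chunk) of why the `g` flag matters when several deltas arrive packed into one chunk:

```typescript
// A single network chunk can carry several SSE events. With the /g flag
// the loop below collects every delta, not just the first.
const pattern = /"delta":\s*({.*?"content":\s*".*?"})/g;

// Fabricated example of two events arriving in one chunk.
const packedChunk =
  'data: {"choices":[{"delta":{"content":"Hel"}}]}\n\n' +
  'data: {"choices":[{"delta":{"content":"lo"}}]}\n\n';

const tokens: string[] = [];
let m: RegExpExecArray | null;
while ((m = pattern.exec(packedChunk)) !== null) {
  tokens.push(JSON.parse(m[1]).content);
}
// tokens.join('') === 'Hello'
```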
Here's the custom hook I'm using with React, in case you find it useful. `setInputMessages` receives the list of messages from which we expect a completion and triggers the fetch, `partialText` returns the partial text of the response as it is received in real time, and `fullText` returns the response once it is complete. The types I use are at the beginning of the file. It still lacks error handling, as it is still under development.
I'm sorry if there is any bad practice or incorrectness, as I'm fairly new to React.
import { useState, useEffect } from 'react'
enum Role {
ASSISTANT = 'assistant',
USER = 'user',
}
type Message = {
role: Role
content: string
}
const API_URL = 'https://api.openai.com/v1/chat/completions'
const OPENAI_API_KEY = import.meta.env.VITE_OPENAI_API_KEY
const OPENAI_CHAT_MODEL = 'gpt-3.5-turbo'
const utf8Decoder = new TextDecoder('utf-8')
const decodeResponse = (response?: Uint8Array) => {
if (!response) {
return ''
}
const pattern = /"delta":\s*({.*?"content":\s*".*?"})/g
const decodedText = utf8Decoder.decode(response)
const matches: string[] = []
let match
while ((match = pattern.exec(decodedText)) !== null) {
matches.push(JSON.parse(match[1]).content)
}
return matches.join('')
}
export const useStreamCompletion = () => {
const [partialText, setPartialText] = useState('')
const [fullText, setFullText] = useState('')
const [inputMessages, setInputMessages] = useState<Message[]>([])
useEffect(() => {
if (!inputMessages.length) return
// Create the controller inside the effect so each request gets its own,
// and the cleanup aborts the matching fetch
const abortController = new AbortController()
const onText = (text: string) => {
setPartialText(text)
}
const fetchData = async () => {
try {
const response = await fetch(API_URL, {
method: 'POST',
headers: {
Authorization: `Bearer ${OPENAI_API_KEY}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
messages: inputMessages,
model: OPENAI_CHAT_MODEL,
stream: true,
}),
signal: abortController.signal, // assign the abort controller signal to the fetch request
})
if (!response.ok) {
const error = await response.json()
throw new Error(error.error)
}
if (!response.body) throw new Error('No response body')
const reader = response.body.getReader()
let fullText = ''
async function read() {
const { value, done } = await reader.read()
if (done) return onText(fullText)
const delta = decodeResponse(value)
if (delta) {
fullText += delta
onText(fullText.trim())
}
await read()
}
await read()
setFullText(fullText)
} catch (error) {
console.error(error)
}
}
fetchData()
return () => {
abortController.abort()
}
}, [inputMessages])
return { partialText, fullText, setInputMessages }
}
I found this code from Hassan at Vercel helpful for streaming the OpenAI API in an edge function
@darknoon Yes, it works. But I created an issue with it, because it didn't handle the error chunk. https://github.com/Nutlope/twitterbio/issues/25
Does anybody know why the chunks get split when I deploy it to a Vercel edge function?
@shezhangzhang This chunk-split issue has only happened to me once so far in an hour of testing. I'm running a Node.js instance on localhost right now, not on Vercel. Perhaps it's a rare occurrence, but an occurrence we have to account for nonetheless.
Yup, I haven't figured it out. When I deployed it on Vercel, the chunk-split issue could happen with every request. I don't know the reason.
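One defensive approach, regardless of where the split happens, is to buffer raw text until the `\n\n` event terminator arrives before parsing anything. A minimal sketch (names are illustrative, not from any library):

```typescript
// Minimal SSE reassembly buffer: feed it raw text chunks in arrival
// order; events are only parsed once the "\n\n" terminator is seen,
// so a JSON payload split across two chunks survives intact.
const createSSEBuffer = (onEvent: (data: string) => void) => {
  let buffer = '';
  return (chunk: string) => {
    buffer += chunk;
    const events = buffer.split('\n\n');
    buffer = events.pop() ?? ''; // keep the trailing incomplete event
    for (const event of events) {
      const data = event.replace(/^data: /, '').trim();
      if (data && data !== '[DONE]') onEvent(data);
    }
  };
};

// A payload split mid-JSON across two chunks still parses correctly.
const received: string[] = [];
const feed = createSSEBuffer((data) =>
  received.push(JSON.parse(data).choices[0].delta.content),
);
feed('data: {"choices":[{"delta":{"cont');
feed('ent":"Hi"}}]}\n\n');
// received is now ['Hi']
```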
For me, this doesn't solve the issue with the unescaped " character. Any suggestions?
EDIT:
this happens with the following input.
data: {"id":"chatcmpl-6wbcJjX6ttujYAT6rFv8eZMxS2daD","object":"chat.completion.chunk","created":1679425643,"model":"gpt-3.5-turbo-0301","choices":[{"delta":{"content":"\"}"},"index":0,"finish_reason":null}]}
I have a problem with encoding when making requests.
For this text:
Remove citation from this text: Gardening can be defined as an activity in a garden setting to grow, cultivate, and look after plants (e.g., flowers, vegetables) for non-commercial use (Gillard, 2001, p. 832; Kingsley et al., 2021). There is some evidence to suggest gardening is a moderately intense physical activity ranging from low-to moderate-intensity for older age groups (>63 years; Park et al., 2008; Park et al., 2011) to moderate-to high-intensity in younger adults (>20 years; Park et al., 2014)
The request fails before sending, around the text `(>20 years`. I tried with `encodeURIComponent` and it works fine, but in return the API says, among other things: "just a reminder to please use proper formatting". Furthermore, sometimes the response is itself encoded.
I'm not sure if this is useful for you guys, but here's my modified version of eventsource with added support for setting method and payload/body which should be sufficient to have a fully featured SSE connection to openai https://gist.github.com/FurriousFox/f43eaf9645302e51ab01cf0b1853aa4e
something like this should then work
const EventSource = require("./eventsource.js");
let es = new EventSource("https://api.openai.com/v1/chat/completions", {
method: "POST",
headers: {
"Content-Type": "application/json",
"Authorization": `Bearer ${api_key}`
},
payload: JSON.stringify({
"model": "gpt-3.5-turbo",
"messages": [{
"role": "system",
"content": instruction
}, {
"role": "user",
"content": prompt
}],
stream: true,
}),
});
es.onmessage = (e) => {
if (e.data == "[DONE]") {
es.close();
} else {
let delta = JSON.parse(e.data).choices[0].delta.content;
if (delta) {
console.log(delta);
}
}
};
OK, just my 2 cents: this could be network flakiness, packets failing, or the server itself optimizing. I suspect this is one of the reasons OpenAI responses sometimes fail in chat.openai.com. The solution may involve a library that parses the input very flexibly, ensures text is joined, and accepts any random input in the most forgiving way. If I remember correctly, jsmn is a C lib; if anyone has a super clean C implementation of this I would be interested (PRs welcome). Anyhow, I think the regex a ways above is probably the best middle ground; they nerfed the response stats anyway. Happy coding
There are examples above that show how to consume the token stream with events and callbacks, but a potentially simpler alternative is to use an async iterator that yields tokens, even if you ultimately want to just push each token into a stream or callback anyways.
Usage:
const messages = [
{ role: 'user', content: 'what is the meaning of life?' }
]
let answer = ''
for await (const token of streamChatCompletion(messages)) {
answer += token
// do something async here if you want
}
console.log('answer finished:', answer)
The implementation is easy because an Axios response set to streaming already lets you consume its chunks with an async iterator:
const { OpenAIApi } = require('openai')
const openai = new OpenAIApi(...)
async function* streamChatCompletion(messages) {
const response = await openai.createChatCompletion(
{
model: 'gpt-3.5-turbo',
messages,
stream: true,
},
{
responseType: 'stream',
},
)
for await (const chunk of response.data) {
const lines = chunk
.toString('utf8')
.split('\n')
.filter((line) => line.trim().startsWith('data: '))
for (const line of lines) {
const message = line.replace(/^data: /, '')
if (message === '[DONE]') {
return
}
const json = JSON.parse(message)
const token = json.choices[0].delta.content
if (token) {
yield token
}
}
}
}
You can see this impl in my Telegram bot: https://github.com/danneu/telegram-chatgpt-bot/blob/24b76f880094b87a5c0a9a42c3571bbecfb12caa/openai.ts#L25
Or, instead of returning an async iterator, you can replace `yield token` with `stream.push(token)` or `onToken(token)` or whatever makes most sense for your app (remembering to change `function*` back to `function`).
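If you'd rather leave the generator untouched, you can also adapt it to a callback at the call site. A sketch with a stand-in generator (`fakeTokens` substitutes for `streamChatCompletion` here):

```typescript
// Stand-in generator; in the real app this would be streamChatCompletion.
async function* fakeTokens() {
  yield 'Hello';
  yield ', ';
  yield 'world';
}

// Generic adapter: drives any async iterable and forwards each token
// to a callback, without modifying the generator itself.
async function pipeToCallback(
  tokens: AsyncIterable<string>,
  onToken: (token: string) => void,
): Promise<void> {
  for await (const token of tokens) onToken(token);
}

let text = '';
const done = pipeToCallback(fakeTokens(), (t) => (text += t));
// once `done` resolves, text === 'Hello, world'
```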
I released https://github.com/drorm/gish which has a full example of streaming. If you just want to see it in action, just click on the screencast in the README. The code is at https://github.com/drorm/gish/blob/c88d39fcc97150d3107cf84eca3051fa3b18cd14/src/LLM.ts#L103, based on a lot of the info in here. Thank you. The license is MIT, so steal at your pleasure :-).
@danneu great solution, thank you for sharing.
`for await (const chunk of response.data) {`
@danneu @Christopher-Hayes VSCode complains with `Type 'CreateChatCompletionResponse' must have a '[Symbol.asyncIterator]()' method that returns an async iterator.` on this line -- is there something else you guys are doing to ensure you're getting an async iterator? Hacking it with an `any` typecast doesn't magically fix things either.
Yeah, idk why TS was giving that error. But I just cast to `unknown` and then to the async iterator type it wants.
I used the code here if you want an example implementation: https://github.com/Christopher-Hayes/vscode-chatgpt-reborn/blob/3c191f34b52e1473a171531da42177857f15304c/src/api-provider.ts#L42
Does there exist a way to safely `JSON.parse` the response and ensure that you get all the `content` provided by OpenAI? The methods above all fail in one way or another, which is particularly problematic when generating code snippets, but also markdown.
I'm using code that's roughly like this:
let response = await fetch("https://api.openai.com/v1/engines/davinci/completions",
{
headers: {
"Content-Type": "application/json",
Authorization: "Bearer " + OPENAI_KEY,
},
method: "POST",
body: JSON.stringify({
prompt: selected,
temperature: 0.75,
top_p: 0.95,
max_tokens: 10,
stream: true,
stop: ["\n\n"],
}),
}
);
const reader = response.body?.pipeThrough(new TextDecoderStream()).getReader();
while (true) {
const res = await reader?.read();
if (res?.done) break;
console.log(res?.value);
}
Look at the `console.log` output to see the format -- you have to trim the `data: ` part off, and then `JSON.parse()` it.
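For reference, the trim-and-parse step on a canned chunk (the chunk text below is fabricated, in the `text`-field shape the completions endpoint returns):

```typescript
// Fabricated example of one decoded chunk from the completions stream.
const chunkText =
  'data: {"choices":[{"text":"Once"}]}\n\n' +
  'data: {"choices":[{"text":" upon"}]}\n\n' +
  'data: [DONE]\n\n';

// Trim the "data: " prefix off each event, skip [DONE], then JSON.parse.
const texts = chunkText
  .split('\n\n')
  .map((line) => line.replace(/^data: /, '').trim())
  .filter((data) => data && data !== '[DONE]')
  .map((data) => JSON.parse(data).choices[0].text);
// texts.join('') === 'Once upon'
```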
I've no idea why none of the answers posted before worked for me; what ended up working is something similar to what https://github.com/openai/openai-node/issues/18#issuecomment-1352010682 said.
So in case someone else is still looking for other options, here's what ended up working for me:
const chat = async (
prompt: string,
previousChats: IpreviousChats[],
) => {
const apiKey = window.localStorage.getItem(LOCAL_STORAGE_KEY);
const url = "https://api.openai.com/v1/chat/completions";
const xhr = new XMLHttpRequest();
xhr.open("POST", url);
xhr.setRequestHeader("Content-Type", "application/json");
xhr.setRequestHeader("Authorization", "Bearer " + apiKey);
xhr.onprogress = function(event) {
console.log("Received " + event.loaded + " bytes of data.");
console.log("Data: " + xhr.responseText);
const newUpdates = xhr.responseText
.replace("data: [DONE]", "")
.trim()
.split('data: ')
.filter(Boolean)
const newUpdatesParsed = newUpdates.map((update) => {
const parsed = JSON.parse(update);
return parsed.choices[0].delta?.content || '';
}
);
const newUpdatesJoined = newUpdatesParsed.join('')
console.log('current message so far',newUpdatesJoined);
};
xhr.onreadystatechange = function() {
if (xhr.readyState === 4) {
if (xhr.status === 200) {
console.log("Response complete.");
console.log("Final data: " + xhr.responseText);
} else {
console.error("Request failed with status " + xhr.status);
}
}
};
const data = JSON.stringify({
model: currentChatModelRef.current,
messages: [
...previousChats,
{
role: "user",
content: prompt,
}],
temperature: 0.5,
frequency_penalty: 0,
presence_penalty: 0,
stream: true,
});
xhr.send(data);
}
good luck everyone :DDD
@josephrocca Still sometimes getting the JSON.parse
error in production using Vercel Edge Runtime.
For server-side TypeScript you can try:
const response = await openai.createChatCompletion({
model: "gpt-3.5-turbo",
messages: messages,
stream: true,
}, { responseType: 'stream' });
const stream = response.data as unknown as IncomingMessage
stream.on('data', (chunk: Buffer) => {
// Messages in the event stream are separated by a pair of newline characters.
const payloads = chunk.toString().split("\n\n")
for (const payload of payloads) {
if (payload.includes('[DONE]')) return;
if (payload.startsWith("data:")) {
const data = payload.replaceAll(/(\n)?^data:\s*/g, ''); // in case there's multiline data event
try {
const delta = JSON.parse(data.trim())
console.log(delta.choices[0].delta?.content)
} catch (error) {
console.log(`Error with JSON.parse and ${payload}.\n${error}`)
}
}
}
})
stream.on('end', () => console.log('Stream done'))
stream.on('error', (e: Error) => console.error(e))
Create util.ts
// import what you need in util.ts
import {
OpenAIApi,
Configuration,
ChatCompletionRequestMessage,
CreateChatCompletionResponse,
} from 'openai'
import type { ResponseType } from 'axios' // used for the responseType option below
// config openAI sdk
const openai = new OpenAIApi(
new Configuration({
apiKey: process.env.OPENAI_API_KEY,
basePath: process.env.OPENAI_PROXY
})
)
// write a util function, chat
export default {
async chat(messages: ChatCompletionRequestMessage[], stream: boolean = false) {
const responseType: ResponseType = stream ? 'stream' : 'json'
return (
await openai.createChatCompletion(
{
model: 'gpt-3.5-turbo',
messages,
stream
},
{ responseType }
)
).data
}
}
Install json detector
yarn add @stdlib/assert-is-json
Use `util.ts` in a controller or service layer
import openai from './openai'
import isJSON from '@stdlib/assert-is-json'
import { IncomingMessage } from 'http'
import { ChatCompletionRequestMessageRoleEnum } from 'openai' // used for the `role` below
async chatStream(content: string, callback: CreateChatCompletionStreamResponseCallback) {
const role = ChatCompletionRequestMessageRoleEnum.User
// Transfer to IncomingMessage type, this is a Stream type
const res = ((await openai.chat([{ role, content }], true)) as any) as IncomingMessage
let tmp = '' // cache, store temporary string data
res.on('data', (data: Buffer) => {
// buffer to utf8 string, then split to string data array
const message = data
.toString('utf8')
.split('\n')
.filter(m => m.length > 0)
for (const item of message) {
// remove the first head word: 'data: '
tmp += item.replace(/^data: /, '')
// only when tmp string is a json, you transfer to object and callback it
if (isJSON(tmp)) {
const data: CreateChatCompletionStreamResponse = JSON.parse(tmp)
tmp = ''
callback(data)
}
}
})
}
Interface file for openAI, Interface.ts
interface CreateChatCompletionStreamResponse {
id: string
object: string
created: number
model: string
choices: Array<CreateChatCompletionStreamResponseChoicesInner>
}
interface CreateChatCompletionStreamResponseChoicesInner {
delta: { role?: string; content?: string }
index: number
finish_reason: string
}
type CreateChatCompletionStreamResponseCallback = (response: CreateChatCompletionStreamResponse) => void
Web Client Streaming
I was trying to access the OpenAI API from a web client, and none of the solutions worked except @YoseptF's - https://github.com/openai/openai-node/issues/18#issuecomment-1490970390
I believe it has to do with the client-side Axios implementation, but I may be wrong. Either way, thanks @YoseptF!
Here's my contribution:
// Initiate the stream request.
const response = await fetch(/* <url> */, {
method: 'POST',
headers: { /* <headers> */ },
body: JSON.stringify({
...{ stream: true },
...restOfBody
})
});
if (!response?.body) {
throw new Error('The response from OpenAI does not contain a body.');
}
// OpenAI responses seem to always begin with two newlines. We'll ignore those.
let noise = true;
// @ts-ignore
for await (const message of response.body) {
for (const chunk of message.toString().split('\n\n')) {
// https://github.com/openai/openai-node/issues/18#issuecomment-1369996933
// https://github.com/openai/openai-node/issues/18#issuecomment-1493132878
const msg: string = chunk.replace(/^data: /, '')
// The stream is done?
// https://platform.openai.com/docs/api-reference/chat/create
if (msg === '[DONE]' || msg.length === 0) {
continue;
}
let parsed: any;
try {
parsed = JSON.parse(msg.trim());
} catch (e) {
throw new Error(`Could not parse OpenAI response (${msg}).`);
}
let choice: string;
try {
choice = parsed?.choices?.[0]?.text;
} catch (e) {
throw new Error(`Could not dereference OpenAI message format (${parsed}).`);
}
if (noise && choice === '\n') {
continue;
}
noise = false;
// your final message will be here.
}
}
NOTE: Sending `{ stop: ['\n\n'] }` as part of the request did not work for me.
Using a pattern similar to that shown in https://github.com/openai/openai-node/issues/18#issuecomment-1493132878 in a client-side React app, I'm getting `stream.on is not a function`. I assume this is because Axios appears to not support `{ responseType: 'stream' }` in the browser, only in server-side Node.
Yeah, @justinsteven, I'm running into the same issue using client-side React while building something simple out. Here is the best workaround I could find, but I'll show you the issue with it in just a second.
Would recommend adding this to a button click in React and having your state altered based on it:
await axios.post(
"https://api.openai.com/v1/chat/completions",
{
messages: newMessages,
stream: true,
model: "gpt-3.5-turbo",
},
{
headers: {
Authorization: "Bearer " + getOpenAISK(),
},
onDownloadProgress: (event) => {
const payload = event.currentTarget.response;
const result = payload
.replace(/data:\s*/g, "")
.replace(/[\r\n\t]/g, "")
.split("}{")
.join("},{");
const cleanedJsonString = `[${result}]`;
if (payload.includes("[DONE]")) return;
try {
const parsedJson: [] = JSON.parse(cleanedJsonString);
// console.log(JSON.stringify(parsedJson, null, 2));
let newContent: string = ""
parsedJson.forEach((item: any) => {
if (
item.choices &&
item.choices.length > 0 &&
item.choices[0].delta &&
item.choices[0].delta.content
) {
newContent += item.choices[0].delta.content;
}
});
if (newContent !== laggingLatestMessage) {
const extraContent = newContent.slice(
laggingLatestMessage.length
);
laggingLatestMessage += extraContent;
setLatestMessage(laggingLatestMessage);
}
} catch (e) {
setIsGenerating(false);
console.log("error parsing json", e);
}
},
responseType: "stream",
}
);
Unfortunately, sometimes the output comes out ugly, but I believe this is the best workaround when using an Axios stream in client-side React.
Hey everyone! After some tinkering, I've created a working client and server side solution for this. The GitHub project is here, and you can try the client/browser demo here. (Demo is in React but solution is framework agnostic)
You can now drop in support for streaming chat completions in both the server (Node.js) and client (browser) via the npm package openai-ext. This solution was inspired by everyone's work above, especially @YoseptF and @edelauna. Thanks everyone for working on this together.
This solution supports stopping completions, too.
Full usage examples below.
To install via npm:
npm i openai-ext@latest
Use the following solution in a browser environment:
import { OpenAIExt } from "openai-ext";
// Configure the stream (use type ClientStreamChatCompletionConfig for TypeScript users)
const streamConfig = {
apiKey: `123abcXYZasdf`, // Your API key
handler: {
// Content contains the string draft, which may be partial. When isFinal is true, the completion is done.
onContent(content, isFinal, xhr) {
console.log(content, "isFinal?", isFinal);
},
onDone(xhr) {
console.log("Done!");
},
onError(error, status, xhr) {
console.error(error);
},
},
};
// Make the call and store a reference to the XMLHttpRequest
const xhr = OpenAIExt.streamClientChatCompletion(
{
model: "gpt-3.5-turbo",
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Tell me a funny joke." },
],
},
streamConfig
);
// If you'd like to stop the completion, call xhr.abort(). The onDone() handler will be called.
xhr.abort();
Use the following solution in a Node.js or server environment:
import { Configuration, OpenAIApi } from 'openai';
import { OpenAIExt } from "openai-ext";
const apiKey = `123abcXYZasdf`; // Your API key
const configuration = new Configuration({ apiKey });
const openai = new OpenAIApi(configuration);
// Configure the stream (use type ServerStreamChatCompletionConfig for TypeScript users)
const streamConfig = {
openai: openai,
handler: {
// Content contains the string draft, which may be partial. When isFinal is true, the completion is done.
onContent(content, isFinal, stream) {
console.log(content, "isFinal?", isFinal);
},
onDone(stream) {
console.log('Done!');
},
onError(error, stream) {
console.error(error);
},
},
};
const axiosConfig = {
// ...
};
// Make the call to stream the completion
OpenAIExt.streamServerChatCompletion(
{
model: 'gpt-3.5-turbo',
messages: [
{ role: 'system', content: 'You are a helpful assistant.' },
{ role: 'user', content: 'Tell me a funny joke.' },
],
},
streamConfig,
axiosConfig
);
If you'd like to stop the completion, call stream.destroy()
. The onDone()
handler will be called.
const response = await OpenAIExt.streamServerChatCompletion(...);
const stream = response.data;
stream.destroy();
You can also stop completion using an Axios cancellation in the Axios config.
Let me know if you have any suggestions. PRs always welcome! I'll be adding a live demo to the project's Storybook site soon.
Update: I've added support for Node.js/server chat completion streaming. This library now supports both server and client. Woohoo!
Update 2: Live React demo now available -- view here
Nice work @justinmahar, your solution is super easy to use and works great with Next.js and React apps!
How can I implement this so that the OpenAI API request is made from the backend? I don't want to expose the API key to the client side.
Hey @jaankoppe, you unfortunately wouldn't be able to use this server side (this is more of a client-side solution to play around with a local ChatGPT bot). I would recommend using the solutions above, as they should all work with Axios and server-side chat completions if you need that. Use this solution as a reference:
https://github.com/openai/openai-node/issues/18#issuecomment-1493132878
@jaankoppe @shreypjain I've updated the solution to support both server and client streaming. Give it a shot and let us know how it works for you - https://github.com/openai/openai-node/issues/18#issuecomment-1509225450
I am running those examples but keep getting `TypeError: completion.data.on is not a function`. I tried updating my Node version to the latest, but got the same error.
@Yafaa Which environment are you in (node.js or browser)? Are you using the correct call for the environment?
The latest version will throw an error when used in the wrong environment. Try it out -- npm i openai-ext@latest
For the example with openai-ext it's fine, but with https://github.com/openai/openai-node/issues/18#issuecomment-1369996933 and https://github.com/openai/openai-node/issues/18#issuecomment-1371279689 it throws `TypeError: completion.data.on is not a function`.
I used the https://github.com/PawanOsman/ChatGPT/blob/b705a2511b71cf2a6077db76a7048ddbca1ecbb1/routes.js#L178 solution on the Next.js API side; it worked.
UPDATE: On the browser frontend, Joseph's solution worked for me.
The MDN approach did NOT work: I am still working on the React side following this MDN guide, and that higher-level wrapper falls short; Node.js and browser native functions are the winners for dealing with OpenAI's stream scenario.
I finally managed to get the stream working in Chrome, but not Firefox. The route I picked yields tokens with the async iterator and writes them as the body of the response, which is picked up as a readable stream.
index.js
setResult('');
const response = await fetch("/api/generate", {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
querySystem: islandSystemInput,
queryUser: islandUserInput
}),
});
const reader = response.body?.pipeThrough(new TextDecoderStream()).getReader();
while (true) {
const res = await reader?.read();
if (res?.value?.toString() !== undefined){
setResult(result => result + res?.value);
}
if (res?.done) break;
}
}
generate.js
async function* streamChatCompletion(messages) {
const completion = await openai.createChatCompletion( // chat models need the chat endpoint
{
model: 'gpt-4-0314',
messages: messages,
max_tokens: 10,
stream: true,
stop: ["\n\n"],
},
{
responseType: 'stream',
},
)
for await (const chunk of completion.data) {
const lines = chunk
.toString('utf8')
.split('\n')
.filter((line) => line.trim().startsWith('data: '))
for (const line of lines) {
const message = line.replace(/^data: /, '')
if (message === '[DONE]') {
return
}
const json = JSON.parse(message)
const token = json.choices[0].delta.content
if (token) {
yield token;
}
}
}
}
My generate.js is mostly vanilla otherwise, just calling the above function with a `for await` loop. I hope this is helpful for someone.
With `streamChatCompletions`, I got this error: `Invalid attempt to iterate non-iterable instance. In order to be iterable, non-array objects must have a [Symbol.iterator]() method`. Can you help?
It will work in Firefox if you set the `Content-Type` header in the response of the `/api/generate` endpoint; this is a known issue that still hasn't been fixed.
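A minimal sketch of that fix (the `handler` shape and the `streamTokens` parameter are illustrative, not from the thread's code):

```javascript
// Hypothetical sketch: the headers Firefox needs to see before the first
// token is written, so it streams incrementally instead of buffering.
const SSE_HEADERS = {
  'Content-Type': 'text/event-stream; charset=utf-8',
  'Cache-Control': 'no-cache, no-transform',
};

// In a Next.js pages-router API route: declare the type up front,
// then flush each token as it arrives.
async function handler(req, res, streamTokens) {
  res.writeHead(200, SSE_HEADERS);
  for await (const token of streamTokens()) {
    res.write(token);
  }
  res.end();
}
```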
There's a new Node.js lib specific to streams, which should render this problem solved. (Make sure to check it out, and maybe make a PR regarding Whisper.)
I've developed a custom hook in React with TypeScript to stream data from the OpenAI API. This solution is based on several approaches listed here in the comments. https://github.com/MatchuPitchu/open-ai
The custom hook allows streaming API responses, as well as canceling the response stream and resetting the messages context. Additionally, the project supports code syntax highlighting and the ability to copy code snippets to the clipboard, as well as displaying additional meta data for each response.
import { useCallback, useState } from 'react';
import type { DeepRequired } from '@/utils/type-helpers';
export type GPT35 = 'gpt-3.5-turbo' | 'gpt-3.5-turbo-0301';
export type GPT4 = 'gpt-4' | 'gpt-4-0314' | 'gpt-4-32k' | 'gpt-4-32k-0314';
export type Model = GPT35 | GPT4;
export type ChatRole = 'user' | 'assistant' | 'system' | '';
export type ChatCompletionResponseMessage = {
content: string; // content of the completion
role: ChatRole; // role of the person/AI in the message
};
export type ChatMessageToken = ChatCompletionResponseMessage & {
timestamp: number;
};
export type ChatMessageParams = ChatCompletionResponseMessage & {
timestamp?: number; // timestamp of completed request
meta?: {
loading?: boolean; // completion state
responseTime?: string; // total elapsed time between completion start and end
chunks?: ChatMessageToken[]; // returned chunks of completion stream
};
};
export type ChatMessage = DeepRequired<ChatMessageParams>;
export type ChatCompletionChunk = {
id: string;
object: string;
created: number;
model: Model;
choices: {
delta: Partial<ChatCompletionResponseMessage>;
index: number;
finish_reason: string | null;
}[];
};
type RequestOptions = {
headers: Record<string, string>;
method: 'POST';
body: string;
signal: AbortSignal;
};
export type OpenAIStreamingProps = {
apiKey: string;
model: Model;
};
const OPENAI_COMPLETIONS_URL = 'https://api.openai.com/v1/chat/completions';
const MILLISECONDS_PER_SECOND = 1000;
const updateLastItem = <T>(currentItems: T[], updatedLastItem: T) => {
const newItems = currentItems.slice(0, -1);
newItems.push(updatedLastItem);
return newItems;
};
// transform chat message structure with metadata to a limited shape that OpenAI API expects
const getOpenAIRequestMessage = ({ content, role }: ChatMessage): ChatCompletionResponseMessage => ({
content,
role
});
const getOpenAIRequestOptions = (
apiKey: string,
model: Model,
messages: ChatMessage[],
signal: AbortSignal
): RequestOptions => ({
headers: {
'Content-Type': 'application/json',
Authorization: `Bearer ${apiKey}`
},
method: 'POST',
body: JSON.stringify({
model,
messages: messages.map(getOpenAIRequestMessage),
// TODO: define value: max_tokens: 100,
stream: true
}),
signal
});
// transform chat message into a chat message with metadata
const createChatMessage = ({ content, role, meta }: ChatMessageParams): ChatMessage => ({
content,
role,
timestamp: Date.now(),
meta: {
loading: false,
responseTime: '',
chunks: [],
...meta
}
});
export const useOpenAIChatStream = ({ model, apiKey }: OpenAIStreamingProps) => {
const [messages, setMessages] = useState<ChatMessage[]>([]);
const [controller, setController] = useState<AbortController | null>(null);
const [isLoading, setIsLoading] = useState<boolean>(false);
const resetMessages = () => setMessages([]);
const abortStream = () => {
// abort fetch request by calling abort() on the AbortController instance
if (!controller) return;
controller.abort();
setController(null);
};
const closeStream = (startTimestamp: number) => {
// determine the final timestamp, and calculate the number of seconds the full request took.
const endTimestamp = Date.now();
const differenceInSeconds = (endTimestamp - startTimestamp) / MILLISECONDS_PER_SECOND;
const formattedDiff = `${differenceInSeconds.toFixed(2)}s`;
// update last entry of message list with the final details
setMessages((prevMessages) => {
const lastMessage = prevMessages.at(-1);
if (!lastMessage) return [];
const updatedLastMessage = {
...lastMessage,
timestamp: endTimestamp,
meta: {
...lastMessage.meta,
loading: false,
responseTime: formattedDiff
}
};
return updateLastItem(prevMessages, updatedLastMessage);
});
};
const submitPrompt = useCallback(
async (newPrompt: ChatMessageParams[]) => {
// a) no new request if last stream is loading
// b) no request if empty string as prompt
if (isLoading || !newPrompt[0].content) return;
setIsLoading(true);
const startTimestamp = Date.now();
const chatMessages: ChatMessage[] = [...messages, ...newPrompt.map(createChatMessage)];
const newController = new AbortController();
const signal = newController.signal;
setController(newController);
try {
const response = await fetch(
OPENAI_COMPLETIONS_URL,
getOpenAIRequestOptions(apiKey, model, chatMessages, signal)
);
if (!response.body) return;
// read response as data stream
const reader = response.body.getReader();
const decoder = new TextDecoder('utf-8');
// placeholder for next message that will be returned from API
const placeholderMessage = createChatMessage({ content: '', role: '', meta: { loading: true } });
let currentMessages = [...chatMessages, placeholderMessage];
// eslint-disable-next-line no-constant-condition
while (true) {
const { done, value } = await reader.read();
if (done) {
closeStream(startTimestamp);
break;
}
// parse chunk of data
const chunk = decoder.decode(value);
const lines = chunk.split(/\n{2}/);
const parsedLines: ChatCompletionChunk[] = lines
.map((line) => line.replace(/(\n)?^data:\s*/, '').trim()) // remove 'data:' prefix
.filter((line) => line !== '' && line !== '[DONE]') // remove empty lines and "[DONE]"
.map((line) => JSON.parse(line)); // parse JSON string
for (const parsedLine of parsedLines) {
let chunkContent: string = parsedLine.choices[0].delta.content ?? '';
chunkContent = chunkContent.replace(/^`\s*/, '`'); // avoid empty line after single backtick
const chunkRole: ChatRole = parsedLine.choices[0].delta.role ?? '';
// update last message entry in list with the most recent chunk
const lastMessage = currentMessages.at(-1);
if (!lastMessage) return;
const updatedLastMessage = {
content: `${lastMessage.content}${chunkContent}`,
role: `${lastMessage.role}${chunkRole}` as ChatRole,
timestamp: 0,
meta: {
...lastMessage.meta,
chunks: [
...lastMessage.meta.chunks,
{
content: chunkContent,
role: chunkRole,
timestamp: Date.now()
}
]
}
};
currentMessages = updateLastItem(currentMessages, updatedLastMessage);
setMessages(currentMessages);
}
}
} catch (error) {
if (signal.aborted) {
console.error(`Request aborted`, error);
} else {
console.error(`Error during chat response streaming`, error);
}
} finally {
setController(null); // reset AbortController
setIsLoading(false);
}
},
[apiKey, isLoading, messages, model]
);
return { messages, submitPrompt, resetMessages, isLoading, abortStream };
};
Inspired by @YoseptF, I was able to accomplish the same using just the openai SDK. This uses TypeScript, React, and Zustand.
type ConversationMessage = CreateChatCompletionRequest["messages"][0] & {
id?: string;
};
export interface ChatCompletionSlice {
conversationMessages: ConversationMessage[];
createChatCompletion: (
request: Omit<CreateChatCompletionRequest, "model">
) => Promise<void>;
resetConversation: () => void;
}
export const createChatCompletionSlice: StateCreator<
MainSlice & ChatCompletionSlice,
[],
[],
ChatCompletionSlice
> = (set, get) => ({
conversationMessages: [],
createChatCompletion: async (request) => {
const { messages: requestMessages, stream, ...restRequest } = request;
// Add users message in
set({
conversationMessages: get().conversationMessages.concat(requestMessages),
});
const messages = get().conversationMessages.map(
({ id, ...restMessage }) => restMessage
);
const requestData: CreateChatCompletionRequest = {
messages,
stream,
...restRequest,
model: get().selectedModelId,
};
if (stream) {
openAi.createChatCompletion(requestData, {
onDownloadProgress(event: ProgressEvent) {
const target = event.target as XMLHttpRequest;
const newUpdates = target.responseText
.replace("data: [DONE]", "")
.trim()
.split("data: ")
.filter(Boolean);
let id = "";
const newUpdatesParsed: string[] = newUpdates.map((update) => {
const parsed = JSON.parse(update);
id = parsed.id;
return parsed.choices[0].delta?.content || "";
});
const newUpdatesJoined = newUpdatesParsed.join("");
const existingMessages = get().conversationMessages.map((x) => x);
const existingMessageIndex = existingMessages.findLastIndex(
(x) => x.role === "assistant" && x.id === id
);
if (existingMessageIndex !== -1) {
// replace (rather than mutate) the message object so state updates immutably
existingMessages[existingMessageIndex] = {
...existingMessages[existingMessageIndex],
content: newUpdatesJoined,
};
set({
conversationMessages: existingMessages,
});
} else {
set({
conversationMessages: existingMessages.concat([
{
role: "assistant",
content: newUpdatesJoined,
id,
},
]),
});
}
},
});
} else {
const response = await openAi.createChatCompletion(requestData);
if (response.data) {
const newMessages: CreateChatCompletionRequest["messages"] =
response.data.choices
.filter((x) => x.message !== undefined)
.map((x) => ({
role: x.message!.role,
content: x.message!.content,
}));
set({
conversationMessages: get().conversationMessages.concat(newMessages),
});
}
}
},
resetConversation: () => {
set({ conversationMessages: [] });
},
});
@gfortaine we actually use @microsoft/fetch-event-source for the playground to do streaming with POST +1
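For anyone wanting to try that route, here's a sketch of the shape such a call can take with @microsoft/fetch-event-source. The tokenFromEvent parser is a hypothetical helper of mine, and appendToOutput is a placeholder for your own sink:

```javascript
// Pure helper (illustrative): turn one SSE data payload into a content token.
const tokenFromEvent = (data) => {
  if (data === '[DONE]') return null; // OpenAI's end-of-stream sentinel
  const parsed = JSON.parse(data);
  return parsed.choices?.[0]?.delta?.content ?? '';
};

// Sketch using @microsoft/fetch-event-source (not run here):
// import { fetchEventSource } from '@microsoft/fetch-event-source';
// await fetchEventSource('https://api.openai.com/v1/chat/completions', {
//   method: 'POST',
//   headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${apiKey}` },
//   body: JSON.stringify({ model, messages, stream: true }),
//   onmessage(ev) {
//     const token = tokenFromEvent(ev.data);
//     if (token) appendToOutput(token); // appendToOutput is your own sink
//   },
// });
```

Unlike the browser's native EventSource, this library supports POST bodies and custom headers, which is why it fits the chat completions endpoint.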
Thank you all for sharing your solutions here! I agree that @smervs solution currently looks like the best option available for the openai-node package. Here's a more complete example with proper error handling and no extra dependencies:
try {
  const res = await openai.createCompletion({
    model: "text-davinci-002",
    prompt: "It was the best of times",
    max_tokens: 100,
    temperature: 0,
    stream: true,
  }, { responseType: 'stream' });
  res.data.on('data', data => {
    const lines = data.toString().split('\n').filter(line => line.trim() !== '');
    for (const line of lines) {
      const message = line.replace(/^data: /, '');
      if (message === '[DONE]') {
        return; // Stream finished
      }
      try {
        const parsed = JSON.parse(message);
        console.log(parsed.choices[0].text);
      } catch (error) {
        console.error('Could not JSON parse stream message', message, error);
      }
    }
  });
} catch (error) {
  if (error.response?.status) {
    console.error(error.response.status, error.message);
    error.response.data.on('data', data => {
      const message = data.toString();
      try {
        const parsed = JSON.parse(message);
        console.error('An error occurred during OpenAI request: ', parsed);
      } catch (error) {
        console.error('An error occurred during OpenAI request: ', message);
      }
    });
  } else {
    console.error('An error occurred during OpenAI request', error);
  }
}
This could probably be refactored into a streamCompletion helper function (that uses either callbacks or ES6 generators to emit new messages). Apologies there's not an easier way to do this within the SDK itself; the team will continue evaluating how to get this added natively, despite the lack of support in the current SDK generator tool we're using.
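A sketch of what that streamCompletion helper could look like as an ES6 async generator, assuming it receives any async iterable of chunks (such as the axios response stream). The partial-line buffering is my own addition, since chunk boundaries don't always align with SSE messages:

```javascript
// Sketch of a streamCompletion helper: consumes an async iterable of
// Buffer/string chunks (e.g. res.data from axios with responseType:
// 'stream') and yields parsed text tokens as they arrive.
async function* streamCompletion(stream) {
  let buffer = '';
  for await (const chunk of stream) {
    buffer += chunk.toString('utf8');
    const lines = buffer.split('\n');
    buffer = lines.pop(); // keep any partial line for the next chunk
    for (const line of lines) {
      const message = line.replace(/^data: /, '').trim();
      if (!message) continue;
      if (message === '[DONE]') return; // stream finished
      try {
        const parsed = JSON.parse(message);
        // handle both completion ("text") and chat ("delta.content") shapes
        const token = parsed.choices?.[0]?.text ?? parsed.choices?.[0]?.delta?.content;
        if (token) yield token;
      } catch (error) {
        console.error('Could not JSON parse stream message', message, error);
      }
    }
  }
}
```

Usage would then be a simple `for await (const token of streamCompletion(res.data)) { ... }` loop.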
Thank you @schnerd. I don't quite understand the limitations of this solution. It uses @microsoft/fetch-event-source, which, as far as I understand, encodes the information in the URL and doesn't do a POST. The library page mentions a limitation of 2000 characters in most browsers, but it's not clear to me whether this also affects Node.js. I'd appreciate any clarification on the limitations of this solution.
@rodrigoGA Can you tell me the npm package name for openai?
@justinmahar This is a great solution, thanks. I'm testing it on Node, but it seems to strip out the triple backticks (```) when code is generated in the AI response. Is there any way to keep those in? I use them to format the code. Thanks!
@ckarsan what do we need to pass in const axiosConfig = { // ... }; ?
@juzarantri I think it's just optional parameters; I left it out entirely.
okay bro
I'm a bit lost as to how to actually use stream: true in this library. Example incorrect syntax: