Closed · stippi2 closed this issue 1 year ago
This is just to confirm that my version of the reducer is not correct. I just saw an instance where strings were concatenated that should not have been...
We should probably just delete this example; any reason not to use .stream() or .runTools instead?
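(For reference, a minimal sketch of the runTools helper along the lines of the README example; the model name and the getWeather tool here are illustrative, not from this thread:)

import OpenAI from 'openai';

const client = new OpenAI();

// Illustrative tool implementation; any plain async function works.
async function getWeather({ city }) {
  return `It is sunny in ${city}.`;
}

// runTools drives the tool-call loop: it parses the streamed arguments,
// invokes the matching function, feeds the result back to the model, and
// repeats until the model produces a final message.
const runner = client.beta.chat.completions
  .runTools({
    model: 'gpt-4-1106-preview',
    messages: [{ role: 'user', content: 'How is the weather in Berlin?' }],
    tools: [
      {
        type: 'function',
        function: {
          function: getWeather,
          parse: JSON.parse, // how to parse the streamed arguments string
          description: 'Get the weather for a given city',
          parameters: {
            type: 'object',
            properties: { city: { type: 'string' } },
          },
        },
      },
    ],
  })
  .on('message', (message) => console.log(message));

console.log(await runner.finalContent());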
If it's easy to use and I should use it, then the only reason I don't is that I wasn't aware of it. Is it documented somewhere? I was very glad to find the original examples after stumbling through various search hits showing outdated or incomplete information. I usually find the OpenAI API docs pretty well done, but is there also a place where the TS client in particular is documented with regard to its convenience functions?
Never mind. I see there is lots of documentation and examples directly in the README.md. No idea how I missed that.
Thanks for the feedback @stippi2 – glad you found it, but we'll look into making this more prominent in the docs!
Could you share more about how these were useful to you, so we can understand how to prioritize taking them out of beta?
For some reason, I always referred to the platform.openai.com docs. The streaming topic is touched on there only briefly, and there is nothing about using it in combination with function calling. I assumed that was simply because it is all so new right now. No idea why it didn't occur to me to check whether this client library has its own docs.
Last week I had some spare time to finally build a voice-activated assistant which runs in the browser. The code can be found here. It became pretty clear early on that I needed to use streaming, or else responses would be delayed far too long. Then I also wanted to enable function calling and struggled to find docs on using it together with streaming.
To be honest, your beta API has everything I needed. It worked like a charm. Thanks so much for providing such great software!
Thanks!
Cc @logankilpatrick
@stippi2 from looking at your code, it seems that it'd be quite helpful if we published docs on how to directly stream audio from .speech() to play from the browser, is that right? (I tried to play with this yesterday and it seemed very hard to find ways to do so.)
I think if you just want to play the generated audio, it isn't so hard, especially since ChatGPT will just tell you how. It was a bit tricky to get the audio to play before a longer text message finished streaming. My next challenge will be allowing the speech output to be interrupted via voice...
Ah sorry, yes, I meant "play as soon as the stream starts, rather than waiting til it ends". Did you see how to do that?
Your sentence-based workaround seemed clever, but I still imagine it'd be nicer not to have to do that for a responsive experience, and instead have a single request which you can .abort() once the user tries to interrupt.
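(For reference, one way to start playback while the .speech() response is still streaming is Media Source Extensions; a minimal sketch, assuming a browser-side client, an mp3 response, and a browser whose MediaSource supports 'audio/mpeg' (Chromium does). Autoplay policies typically require a prior user gesture:)

// Sketch: play TTS audio as it arrives, via Media Source Extensions.
async function playSpeechStreaming(client, text) {
  const response = await client.audio.speech.create({
    model: 'tts-1',
    voice: 'alloy',
    input: text,
    response_format: 'mp3',
  });

  const mediaSource = new MediaSource();
  const audio = new Audio(URL.createObjectURL(mediaSource));
  await new Promise((resolve) =>
    mediaSource.addEventListener('sourceopen', resolve, { once: true }),
  );
  audio.play(); // requires a prior user gesture in most browsers

  const sourceBuffer = mediaSource.addSourceBuffer('audio/mpeg');
  const reader = response.body.getReader();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    sourceBuffer.appendBuffer(value);
    // appendBuffer is async; wait before appending the next chunk.
    await new Promise((resolve) =>
      sourceBuffer.addEventListener('updateend', resolve, { once: true }),
    );
  }
  mediaSource.endOfStream();
}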
If we can assume that you want either a spoken reply or a written one, it would be nice to be able to specify that as part of the API. I'm not sure whether it is technically possible for OpenAI to have the GPT model's response available much faster on the server side, such that it becomes feasible to stream it into the TTS model and deliver everything to the client with reduced latency. It seemed to me that the text is already streamed to the client as fast as possible, and there isn't much you can do on the server side. But of course it would be convenient to have an audio response streamed instead of the text, and then receive the final message as soon as the audio finishes. (Or is interrupted.) Is that what you mean?
Yes, that would be an awesome API feature :)
I just meant that if today, you hit .speech() with a large block of text, you'd probably like to be able to start playing the audio before the response completes.
But, I see that until the API feature you requested is available, users like yourself will probably still want to do a chunking-by-sentence kind of strategy, so you can start requesting TTS before the LLM is done returning the entire message. So the gain would be pretty minimal compared to what you've done (since the speech endpoint is pretty fast for a single sentence, and the rest can be buffered by the developer).
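(A rough sketch of that chunking-by-sentence strategy; speakSentence is a hypothetical helper that calls the speech endpoint and queues playback, not part of the library:)

// Buffer the streamed completion text and hand each complete sentence to
// TTS as soon as it is available, instead of waiting for the full message.
async function streamAndSpeak(client, messages) {
  const stream = await client.chat.completions.create({
    model: 'gpt-4-1106-preview',
    messages,
    stream: true,
  });

  let buffer = '';
  for await (const chunk of stream) {
    buffer += chunk.choices[0]?.delta?.content ?? '';
    // Naive sentence boundary: punctuation followed by whitespace.
    const match = buffer.match(/^([\s\S]*?[.!?])\s+([\s\S]*)$/);
    if (match) {
      // Awaiting here is simplistic; a real implementation would queue
      // playback so the completion stream is never blocked.
      await speakSentence(match[1]); // hypothetical TTS + playback helper
      buffer = match[2];
    }
  }
  if (buffer.trim()) await speakSentence(buffer);
}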
@stippi2 I realize this is a bit late, but there seems to be a misunderstanding about the use of .index in a chunk. I've updated the messageReducer function, which appears in the library's example, so that it now handles .tool_calls.
function messageReducer(previous, item) {
  const reduce = (acc, delta) => {
    acc = { ...acc };
    for (const [key, value] of Object.entries(delta)) {
      if (acc[key] === undefined || acc[key] === null) {
        // First chunk for this key: take the value as-is.
        acc[key] = value;
        if (Array.isArray(acc[key])) {
          // The assembled ChatCompletion message's tool_calls entries do not
          // contain an index property, so strip it from the initial chunk.
          for (const arr of acc[key]) {
            delete arr.index;
          }
        }
      } else if (typeof acc[key] === 'string' && typeof value === 'string') {
        // Streamed string fields (e.g. function arguments) arrive in pieces
        // and must be concatenated.
        acc[key] += value;
      } else if (typeof acc[key] === 'number' && typeof value === 'number') {
        acc[key] = value;
      } else if (Array.isArray(acc[key]) && Array.isArray(value)) {
        // Each tool_calls delta carries an index telling us which entry of
        // the accumulated array it belongs to; merge it into that entry.
        const accArray = acc[key];
        for (let i = 0; i < value.length; i++) {
          const { index, ...chunkTool } = value[i];
          // A delta may append at most one new entry (index === accArray.length);
          // anything beyond that would leave a hole in the array.
          if (index - accArray.length > 0) {
            throw new Error(`tool_calls index ${index} skips ahead of the ${accArray.length} accumulated entries`);
          }
          accArray[index] = reduce(accArray[index], chunkTool);
        }
      } else if (typeof acc[key] === 'object' && typeof value === 'object') {
        acc[key] = reduce(acc[key], value);
      }
    }
    return acc;
  };
  return reduce(previous, item.choices[0].delta);
}
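(For context, a minimal sketch of how the reducer is driven over a streamed completion; assumes client, messages, and tools are already defined:)

const stream = await client.chat.completions.create({
  model: 'gpt-4-1106-preview',
  messages,
  tools,
  stream: true,
});

let message = {};
for await (const chunk of stream) {
  message = messageReducer(message, chunk);
}
// message.tool_calls now holds the fully assembled tool calls.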
Test cases (Jest; assumes messageReducer from above is in scope):
describe("Test messageReducer", () => {
it("test init", () => {
let responseMessage = {}
const chunk = {
id: 'chatcmpl-8NaQUv',
object: 'chat.completion.chunk',
created: 1700632258,
model: 'gpt-4-0613',
choices: [
{
index: 0,
delta: {
tool_calls: [ { index: 0, id: 'id0', type: 'function', function: { arguments: '{\n' } } ]
},
finish_reason: null
}
]
}
const y = {
tool_calls: [ { id: 'id0', type: 'function', function: { arguments: '{\n' } } ]
}
responseMessage = messageReducer(responseMessage, chunk)
expect(responseMessage).toEqual(y)
})
it("test reduce into existing message with one tool", () => {
let responseMessage = {
role: 'assistant',
content: null,
tool_calls: [
{
id: 'call_fpe',
type: 'function',
function: { name: 'getWeather', arguments: '' }
}
]
}
const chunk = {
id: 'chatcmpl-8NaQUv',
object: 'chat.completion.chunk',
created: 1700632258,
model: 'gpt-4-0613',
choices: [
{
index: 0,
delta: {
tool_calls: [ { index: 0, function: { arguments: '{\n' } } ]
},
finish_reason: null
}
]
}
const y = {
role: 'assistant',
content: null,
tool_calls: [
{
id: 'call_fpe',
type: 'function',
function: { name: 'getWeather', arguments: '{\n' }
}
]
}
responseMessage = messageReducer(responseMessage, chunk)
expect(responseMessage).toEqual(y)
})
it("test reduce. add element", () => {
let responseMessage = {
role: 'assistant',
content: null,
tool_calls: [
{
id: 'id0',
type: 'function',
function: { name: 'getWeather', arguments: 'abc' }
},
]
}
const chunk = {
id: 'chatcmpl-8NaQUv',
object: 'chat.completion.chunk',
created: 1700632258,
model: 'gpt-4-0613',
choices: [
{
index: 1,
delta: {
tool_calls: [ { index: 1, id: 'id1', type: 'function', function: {name: 'getWeather1', arguments: 'def' } } ]
},
finish_reason: null
}
]
}
const y = {
role: 'assistant',
content: null,
tool_calls: [
{
id: 'id0',
type: 'function',
function: { name: 'getWeather', arguments: 'abc' }
},
{
id: 'id1',
type: 'function',
function: { name: 'getWeather1', arguments: 'def' }
}
]
}
responseMessage = messageReducer(responseMessage, chunk)
expect(responseMessage).toEqual(y)
})
it("test reduce with existing two functions", () => {
let responseMessage = {
role: 'assistant',
content: null,
tool_calls: [
{
id: 'id0',
type: 'function',
function: { name: 'getWeather', arguments: 'abc' }
},
{
id: 'id1',
type: 'function',
function: { name: 'getWeather1', arguments: 'd' }
}
]
}
const chunk = {
id: 'chatcmpl-8NaQUv',
object: 'chat.completion.chunk',
created: 1700632258,
model: 'gpt-4-0613',
choices: [
{
index: 1,
delta: {
tool_calls: [ { index: 1, function: { arguments: 'ef' } } ]
},
finish_reason: null
}
]
}
const y = {
role: 'assistant',
content: null,
tool_calls: [
{
id: 'id0',
type: 'function',
function: { name: 'getWeather', arguments: 'abc' }
},
{
id: 'id1',
type: 'function',
function: { name: 'getWeather1', arguments: 'def' }
}
]
}
responseMessage = messageReducer(responseMessage, chunk)
expect(responseMessage).toEqual(y)
})
})
I will catch up with the latest library update.
@nwatab would you mind submitting a PR?
Sure, I'll do it!
@nwatab thanks for clearing up how it's supposed to work. Will come in handy!
Confirm this is a Node library issue and not an underlying OpenAI API issue
Describe the bug
I have used the messageReducer from the examples. After the streaming of chunks finished and I had the assembled message, I saw that the arguments property was always empty. I am testing the new GPT-4 model which supports multiple function calls. After hours of debugging, I finally realized that the problem was the reducer function, which didn't support arrays. This is how my reducer function currently looks, but I am not sure if it is correct. It would never add additional items from the delta:
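(The snippet itself is missing from this report. For illustration only, not the reporter's lost code: the example's reducer at the time had no branch for arrays, so arrays fell through to the generic object merge, which is the kind of bug being described. A reconstruction derived from the fixed version above:)

// Illustrative reconstruction of an array-unaware reducer; NOT the
// reporter's actual snippet.
function arrayUnawareMessageReducer(previous, item) {
  const reduce = (acc, delta) => {
    acc = { ...acc };
    for (const [key, value] of Object.entries(delta)) {
      if (acc[key] === undefined || acc[key] === null) {
        acc[key] = value;
      } else if (typeof acc[key] === 'string' && typeof value === 'string') {
        acc[key] += value;
      } else if (typeof acc[key] === 'object' && typeof value === 'object') {
        // Arrays land here too and get merged like plain objects, so a
        // second tool_calls entry is never appended; its fields are smeared
        // over the first entry instead (e.g. ids and arguments of different
        // calls get concatenated).
        acc[key] = reduce(acc[key], value);
      }
    }
    return acc;
  };
  return reduce(previous, item.choices[0].delta);
}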
To Reproduce
Use the new gpt-4-1106-preview model, which returns a tool_calls array, in combination with streaming, while accumulating the message using the reducer function from the examples.
Code snippets
No response
OS
macOS
Node version
Node v20.7.0
Library version
openai 4.17.4