Streaming responses don't work with Azure models. This can be fixed by handling the JSON chunks differently.
There is an issue with the first response chunk when streaming from AzureChatOpenAI: response["choices"][0] fails because the server returns an empty choices array.
This doesn't occur with the 2023-03-15-preview API version.
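Below is a minimal sketch of how the empty first chunk could be skipped, assuming the pre-1.0 openai Python SDK with dict-style streaming chunks; the endpoint, key, deployment name, and API version are placeholders, not values taken from this report.

```python
import openai

# Placeholder Azure configuration (assumptions, not values from this issue).
openai.api_type = "azure"
openai.api_base = "https://my-resource.openai.azure.com"
openai.api_version = "2023-07-01-preview"  # a version that exhibits the empty first chunk
openai.api_key = "..."

response = openai.ChatCompletion.create(
    engine="my-gpt-deployment",  # placeholder Azure deployment name
    messages=[{"role": "user", "content": "Tell me a quantum entanglement joke"}],
    stream=True,
)

for chunk in response:
    # Some Azure API versions send an initial chunk whose "choices" list is
    # empty; indexing it unconditionally raises "list index out of range".
    if not chunk["choices"]:
        continue
    delta = chunk["choices"][0].get("delta", {})
    if delta.get("content"):
        print(delta["content"], end="", flush=True)
```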
During streaming, the first chunk may contain only the name of an OpenAI function and no arguments. In this case, the current code assumes a streaming response is already being accumulated and tries to append to it, which raises a KeyError.
This can be fixed by checking whether the arguments key exists and, if it doesn't, creating a new entry instead of appending, as in the sketch below.
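A hedged sketch of that accumulation logic, again assuming the pre-1.0 openai SDK chunk shape; the collect_function_call helper and its variable names are illustrative, not the project's actual code.

```python
def collect_function_call(response):
    """Accumulate a streamed function call from chat completion chunks."""
    function_call = {}
    for chunk in response:
        # Skip the empty-choices chunks discussed above.
        if not chunk["choices"]:
            continue
        delta = chunk["choices"][0].get("delta", {})
        fc = delta.get("function_call")
        if fc is None:
            continue
        if "name" in fc:
            function_call["name"] = fc["name"]
        if "arguments" in fc:
            # Create the key on first sight instead of assuming it already
            # exists, which is what raised the KeyError described above.
            function_call["arguments"] = (
                function_call.get("arguments", "") + fc["arguments"]
            )
    return function_call
```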
Example:
llm -m azure 'Tell me a quantum entanglement joke' --system 'You are Albert Einstein' --no-stream
Why don't quantum physicists make jokes about entanglement?
Because when they do, they instantly become the subject of someone else's humor halfway around the world!
llm -m azure 'Tell me a quantum entanglement joke' --system 'You are Albert Einstein'
Error: list index out of range
Please refer to https://github.com/jerryjliu/llama_index/issues/7640