spsingh559 opened 3 months ago
I have also experienced that `ChatOpenAI` (and consequently `AzureChatOpenAI`) ignores `model_kwargs={"stream": False}` and still uses streaming under the hood when calling the OpenAI API.
The solution in my scenario was to turn off the `parallel_tool_calls` option:

```python
# bind() returns a new runnable, so reassign (or use the result directly).
llm = llm.bind(parallel_tool_calls=False)
```
It looks like the `parallel_tool_calls` option requires streaming at the OpenAI API layer, and this (and possibly other) option dependency can override your intention to turn off streaming.
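For context, here is a minimal sketch of how the workaround fits into a tool-calling setup. The `add` tool and the prompt are made-up illustrations rather than code from this issue, and it assumes `llm` is an `AzureChatOpenAI` instance like the one in the example code below:

```python
from langchain_core.tools import tool

@tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b

# Recent langchain-openai releases also accept parallel_tool_calls on
# bind_tools; setting it to False removes the option that was forcing
# streaming on the underlying request.
llm_with_tools = llm.bind_tools([add], parallel_tool_calls=False)
response = llm_with_tools.invoke("What is 2 + 3?")
print(response.tool_calls)
```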
Example Code
```python
from langchain_openai import AzureChatOpenAI

llm = AzureChatOpenAI(
    api_key=OPENAI_KEY,
    azure_endpoint=OPENAI_URL,
    openai_api_version=openai_api_version,  # type: ignore
    azure_deployment=azure_deployment,
    temperature=0.5,
    verbose=True,
    model_kwargs={"stream": False},  # {"top_p": 0.1}
)
```
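Applying the workaround described above to this instance is then a one-liner (a sketch; the prompt is illustrative):

```python
# Reassign so subsequent calls use the non-parallel, non-streaming variant.
llm = llm.bind(parallel_tool_calls=False)
print(llm.invoke("Hello").content)
```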