Open mve opened 6 months ago
To address the issue where not specifying a model when creating an instance of `AzureOpenAI` leads to unexpected behavior due to a mismatch in context size, it's crucial to explicitly specify the model name during instantiation. This ensures that the context size matches the model configured in your Azure deployment, preventing errors like the one you encountered. Here's how you can specify the model name:
```python
from llama_index.llms.azure_openai import AzureOpenAI

azure_engine = AzureOpenAI(
    model="<your-model-name>",  # Replace <your-model-name> with your actual model name
    engine="<your-deployment-name>",
    # other parameters as needed
)
```
This approach ensures that the `AzureOpenAI` class uses the correct model, aligning the context size with the model's capabilities and avoiding the `ValueError` related to token limits.
For dynamic handling based on the model's context size, you can use a function to determine the model's maximum token limit and adjust your application's behavior accordingly. This involves checking the input against the model's token limit before proceeding with operations that might exceed this limit. Here's a simplified example to illustrate this concept:
```python
def get_model_context_size(model_name):
    # Logic to determine the model's context size.
    # Placeholder for actual implementation; return None if the model is unknown.
    return None

def ensure_input_within_limit(input_text, model_name):
    context_size = get_model_context_size(model_name)
    if context_size is None:
        # Handle the case where the model's context size is unknown.
        return False
    token_count = count_tokens(input_text)  # Placeholder for token counting logic
    if token_count > context_size:
        raise ValueError("Input exceeds the token limit for the specified model.")
    return True
```
This method requires implementing a `count_tokens` function tailored to your input text's tokenization needs. By dynamically adjusting to the model's token limit, you can prevent errors related to exceeding the model's context size.
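As an illustration only (not part of the snippet above), `count_tokens` could be implemented with the `tiktoken` library; the fallback to the `cl100k_base` encoding is an assumption for model names that `tiktoken` does not recognize:

```python
import tiktoken

def count_tokens(input_text, model_name="gpt-3.5-turbo"):
    # Pick the tokenizer that matches the model; fall back to a
    # general-purpose encoding if the model name is not recognized.
    try:
        encoding = tiktoken.encoding_for_model(model_name)
    except KeyError:
        encoding = tiktoken.get_encoding("cl100k_base")  # assumed reasonable default
    return len(encoding.encode(input_text))
```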
Implementing these strategies ensures that your use of the `AzureOpenAI` class is aligned with the specific model you've configured in Azure, mitigating unexpected behavior and errors.
To continue the conversation, mention @dosu.
It might be helpful to change the comment to something like:
- model: Name of the model (e.g. `text-davinci-003`)
This is used to determine the context length, whether it's a chat model, and whether it supports function calling.
Additionally, I think it's challenging for users to correctly get the full model name, as Azure splits it into two fields, model name and model version. The models listed in the Azure docs lack standardized, unified names, which might confuse users.
For example, when viewing the table below (see GPT-3.5 models for the full table):

| Model ID | Max Request (tokens) | Training Data (up to) |
|---|---|---|
| gpt-35-turbo (0301) | 4,096 | Sep 2021 |
| gpt-35-turbo (0613) | 4,096 | Sep 2021 |
| ... | ... | ... |
| gpt-35-turbo (1106) | Input: 16,385, Output: 4,096 | Sep 2021 |
I'm not sure which model the default name `"gpt-35-turbo"` refers to. Likewise, I am uncertain which name, `"gpt-35-turbo"` or `"gpt-35-turbo-1106"`, should be used if I want to use the model gpt-35-turbo (1106).
I think it might be easier and more intuitive if we could just configure settings such as `context_window`, `is_chat_model`, and `is_function_calling_model` directly.
To address the concerns about model name mismatches and their impact on context size and other settings when using AzureOpenAI with the LlamaIndex framework, it's crucial to ensure that the `engine` parameter accurately reflects your Azure OpenAI deployment name. This parameter is key to aligning the model's behavior with your expectations, particularly regarding context size and endpoint selection.
When initializing the `AzureOpenAIMultiModal` class, you should specify both the `model` and `engine` parameters. The `model` parameter is used to decide whether to use the completion or chat endpoint, while the `engine` parameter should match your Azure OpenAI deployment name to ensure correct model behavior:
```python
from llama_index.multi_modal_llms.azure_openai.base import AzureOpenAIMultiModal

llm = AzureOpenAIMultiModal(
    model="YOUR_MODEL_NAME",  # Determines completion vs. chat endpoint.
    engine="YOUR_AZURE_OPENAI_DEPLOYMENT_NAME",  # Should match your Azure deployment.
    api_key=aoai_api_key,  # Your Azure OpenAI API key.
    azure_endpoint=aoai_endpoint,  # e.g. "https://<resource-name>.openai.azure.com/"
    api_version=aoai_api_version,
    use_azure_ad=False,  # Set to True to authenticate with Microsoft Entra ID tokens.
)
```
This setup helps prevent issues like the `ValueError: Initial token count exceeds token limit` error, which occurs when the input exceeds the model's token limit. It's also worth noting that direct configuration of settings such as `context_window`, `is_chat_model`, and `is_function_calling_model` is possible and can be adjusted to fit specific use cases within the LlamaIndex framework.
By ensuring the `engine` parameter correctly matches your Azure deployment and adjusting settings directly as needed, you can mitigate potential mismatches and utilize AzureOpenAI more effectively within the LlamaIndex framework.
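As one hedged illustration of overriding the context window yourself rather than relying on the model-name lookup, recent llama-index versions expose global defaults on `Settings`. The model and deployment names below are hypothetical, and you should verify that `Settings.context_window` is available in your installed version:

```python
from llama_index.core import Settings
from llama_index.llms.azure_openai import AzureOpenAI

# Assumption: llama-index >= 0.10, where global defaults live on `Settings`.
Settings.llm = AzureOpenAI(
    model="gpt-35-turbo-16k",       # hypothetical model name for illustration
    engine="my-gpt35-deployment",   # hypothetical deployment name
    api_key="...",
    azure_endpoint="https://<resource-name>.openai.azure.com/",
    api_version="2023-07-01-preview",
)

# Override the context window used when building prompts, instead of relying
# on the value inferred from the model name.
Settings.context_window = 16384
```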
To continue the conversation, mention @dosu.
Yes, the naming of models in Azure can be a bit tricky. But at the same time, wouldn't it be even more tedious for people to find the correct context windows for different models? Finding the correct context window would still involve users figuring out which model they are using. I'm also not sure if changing the approach for AzureOpenAI vs. OpenAI is a good idea; it would make those two implementations work quite differently, right?
For those reasons I personally think it would be better to stick with using the model name. It just needs to be documented better, and in my opinion the default model value needs to be removed to prevent accidentally selecting the wrong model.
Feel free to make a PR to the azure class @mve
Bug Description
When no model name is specified when creating an instance of AzureOpenAI, a default is chosen which may differ from the model configured in the deployment in Azure. This can cause unexpected behaviour, as the context size is chosen based on the model name.
I think model should not have a default value, and if none is entered it should show a clear error stating that the model must be provided and that it must be the same name as the one selected in the Azure deployment.
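As a rough sketch of what that check could look like (this is not the actual class code, just an illustration of the proposed behaviour):

```python
from typing import Optional

def _validate_model(model: Optional[str]) -> str:
    # Proposed behaviour: no default model; fail loudly instead of silently
    # falling back to "gpt-35-turbo" and its 4,096-token context window.
    if not model:
        raise ValueError(
            "`model` must be provided and must match the model selected "
            "in your Azure OpenAI deployment."
        )
    return model
```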
Also inside the class there is the following comment:
I believe this is not accurate as it is also used for the context size.
Version
0.10.31
Steps to Reproduce
Create an instance of AzureOpenAI like so:
If no model is specified, the default of "gpt-35-turbo" is used. If your deployment uses a different model, this can cause unexpected behaviour, as the context size will be set to 4096 even though you may have selected a model in Azure that has a larger context size.
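For illustration, an instantiation without a `model` argument (with hypothetical deployment and credential values) looks roughly like this:

```python
from llama_index.llms.azure_openai import AzureOpenAI

# No `model` argument is passed, so the default "gpt-35-turbo" (4,096-token
# context window) is assumed, even if the deployment behind `engine` actually
# runs a model with a larger context window.
llm = AzureOpenAI(
    engine="my-gpt4-deployment",  # hypothetical deployment name
    api_key="...",
    azure_endpoint="https://<your-resource-name>.openai.azure.com/",
    api_version="2023-07-01-preview",  # example API version
)
```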
I think this context size gets chosen based on the model name in this code:
Relevant Logs/Tracebacks