run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Feature Request]: add azure model to llama_index #16090

Open szcc opened 1 month ago

szcc commented 1 month ago

Feature Description

Hi @dosu and @logan-markewich ,

We have an 'azure' model. When I pass our model name with llm = OpenAI(temperature=0.1, model="azure-gpt-4o"), I get this error:

ValueError: Unknown model 'azure-gpt-4'. Please provide a valid OpenAI model name in: gpt-4, gpt-4-32k, gpt-4-1106-preview, gpt-4-0125-preview, gpt-4-turbo-preview, gpt-4-vision-preview, gpt-4-1106-vision-preview, gpt-4-turbo-2024-04-09, gpt-4-turbo, gpt-4o, gpt-4o-2024-05-13, gpt-4o-2024-08-06, gpt-4o-mini, gpt-4o-mini-2024-07-18, gpt-4-0613, gpt-4-32k-0613, gpt-4-0314, gpt-4-32k-0314, gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-3.5-turbo-0125, gpt-3.5-turbo-1106, gpt-3.5-turbo-0613, gpt-3.5-turbo-16k-0613, gpt-3.5-turbo-0301, text-davinci-003, text-davinci-002, gpt-3.5-turbo-instruct, text-ada-001, text-babbage-001, text-curie-001, ada, babbage, curie, davinci, gpt-35-turbo-16k, gpt-35-turbo, gpt-35-turbo-0125, gpt-35-turbo-1106, gpt-35-turbo-0613, gpt-35-turbo-16k-0613

Would it be possible for you to add models like azure-gpt-4o, azure-gpt-4-turbo-20240409, azure-gpt-4-turbo-preview, azure-gpt-4, and text-embedding-ada-002 to llama-index-core?

Reason

No response

Value of Feature

No response

dosubot[bot] commented 1 month ago

I found a similar open issue regarding adding support for the gpt-4o-mini model, which was resolved by updating the package and restarting the notebook [1]. Additionally, there are several closed issues where users requested support for various models, and the solution was to update the package and use the correct model names [2][3].

To add support for the models azure-gpt-4o, azure-gpt-4-turbo-20240409, azure-gpt-4-turbo-preview, azure-gpt-4, and text-embedding-ada-002 to llama-index-core, you need to update the relevant dictionaries in the utils.py file and ensure the models are recognized in the embedding configurations. Here is the updated code:

GPT4_MODELS: Dict[str, int] = {
    # stable model names:
    #   resolves to gpt-4-0314 before 2023-06-27,
    #   resolves to gpt-4-0613 after
    "gpt-4": 8192,
    "gpt-4-32k": 32768,
    # turbo models (Turbo, JSON mode)
    "gpt-4-1106-preview": 128000,
    "gpt-4-0125-preview": 128000,
    "gpt-4-turbo-preview": 128000,
    # multimodal model
    "gpt-4-vision-preview": 128000,
    "gpt-4-1106-vision-preview": 128000,
    "gpt-4-turbo-2024-04-09": 128000,
    "gpt-4-turbo": 128000,
    "gpt-4o": 128000,
    "gpt-4o-2024-05-13": 128000,
    "gpt-4o-2024-08-06": 128000,
    "gpt-4o-mini": 128000,
    "gpt-4o-mini-2024-07-18": 128000,
    # 0613 models (function calling):
    #   https://openai.com/blog/function-calling-and-other-api-updates
    "gpt-4-0613": 8192,
    "gpt-4-32k-0613": 32768,
    # 0314 models
    "gpt-4-0314": 8192,
    "gpt-4-32k-0314": 32768,
    # New models
    "azure-gpt-4o": 128000,
    "azure-gpt-4-turbo-20240409": 128000,
    "azure-gpt-4-turbo-preview": 128000,
    "azure-gpt-4": 8192,
}

AZURE_TURBO_MODELS: Dict[str, int] = {
    "gpt-4o": 128000,
    "gpt-35-turbo-16k": 16384,
    "gpt-35-turbo": 4096,
    # 0125 (2024) model (JSON mode)
    "gpt-35-turbo-0125": 16384,
    # 1106 model (JSON mode)
    "gpt-35-turbo-1106": 16384,
    # 0613 models (function calling):
    "gpt-35-turbo-0613": 4096,
    "gpt-35-turbo-16k-0613": 16384,
    # New models
    "azure-gpt-4o": 128000,
    "azure-gpt-4-turbo-20240409": 128000,
    "azure-gpt-4-turbo-preview": 128000,
    "azure-gpt-4": 8192,
}

GPT3_MODELS: Dict[str, int] = {
    "text-ada-001": 2049,
    "text-babbage-001": 2040,
    "text-curie-001": 2049,
    "ada": 2049,
    "babbage": 2049,
    "curie": 2049,
    "davinci": 2049,
    # New model
    "text-embedding-ada-002": 2049,
}

ALL_AVAILABLE_MODELS = {
    **O1_MODELS,
    **GPT4_MODELS,
    **TURBO_MODELS,
    **GPT3_5_MODELS,
    **GPT3_MODELS,
    **AZURE_TURBO_MODELS,
}

CHAT_MODELS = {
    **O1_MODELS,
    **GPT4_MODELS,
    **TURBO_MODELS,
    **AZURE_TURBO_MODELS,
}
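
Note that O1_MODELS, TURBO_MODELS, and GPT3_5_MODELS are defined elsewhere in the same utils.py; only the dictionaries that change are shown above. As a quick sanity check that the new names are registered, a minimal sketch (assuming the helper openai_modelname_to_contextsize is still exported from that module, as in current releases):

from llama_index.llms.openai.utils import openai_modelname_to_contextsize

# Should now return the registered context window instead of raising
# ValueError: Unknown model 'azure-gpt-4o'.
print(openai_modelname_to_contextsize("azure-gpt-4o"))  # expected: 128000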

Additionally, ensure that the text-embedding-ada-002 model is properly integrated into the embedding configurations in llama_index/embeddings/openai/base.py:

class OpenAIEmbeddingModelType(str, Enum):
    """OpenAI embedding model type."""

    DAVINCI = "davinci"
    CURIE = "curie"
    BABBAGE = "babbage"
    ADA = "ada"
    TEXT_EMBED_ADA_002 = "text-embedding-ada-002"
    TEXT_EMBED_3_LARGE = "text-embedding-3-large"
    TEXT_EMBED_3_SMALL = "text-embedding-3-small"

class OpenAIEmbeddingModeModel(str, Enum):
    """OpenAI embedding mode model."""

    # davinci
    TEXT_SIMILARITY_DAVINCI = "text-similarity-davinci-001"
    TEXT_SEARCH_DAVINCI_QUERY = "text-search-davinci-query-001"
    TEXT_SEARCH_DAVINCI_DOC = "text-search-davinci-doc-001"

    # curie
    TEXT_SIMILARITY_CURIE = "text-similarity-curie-001"
    TEXT_SEARCH_CURIE_QUERY = "text-search-curie-query-001"
    TEXT_SEARCH_CURIE_DOC = "text-search-curie-doc-001"

    # babbage
    TEXT_SIMILARITY_BABBAGE = "text-similarity-babbage-001"
    TEXT_SEARCH_BABBAGE_QUERY = "text-search-babbage-query-001"
    TEXT_SEARCH_BABBAGE_DOC = "text-search-babbage-doc-001"

    # ada
    TEXT_SIMILARITY_ADA = "text-similarity-ada-001"
    TEXT_SEARCH_ADA_QUERY = "text-search-ada-query-001"
    TEXT_SEARCH_ADA_DOC = "text-search-ada-doc-001"

    # text-embedding-ada-002
    TEXT_EMBED_ADA_002 = "text-embedding-ada-002"

    # text-embedding-3-large
    TEXT_EMBED_3_LARGE = "text-embedding-3-large"

    # text-embedding-3-small
    TEXT_EMBED_3_SMALL = "text-embedding-3-small"

This code adds the specified models to the GPT4_MODELS, AZURE_TURBO_MODELS, and GPT3_MODELS dictionaries, and ensures they are included in the ALL_AVAILABLE_MODELS and CHAT_MODELS dictionaries. It also integrates the text-embedding-ada-002 model into the embedding configurations [4][5].

nightosong commented 1 month ago

If your models are deployed on Azure, you can try:

from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
from llama_index.llms.azure_openai import AzureOpenAI

llm = AzureOpenAI(
    api_key="",
    api_version="",
    azure_endpoint="",
    azure_deployment=""
)
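
The embedding import above can be wired up the same way; a minimal sketch in the same placeholder style (deployment_name would be your Azure deployment of an embedding model such as text-embedding-ada-002):

embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",  # underlying model name
    deployment_name="",              # your Azure deployment name
    api_key="",
    api_version="",
    azure_endpoint=""
)
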
szcc commented 1 month ago

I did not see llama_index.embeddings.azure or llama_index.llms.azure. My llama-index version is 0.11.10.

logan-markewich commented 1 month ago

@szcc These model names are not listed in Azure's documentation: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models

You should specify gpt-4o or gpt-4, etc., as the error states.
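
In other words, with the stock AzureOpenAI class the model argument should be a standard OpenAI model name, while the Azure deployment name goes in engine. A minimal sketch, assuming (hypothetically) a deployment named azure-gpt-4o backed by gpt-4o:

from llama_index.llms.azure_openai import AzureOpenAI

llm = AzureOpenAI(
    model="gpt-4o",         # standard model name, passes validation
    engine="azure-gpt-4o",  # your Azure deployment name
    api_key="YOUR_AZURE_OPENAI_API_KEY",
    azure_endpoint="YOUR_AZURE_OPENAI_ENDPOINT",
    api_version="2023-07-01-preview"
)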

szcc commented 1 month ago

Our private LLM is named azure-gpt-4o, etc. My key is associated with our base URL endpoint; if I use gpt-4o then I get an invalid key error. I am working on adding my model name to the util files as @dosu suggested. Fingers crossed.

dosubot[bot] commented 1 month ago

To add support for your custom Azure OpenAI models like azure-gpt-4o to llama_index, you can extend the AzureOpenAIMultiModal class and specify the custom model name in the model parameter. Here is an example of how you can do this:

from llama_index.legacy.multi_modal_llms.azure_openai import AzureOpenAIMultiModal

class CustomAzureOpenAI(AzureOpenAIMultiModal):
    def __init__(self, **kwargs):
        super().__init__(model="azure-gpt-4o", **kwargs)

# Example usage
aoai_api_key = "YOUR_AZURE_OPENAI_API_KEY"
aoai_endpoint = "YOUR_AZURE_OPENAI_ENDPOINT"
aoai_api_version = "2023-07-01-preview"

llm = CustomAzureOpenAI(
    deployment_name="YOUR_AZURE_OPENAI_DEPLOYMENT_NAME",
    api_key=aoai_api_key,
    azure_endpoint=aoai_endpoint,
    api_version=aoai_api_version,
)

Additionally, ensure you have the necessary packages installed:

%pip install llama-index-embeddings-azure-openai
%pip install llama-index-llms-azure-openai

This code defines a new class CustomAzureOpenAI that inherits from AzureOpenAIMultiModal and sets the model parameter to azure-gpt-4o. You can then use this class to create an instance of the custom model. Make sure to also deploy your own embedding model and pass the deployment names as parameters when initializing AzureOpenAIMultiModal [1][2][3].

If your private LLM name is azure-gpt-4o and using gpt-4o results in an invalid key error, updating the utility files as suggested should help. Ensure that your custom model names are correctly added to the relevant dictionaries in utils.py and that they are recognized in the embedding configurations.

Additionally, make sure you have the latest version of llama_index installed. If you still encounter issues, you might need to check if there are any specific configurations or additional parameters required by your private deployment on Azure. Good luck!

szcc commented 1 month ago

I used CustomAzureOpenAI and got this error: AttributeError: 'CustomAzureOpenAI' object has no attribute 'predict'

szcc commented 1 month ago

@dosu I updated the utils.py file under python3.8/site-packages/llama_index/llms/openai/ and made sure that the text-embedding-ada-002 model was added to the embedding configurations in python3.8/site-packages/llama_index/embeddings/openai/base.py, as you suggested. It works now. Thanks a lot! Should I check the modified files in to GitHub? Otherwise, each time I update to a new version of llama_index, my updates will be erased.
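
As an aside, a runtime patch from your own code avoids editing site-packages and survives upgrades. A sketch, assuming the dictionary names in llama_index.llms.openai.utils stay as shown earlier; ALL_AVAILABLE_MODELS and CHAT_MODELS are merged at import time, so they must be patched directly:

from llama_index.llms.openai import utils as openai_utils

# Register the private deployment names with their context windows before
# constructing any OpenAI/AzureOpenAI objects.
for name, ctx in {"azure-gpt-4o": 128000, "azure-gpt-4": 8192}.items():
    openai_utils.GPT4_MODELS[name] = ctx
    openai_utils.ALL_AVAILABLE_MODELS[name] = ctx
    openai_utils.CHAT_MODELS[name] = ctx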

dosubot[bot] commented 1 month ago

To resolve the AttributeError: 'CustomAzureOpenAI' object has no attribute 'predict', you need to ensure that the CustomAzureOpenAI class (which seems to be a subclass of AzureOpenAIMultiModal) has a predict method implemented. Based on the provided code, the AzureOpenAIMultiModal class does not have a predict method. You will need to add this method to the class.

Here is an example of how you might add a predict method to the AzureOpenAIMultiModal class:

class AzureOpenAIMultiModal(OpenAIMultiModal):
    # ... existing code ...

    def predict(self, prompt: str) -> str:
        client = self._get_client()
        response = client.Completion.create(
            engine=self.engine,
            prompt=prompt,
            max_tokens=self.max_new_tokens,
            temperature=self.temperature,
            **self.additional_kwargs
        )
        return response.choices[0].text

    # ... existing code ...

This predict method uses the Completion.create method of the SyncAzureOpenAI client to generate a response for the provided prompt. Adjust the parameters and method calls as necessary to fit your specific use case and the capabilities of the SyncAzureOpenAI client [1]. (Note that Completion.create is the legacy, pre-1.0 openai SDK surface; the v1 client exposes chat.completions.create instead, which is where this thread eventually lands.)

Regarding your question about checking in the modified files to GitHub, it is generally a good practice to contribute your changes back to the repository if they are useful for others. You can create a pull request with your changes. This way, your updates can be reviewed and potentially merged into the main codebase, ensuring that they persist across future updates.

szcc commented 1 month ago

I'll try to add predict() to the AzureOpenAIMultiModal class. Thank you! Also, I may open a pull request with my change, since "azure-gpt-4o" is widely used by others. Thanks @dosu

dosubot[bot] commented 1 month ago

You're welcome! Adding the predict method to the AzureOpenAIMultiModal class should help resolve the issue. Contributing your changes via a pull request is a great idea, especially if the "azure-gpt-4o" model is widely used. This way, others can benefit from your work, and it ensures your updates are maintained in future versions. Good luck, and feel free to reach out if you need further assistance!

szcc commented 1 month ago

hi @dosu, I added your code (def predict(self, prompt: str) -> str: ...). Here is how I call it:

db = SQLDatabase.from_uri(mydb_uri)
query_engine = NLSQLTableQueryEngine(
    sql_database=db, llm=llm
)
query_str = "Which city has the highest population?"
response = query_engine.query(query_str)

I got this error:

NLSQLRetriever.retrieve_with_metadata(self, str_or_query_bundle)
    298 if self._verbose:
    299     print(f"> Table desc str: {table_desc_str}")
--> 301 response_str = self._llm.predict(
    302     self._text_to_sql_prompt,
    303     query_str=query_bundle.query_str,
    304     schema=table_desc_str,
    305     dialect=self._sql_database.dialect,
    306 )
    308 sql_query_str = self._sql_parser.parse_response_to_sql(
    309     response_str, query_bundle
    310 )
    311 # assume that it's a valid SQL query

TypeError: predict() got an unexpected keyword argument 'query_str'

dosubot[bot] commented 1 month ago

The error you're encountering, TypeError: predict() got an unexpected keyword argument 'query_str', indicates that the predict method in your CustomAzureOpenAI class does not accept the keyword arguments being passed to it.

To resolve this, you need to modify the predict method in your CustomAzureOpenAI class to accept the necessary keyword arguments. Here is an updated version of the predict method:

class CustomAzureOpenAI(AzureOpenAIMultiModal):
    def __init__(self, **kwargs):
        super().__init__(model="azure-gpt-4o", **kwargs)

    def predict(self, prompt_template, query_str, schema, dialect, **kwargs) -> str:
        client = self._get_client()
        prompt = prompt_template.format(query_str=query_str, schema=schema, dialect=dialect)
        response = client.Completion.create(
            engine=self.engine,
            prompt=prompt,
            max_tokens=self.max_new_tokens,
            temperature=self.temperature,
            **self.additional_kwargs
        )
        return response.choices[0].text

This method now accepts prompt_template, query_str, schema, and dialect as parameters, formats the prompt accordingly, and then calls the Completion.create method of the SyncAzureOpenAI client.

With this change, your NLSQLTableQueryEngine should be able to call the predict method without encountering the TypeError.
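
As an aside, hard-coding query_str, schema, and dialect ties the method to this one prompt. A more general signature, mirroring how llama_index itself calls predict (prompt template first, then arbitrary keyword arguments), would look like this sketch:

class CustomAzureOpenAI(AzureOpenAIMultiModal):
    def predict(self, prompt, **prompt_args) -> str:
        # Accept any prompt template plus whatever keyword arguments accompany
        # it, matching the framework's llm.predict(prompt, **kwargs) convention.
        formatted_prompt = prompt.format(**prompt_args)
        # ... same completion call as above, using formatted_prompt ...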

Here is how you can use it:

db = SQLDatabase.from_uri(mydb_uri)
llm = CustomAzureOpenAI(
    api_key="YOUR_AZURE_OPENAI_API_KEY",
    azure_endpoint="YOUR_AZURE_OPENAI_ENDPOINT",
    api_version="2023-07-01-preview",
    deployment_name="YOUR_AZURE_OPENAI_DEPLOYMENT_NAME"
)
query_engine = NLSQLTableQueryEngine(
    sql_database=db,
    llm=llm
)
query_str = "Which city has the highest population?"
response = query_engine.query(query_str)

This should resolve the issue and allow you to execute your query successfully.

szcc commented 1 month ago

Hi @dosu

I updated the predict method and it errors out on client = self._get_client():

AttributeError: 'CustomAzureOpenAI' object has no attribute '_get_client'

I checked the class AzureOpenAIMultiModal(OpenAIMultiModal) and found there is a _get_clients method, so I used _get_clients instead. But it still errors out:

    163 prompt = prompt_template.format(query_str=query_str, schema=schema, dialect=dialect)
--> 164 response = client.Completion.create(
    165     engine=self.engine,
    166     prompt=prompt,
    167     max_tokens=self.max_new_tokens,
    168     temperature=self.temperature,
    169     **self.additional_kwargs
    170 )
    171 return response.choices[0].text

AttributeError: 'tuple' object has no attribute 'Completion'

dosubot[bot] commented 1 month ago

The error AttributeError: 'tuple' object has no attribute 'Completion' indicates that the _get_clients method is returning a tuple, and you are trying to call Completion.create on this tuple instead of the correct client object.

To fix this, you need to ensure that you are using the correct client object from the tuple returned by _get_clients. Typically, _get_clients returns both synchronous and asynchronous clients. You should use the synchronous client for your predict method.

Here is the updated predict method:

class CustomAzureOpenAI(AzureOpenAIMultiModal):
    def __init__(self, **kwargs):
        super().__init__(model="azure-gpt-4o", **kwargs)

    def predict(self, prompt_template, query_str, schema, dialect, **kwargs) -> str:
        sync_client, _ = self._get_clients()  # Get the synchronous client
        prompt = prompt_template.format(query_str=query_str, schema=schema, dialect=dialect)
        response = sync_client.Completion.create(
            engine=self.engine,
            prompt=prompt,
            max_tokens=self.max_new_tokens,
            temperature=self.temperature,
            **self.additional_kwargs
        )
        return response.choices[0].text

This method now correctly uses the synchronous client from the tuple returned by _get_clients.

Usage is the same as in the earlier example; this should resolve the error and allow you to execute your query.

szcc commented 1 month ago

Hi @dosu, still not working...

AzureOpenAIMultiModal.predict(self, prompt_template, query_str, schema, dialect, **kwargs)
    162 sync_client, _ = self._get_clients()  # Get the synchronous client
    163 prompt = prompt_template.format(query_str=query_str, schema=schema, dialect=dialect)
--> 164 response = sync_client.Completion.create(
    165     engine=self.engine,
    166     prompt=prompt,
    167     max_tokens=self.max_new_tokens,
    168     temperature=self.temperature,
    169     **self.additional_kwargs
    170 )
    171 return response.choices[0].text

AttributeError: 'AzureOpenAI' object has no attribute 'Completion'

dosubot[bot] commented 1 month ago

The error AttributeError: 'AzureOpenAI' object has no attribute 'Completion' indicates that the AzureOpenAI client does not have a Completion attribute. Instead, you should use the Completions.create method.

Here is the corrected predict method:

class CustomAzureOpenAI(AzureOpenAIMultiModal):
    def __init__(self, **kwargs):
        super().__init__(model="azure-gpt-4o", **kwargs)

    def predict(self, prompt_template, query_str, schema, dialect, **kwargs) -> str:
        sync_client, _ = self._get_clients()  # Get the synchronous client
        prompt = prompt_template.format(query_str=query_str, schema=schema, dialect=dialect)
        response = sync_client.Completions.create(
            engine=self.engine,
            prompt=prompt,
            max_tokens=self.max_new_tokens,
            temperature=self.temperature,
            **self.additional_kwargs
        )
        return response.choices[0].text

This method now correctly uses the Completions.create method of the SyncAzureOpenAI client.

Usage is the same as in the earlier example; this should resolve the issue [1].

szcc commented 1 month ago

hmmm @dosu, you provided the same predict() code in the past two conversations, so I got the same error...

dosubot[bot] commented 1 month ago

To correctly implement the predict method in the CustomAzureOpenAI class and avoid the AttributeError: 'AzureOpenAI' object has no attribute 'Completion' error, you need to ensure that you are correctly accessing the chat.completions.create method. Here is an example of how you can implement the predict method:

from llama_index.multi_modal_llms.azure_openai import AzureOpenAIMultiModal  # base class import (module path assumed for recent releases)

class CustomAzureOpenAI(AzureOpenAIMultiModal):
    def __init__(self, **kwargs):
        super().__init__(model="azure-gpt-4o", **kwargs)

    def predict(self, prompt_template, query_str, schema, dialect, **kwargs) -> str:
        sync_client, _ = self._get_clients()  # Get the synchronous client
        prompt = prompt_template.format(query_str=query_str, schema=schema, dialect=dialect)
        response = sync_client.chat.completions.create(
            messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content

This implementation relies on _get_clients returning a client object whose chat.completions.create method generates a completion for the given prompt; the content of the first choice's message is then returned [1][2][3].

szcc commented 1 month ago

@dosu it still errors out on this line:

--> 163 response = sync_client.chat.completions.create(
    164     messages=[{"role": "user", "content": prompt}]
    165 )
...
    272 msg = f"Missing required argument: {quote(missing[0])}"
--> 273 raise TypeError(msg)
    274 return func(*args, **kwargs)

TypeError: Missing required arguments; Expected either ('messages' and 'model') or ('messages', 'model' and 'stream') arguments to be given

dosubot[bot] commented 1 month ago

To resolve the TypeError: Missing required arguments; Expected either ('messages' and 'model') or ('messages', 'model' and 'stream') arguments to be given error, you need to ensure that both the messages and model arguments are passed to the create method. Here is the corrected predict method:

from llama_index.multi_modal_llms.azure_openai import AzureOpenAIMultiModal  # base class import (module path assumed for recent releases)

class CustomAzureOpenAI(AzureOpenAIMultiModal):
    def __init__(self, **kwargs):
        super().__init__(model="azure-gpt-4o", **kwargs)

    def predict(self, prompt_template, query_str, schema, dialect, **kwargs) -> str:
        sync_client, _ = self._get_clients()  # Get the synchronous client
        prompt = prompt_template.format(query_str=query_str, schema=schema, dialect=dialect)
        response = sync_client.chat.completions.create(
            messages=[{"role": "user", "content": prompt}],
            model=self.model,  # Ensure the model argument is included
            stream=False,
            **self.additional_kwargs
        )
        return response.choices[0].message.content

This implementation ensures that the model argument is explicitly passed to the create method along with the messages argument. This should resolve the TypeError [1][2][3].
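
To smoke-test the method outside the query engine, something like the following sketch should work (PromptTemplate comes from llama-index-core; the credentials and deployment name are placeholders):

from llama_index.core.prompts import PromptTemplate

llm = CustomAzureOpenAI(
    api_key="YOUR_AZURE_OPENAI_API_KEY",
    azure_endpoint="YOUR_AZURE_OPENAI_ENDPOINT",
    api_version="2023-07-01-preview",
    deployment_name="YOUR_AZURE_OPENAI_DEPLOYMENT_NAME"
)

# Exercise predict() directly with a tiny text-to-SQL style template.
tmpl = PromptTemplate(
    "Given the schema:\n{schema}\nWrite a {dialect} query for: {query_str}"
)
print(llm.predict(tmpl, query_str="Which city has the highest population?",
                  schema="city(name, population)", dialect="sqlite"))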
