Open · rohanthacker opened 1 month ago
Thanks for the issue! I think supporting the Azure AI Model Inference API would be great! I think it makes sense to have it as a separate model client that also implements the ChatCompletionClient protocol.
We'd love it if you're interested in helping build this!
@jackgerrits I'll be happy to implement these changes. Can this task be assigned to me? I have already started work on it in my fork of this repository, and I'll raise a draft pull request in a day or so for us to discuss.
@rohanthacker - this is supported in dotnet now with https://github.com/microsoft/autogen/pull/3790
Following this issue, since this will enable us to use Phi on some internal use cases.
@rohanthacker any update on this one?
@edirgarcia It depends on which API you are using. If you are using the Core API, you don't need to wait for this feature: you can use the azure.ai.inference client directly in your agent implementation. If you are using the AgentChat API, you may need to wait for the wrapper, but you can also implement your own agent.
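For example, a minimal sketch of calling azure.ai.inference directly (endpoint and key are placeholders; in a Core agent this call would live inside your message handler):

```python
# Sketch: calling azure.ai.inference directly, e.g. from inside a
# Core API agent's message handler. Endpoint and key are placeholders.
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://<your-deployment>.<region>.models.ai.azure.com",
    credential=AzureKeyCredential("<your-api-key>"),
)

response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Summarize the Azure AI Model Inference API."),
    ],
)
print(response.choices[0].message.content)
```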
Hi @ekzhu,
I’ve completed the initial implementation of ChatCompletionClient using azure.ai.inference.
This is a draft, as I wanted to discuss the significant code duplication between this client and the existing OpenAIChatCompletionClient. If the team agrees with this approach, I’ll code/copy the rest of the implementation as needed.
Currently, the two clients are nearly identical, with the only differences being the type variations required by each library. Given that the azure.ai.inference library is OpenAI compatible, at the moment I don't see the need for a separate concrete class. What are your thoughts?
Additionally, I was able to get OpenAIChatCompletionClient to work with models deployed on Azure AI Studio by setting the base_url and api_key (see the sketch below). However, I encountered a few minor compatibility issues with specific models:
Both models work fine when connected directly to the OpenAI API.
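Roughly, the configuration looks like this (a sketch with placeholder endpoint, key, and model name; the exact import path and the model_info fields have shifted across the 0.4 preview releases):

```python
# Sketch: pointing OpenAIChatCompletionClient at an Azure AI Studio
# deployment via base_url/api_key. Placeholders throughout; model_info
# (model_capabilities in some preview releases) is required so the
# client knows what a non-OpenAI model supports.
import asyncio

from autogen_core.models import UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient

client = OpenAIChatCompletionClient(
    model="phi-3.5-mini-instruct",          # your deployment's model name
    base_url="https://<your-endpoint>/v1",  # placeholder serving endpoint
    api_key="<your-api-key>",
    model_info={
        "vision": False,
        "function_calling": False,
        "json_output": False,
        "family": "unknown",
    },
)

async def main() -> None:
    result = await client.create(
        [UserMessage(content="Hello!", source="user")]
    )
    print(result.content)

asyncio.run(main())
```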
Happy to keep working on this—just looking for team input on code duplication and next steps.
@edirgarcia Phi-3.5 is working with the OpenAIChatCompletionClient. Please refer to this example I came across today by azureml-examples.
Thank you, I will test this out next week.
Thanks @rohanthacker!
For the Core API, the user can choose any client they want to use. So this is not a blocker.
I think there is still benefit in wrapping the azure.ai.inference client behind our ChatCompletionClient protocol. For AgentChat users it is useful, as the built-in agents accept a ChatCompletionClient. We can resolve the code duplication in the future.
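To illustrate the wrapping idea, here is a rough, duck-typed sketch of just the create path. The class name is hypothetical, it assumes plain-text message content, and a real client must implement the full ChatCompletionClient protocol (create_stream, usage tracking, token counting, capabilities):

```python
# Sketch of the wrapping idea: adapt azure.ai.inference to the shape of
# AutoGen's ChatCompletionClient. Only the create path is shown.
from azure.ai.inference.aio import ChatCompletionsClient
from azure.ai.inference.models import AssistantMessage, SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

from autogen_core.models import CreateResult, LLMMessage, RequestUsage


class AzureAIInferenceChatCompletionClient:  # hypothetical name
    def __init__(self, endpoint: str, api_key: str) -> None:
        self._client = ChatCompletionsClient(
            endpoint=endpoint, credential=AzureKeyCredential(api_key)
        )

    async def create(self, messages: list[LLMMessage]) -> CreateResult:
        # Convert AutoGen message types to azure.ai.inference message
        # types; assumes str content (no images or function calls).
        converted = []
        for m in messages:
            kind = type(m).__name__
            if kind == "SystemMessage":
                converted.append(SystemMessage(content=m.content))
            elif kind == "AssistantMessage":
                converted.append(AssistantMessage(content=m.content))
            else:
                converted.append(UserMessage(content=m.content))
        response = await self._client.complete(messages=converted)
        choice = response.choices[0]
        return CreateResult(
            finish_reason="stop",
            content=choice.message.content or "",
            usage=RequestUsage(
                prompt_tokens=response.usage.prompt_tokens,
                completion_tokens=response.usage.completion_tokens,
            ),
            cached=False,
        )
```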
What feature would you like to be added?
I would like the ability to use a model that is deployed on Azure AI Studio and uses the Azure AI Model Inference API.
If needed, I would like to assist in the creation of this feature. However, I have a few questions and need some guidance on the best way to implement it.
Questions:
Can this be done with the existing AzureOpenAIChatCompletionClient? I have already tried this, however the API produces an invalid URL and responds with a 404 error, as the endpoint created by Azure AI Studio and the one expected by the client are not the same.
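For context, the endpoint shapes differ, which is likely why the client builds an invalid URL (illustrative shapes only; the exact form varies by deployment type and region):

```python
# Endpoint shape AzureOpenAIChatCompletionClient expects (Azure OpenAI):
#   https://<resource>.openai.azure.com/openai/deployments/<deployment>/...
#
# Endpoint shape Azure AI Studio serverless deployments expose:
#   https://<deployment>.<region>.models.ai.azure.com/...
#
# Requests built for the first shape 404 against the second.
```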
Looking forward to discussing this more.
Why is this needed?
Azure AI Studio provides a large catalog of models along with various deployment options that make it easy for developers to access a wide variety of models. Given the nature of this project, the ability to integrate this diverse set of models out of the box will drive broader adoption and let developers bring their own models without coding a new client for each.