ChristianWeyer opened 4 months ago
Can you describe the solution you'd like? Especially, how would Phi-3 support differ from supporting other open-source LLMs like Llama?
The support for Mini, Small, and Medium may just be like supporting other non-OAI models. Support for Vision should be explicit, like we currently have e.g. for Llava.
LLaVA should just work in ollama client and so should the phi-3 vision model? @BeibinLi might know better here...
Yes, phi models are already supported through Ollama and LM Studio.
The Phi-3-vision was just released a few days ago but should also be supported soon.
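As context for "already supported through Ollama and LM Studio": a minimal sketch of pointing AutoGen at a locally served Phi-3 via Ollama's OpenAI-compatible endpoint. The model tag `phi3` and port 11434 are assumptions based on Ollama defaults; adjust for your setup.

```python
# Hypothetical AutoGen llm_config for a local Phi-3 served by Ollama.
# Ollama exposes an OpenAI-compatible API under /v1; the model tag "phi3"
# and port 11434 are assumptions (Ollama defaults), not AutoGen specifics.
config_list = [
    {
        "model": "phi3",                          # Ollama model tag (assumed)
        "base_url": "http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
        "api_key": "ollama",                      # placeholder; Ollama ignores it
    }
]

llm_config = {"config_list": config_list, "temperature": 0.2}
```

Since the endpoint speaks the OpenAI wire format, this should slot into any agent that accepts an `llm_config`, the same way an LM Studio endpoint would.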
LLaVA should just work in ollama client and so should the phi-3 vision model? @BeibinLi might know better here...
Do we have an Ollama client in AutoGen, or are you referring to the OAI compatibility of Ollama's API?
Ollama does not (yet) provide OAI Tool Calling. For this, we need to use something like LiteLLM, BTW.
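To illustrate the LiteLLM workaround mentioned above: the idea is to put a LiteLLM proxy in front of Ollama so that OAI-style tool calls get translated for the backend. The model route `ollama/phi3` and port 4000 are assumptions (LiteLLM's documented proxy default), not a tested setup.

```python
# Hypothetical: route AutoGen through a LiteLLM proxy so OAI tool calling
# is translated for backends (like Ollama) that lack it natively.
# Assumes a proxy started with something like `litellm --model ollama/phi3`,
# listening on LiteLLM's default port 4000 -- an assumption, verify locally.
config_list = [
    {
        "model": "ollama/phi3",               # model route configured in LiteLLM (assumed)
        "base_url": "http://localhost:4000",  # LiteLLM proxy endpoint (default port)
        "api_key": "not-needed",              # placeholder for a local proxy
    }
]
```

From AutoGen's perspective this is just another OpenAI-compatible endpoint; the tool-calling translation happens inside the proxy.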
The quality of the quants available via Ollama for LLaVA is just not very good, I have to say.
Looking forward to seeing Phi-3-Vision in a very high quality quant doing the things we see in the online demos.
I was thinking the same thing. It would be nice if AutoGen Studio had an update where it defaults to a local LLM like Phi.
@ChristianWeyer @HansUXdev Can you provide more information? Do you want to run Phi-3 locally or remotely in Azure?
There are two different approaches:
However, we are aware the Azure endpoint is slightly different. See Azure AI Studio; AutoGen hasn't fully supported it yet.
Let me know how you want to access Phi-3, and I can brainstorm with the Phi team together to offer a better solution.
Either locally with LM Studio running, or, if the Phi team wants to be generous, they could just host it for free like Groq does.
Personally, I'm thinking of the UX/dev experience of using AutoGen Studio. For example, right now we have to configure the model, etc. But if you check out the VS Code extension "Continue", it watches your LM Studio settings, so when you switch from Phi to whatever, it just works. It would be really cool to see that here.
However, we are aware the Azure endpoint is slightly different. See Azure AI Studio; AutoGen hasn't fully supported it yet.
Maybe off-topic, but relevant to the above quote: do you support models from the Azure endpoint (serverless option) natively? Or do we have to implement something to use those models?
I am trying to use models hosted there (not Phi-3 specific), but let's say it's not as smooth sailing as I expected.
@mpalaourg Can you provide more details regarding Azure serverless option? I did not see it.
The closest information I can find is "Phi-3 in models as a service: Phi-3 models are available with pay-as-you-go billing via inference APIs. We will provide additional pricing details at a later date." here.
Yeah, my bad. I knew I was off-topic here, but I saw you spoke about the Azure endpoint.
I am not trying to use Phi-3 at this point. To be specific, I have deployed a Llama-2, a Cohere, and a Mistral model under the serverless option. The only difference between them is the api_key and base_url, and because of this I thought it would be plug and play in the llm_config. I could open a new ticket with my findings and tag you there; I don't want to derail this conversation.
@mpalaourg I am not familiar with the Azure serverless option. However, if you can interact with the endpoint using an API key and base_url in the same way as you do with OpenAI, then you can directly include them in the llm_config for AutoGen. It works similarly to LM Studio, Ollama, and the Azure ML endpoint.
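A sketch of what that suggestion looks like in practice: a `config_list` where each Azure serverless deployment differs only by `api_key` and `base_url`, as described above. The endpoint URLs and model names below are placeholders, not real deployments.

```python
import os

# Hypothetical config_list for Azure serverless ("models as a service")
# deployments. Per the discussion, the entries differ only in api_key and
# base_url; everything here is a placeholder, not a real endpoint.
config_list = [
    {
        "model": "mistral-large",  # placeholder deployment name
        "base_url": "https://example-endpoint.inference.ai.azure.com",
        # Read secrets from the environment rather than hard-coding them.
        "api_key": os.environ.get("AZURE_MAAS_KEY", "placeholder-key"),
    },
    {
        "model": "llama-2-70b-chat",  # placeholder deployment name
        "base_url": "https://example-endpoint-2.inference.ai.azure.com",
        "api_key": os.environ.get("AZURE_MAAS_KEY_2", "placeholder-key"),
    },
]
```

If a given serverless endpoint deviates from the OpenAI wire format, this plug-and-play approach may break, which would explain the rough edges reported above.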
I know it's been a while for this thread, but I am hoping that the case can be made to integrate Phi-3 models via onnxruntime-genai.
LM Studio uses a UI, and I'm not sure that's a good fit for a server-based option; I get that it has a CLI and an API endpoint. Ollama is a better option than LM Studio, but I'm sure this depends on the requirements.
That said, both have dependencies that may not be acceptable, as they can bloat the deployment. Another scenario is deploying to mobile or other devices that require the language model to run locally.
I hope this is clear on the need and I hope that it becomes supported.
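To make the request concrete, here is a rough sketch of what an onnxruntime-genai-backed generate function for Phi-3 might look like. The `onnxruntime_genai` names used (Model, Tokenizer, GeneratorParams, Generator) follow that package's published examples, and the `<|user|>`/`<|assistant|>`/`<|end|>` chat markers follow the Phi-3 model card; both are assumptions here, not verified against AutoGen.

```python
def format_phi3_prompt(messages):
    """Flatten chat messages into Phi-3's chat template.

    Assumption: <|user|> / <|assistant|> / <|end|> markers per the
    Phi-3 model card.
    """
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}<|end|>")
    parts.append("<|assistant|>")
    return "\n".join(parts)


def generate(model_dir, messages, max_length=256):
    """Sketch of a token-by-token generate loop via onnxruntime-genai.

    The onnxruntime_genai API calls below are assumptions based on the
    package's examples; treat this as pseudocode until verified.
    """
    import onnxruntime_genai as og  # assumed installed, not part of AutoGen

    model = og.Model(model_dir)
    tokenizer = og.Tokenizer(model)
    params = og.GeneratorParams(model)
    params.set_search_options(max_length=max_length)
    params.input_ids = tokenizer.encode(format_phi3_prompt(messages))

    generator = og.Generator(model, params)
    out_tokens = []
    while not generator.is_done():
        generator.compute_logits()
        generator.generate_next_token()
        out_tokens.append(generator.get_next_tokens()[0])
    return tokenizer.decode(out_tokens)
```

Wrapped in AutoGen's custom model client protocol, something like this would avoid the LM Studio/Ollama dependency entirely, which is the lightweight-deployment scenario described above.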
Is your feature request related to a problem? Please describe.
Now that we have the Phi-3 SLM flagship family (including vision) from Microsoft, it would make perfect sense to officially and fully integrate it into AutoGen. This would be a strong statement in the market.
Describe the solution you'd like
Full integration of Phi-3 Mini, Small, Medium and especially Vision.
Additional context
No response