@wizenheimer, @tylermaran, this looks like a very useful utility. Any plans on this? I am happy to contribute to the Python SDK.
I think we should start by modifying the code to use the openai-python SDK, so that instead of passing an OpenAI key to the zerox constructor we can pass the relevant client (OpenAI or AzureOpenAI) from the SDK, replacing the existing manual API calls.
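For illustration, the change could look roughly like this (a minimal sketch; the `client` parameter and the `zerox` import path are assumptions for illustration, not the current interface):

```python
import asyncio

from openai import AzureOpenAI

from pyzerox import zerox  # import path assumed for illustration

async def main() -> None:
    # A preconfigured SDK client replaces the raw `openai_key` argument,
    # so vanilla OpenAI and Azure OpenAI are handled the same way.
    client = AzureOpenAI(
        azure_endpoint="https://my-resource.openai.azure.com",
        api_key="...",
        api_version="2024-02-01",
    )
    # `client=` is the proposed (hypothetical) parameter replacing the key
    result = await zerox(file_path="document.pdf", client=client)
    print(result)

asyncio.run(main())
```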
Hey @pradhyumna85
> I think we should start by modifying the code to use the openai-python SDK, so that instead of passing an OpenAI key to the zerox constructor
Having an AsyncOpenAI-based client and supporting the Batch API would be a great addition, imo.
```python
import os
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI(
    # This is the default and can be omitted
    api_key=os.environ.get("OPENAI_API_KEY"),
)

async def main() -> None:
    chat_completion = await client.chat.completions.create(
        messages=[
            {
                "role": "user",
                "content": "Say this is a test",
            }
        ],
        model="gpt-3.5-turbo",
    )

asyncio.run(main())
```
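And for the Batch API part, a minimal sketch with the same async client (the JSONL file is assumed to contain one `/v1/chat/completions` request per line):

```python
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

async def submit_batch() -> None:
    # Upload a JSONL file of chat-completion requests
    batch_file = await client.files.create(
        file=open("requests.jsonl", "rb"),
        purpose="batch",
    )
    # Create the batch job; results come back within the completion window
    batch = await client.batches.create(
        input_file_id=batch_file.id,
        endpoint="/v1/chat/completions",
        completion_window="24h",
    )
    print(batch.id, batch.status)

asyncio.run(submit_batch())
```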
> we can pass the relevant client (OpenAI or AzureOpenAI) from the SDK, replacing the existing manual API calls.
Great. We might need to introduce a provider component to keep the low-level designs (LLDs) simpler. There could be implications for the package's external API interface.
Agreed, I'd like to add Azure model support here.
Do you think the best approach is just adding the OpenAI SDK for both packages? I think the GPT models are clearly the right choice right now, but I wouldn't be surprised if Anthropic or Gemini models ended up giving similar performance over the next few months.
Wondering if we might want to have the models a bit more abstracted. In general I think adding the OpenAI SDK is a good starting point.
Agreed. Here's a v0 draft of how we could shape it structurally. It'll need a couple of iterations to get the method signatures right.
Approach 1: Reference Code
```mermaid
classDiagram
    class LLMInterface {
        <<abstract>>
        +run(prompt: Dict, temperature: float, max_tokens: int, image: str) str
    }
    class LLMFactory {
        -model: str
        -client: Any
        +__init__(model: str)
        +run(prompt: Dict, temperature: float, max_tokens: int, image: str) str
        -_llm_response(prompt: Dict, temperature: float, max_tokens: int, image: str) str
    }
    LLMInterface <|-- LLMFactory
    class OpenAI
    class Anthropic
    class GoogleGenAI
    class CohereClient
    class AzureChatOpenAI
    class Bedrock
    LLMFactory --> OpenAI : uses
    LLMFactory --> Anthropic : uses
    LLMFactory --> GoogleGenAI : uses
    LLMFactory --> CohereClient : uses
    LLMFactory --> AzureChatOpenAI : uses
    LLMFactory --> Bedrock : uses
    note for LLMFactory "Supports multiple LLM providers:\n- OpenAI (GPT models)\n- Anthropic (Claude models)\n- Google (Gemini models)\n- Cohere\n- Azure OpenAI\n- AWS Bedrock\n- VLLM endpoints\n\nNow with optional image input\nfor multimodal models"
```
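In Python, Approach 1 could take roughly this shape (a minimal sketch mirroring the diagram; the provider dispatch is elided and the names are only illustrative):

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Optional

class LLMInterface(ABC):
    """Contract every provider-backed model must satisfy."""

    @abstractmethod
    def run(self, prompt: Dict, temperature: float, max_tokens: int,
            image: Optional[str] = None) -> str: ...

class LLMFactory(LLMInterface):
    """Resolves a model string to a provider client and routes calls to it."""

    def __init__(self, model: str):
        self.model = model
        self.client: Any = self._resolve_client(model)

    def _resolve_client(self, model: str) -> Any:
        # Dispatch on the model name, e.g. "gpt-*" -> OpenAI,
        # "claude-*" -> Anthropic, "gemini-*" -> GoogleGenAI, ...
        raise NotImplementedError

    def run(self, prompt: Dict, temperature: float, max_tokens: int,
            image: Optional[str] = None) -> str:
        return self._llm_response(prompt, temperature, max_tokens, image)

    def _llm_response(self, prompt: Dict, temperature: float, max_tokens: int,
                      image: Optional[str] = None) -> str:
        # Provider-specific request/response translation lives here
        raise NotImplementedError
```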
Approach 2: Reference Code
```mermaid
classDiagram
    class ABC
    <<interface>> ABC
    class AbstractLlmService {
        <<abstract>>
        +embeddings(text: str) list
        +chat_completion(messages, model, **kwargs) str
        +chat_completion_json(messages, model, **kwargs) str
        +json_completion(messages, model, **kwargs)
        +image_analysis(image: str, prompt: str, model, **kwargs) str
        +multimodal_completion(images: List[str], prompt: str, model, **kwargs) str
    }
    ABC <|-- AbstractLlmService
    note for AbstractLlmService "Abstract base class for\nLLM service providers\nwith image processing capabilities"
```
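As a Python sketch, Approach 2 is a plain ABC that each provider implements (method names taken from the diagram; bodies elided):

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

class AbstractLlmService(ABC):
    """Abstract base class for LLM service providers with image support."""

    @abstractmethod
    def embeddings(self, text: str) -> list: ...

    @abstractmethod
    def chat_completion(self, messages: List[Dict], model: str, **kwargs: Any) -> str: ...

    @abstractmethod
    def chat_completion_json(self, messages: List[Dict], model: str, **kwargs: Any) -> str: ...

    @abstractmethod
    def json_completion(self, messages: List[Dict], model: str, **kwargs: Any): ...

    @abstractmethod
    def image_analysis(self, image: str, prompt: str, model: str, **kwargs: Any) -> str: ...

    @abstractmethod
    def multimodal_completion(self, images: List[str], prompt: str, model: str, **kwargs: Any) -> str: ...
```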
@wizenheimer, @tylermaran instead of building our own classes for different providers, I would say it would be better to use LiteLLM (https://github.com/BerriAI/litellm), as it supports almost all popular providers through a single homogeneous API.
What do you think?
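For reference, LiteLLM's unified call looks like this; switching providers is just a different model string, with credentials taken from the provider's usual environment variables:

```python
from litellm import completion

messages = [{"role": "user", "content": "Say this is a test"}]

# Same call shape for every provider; only the model string changes:
#   "gpt-3.5-turbo"           -> OpenAI (OPENAI_API_KEY)
#   "azure/<deployment-name>" -> Azure OpenAI (AZURE_API_KEY, AZURE_API_BASE, AZURE_API_VERSION)
#   "claude-3-opus-20240229"  -> Anthropic (ANTHROPIC_API_KEY)
response = completion(model="gpt-3.5-turbo", messages=messages)
print(response.choices[0].message.content)
```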
That's an interesting take, sounds good 🚀
@kobotschick I have raised PR https://github.com/getomni-ai/zerox/pull/21, which is not merged yet, but you can go ahead and test it; it works now.
Install the Python package:

```bash
pip install git+https://github.com/pradhyumna85/zerox.git@multi-provider-support-pysdk
```

Follow this README example: here
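If you want a quick smoke test, usage on that branch should look roughly like this (the `zerox` parameters and the Azure model string follow the LiteLLM convention; treat the exact names as assumptions and check the PR's README):

```python
import asyncio
import os

from pyzerox import zerox  # import path assumed; see the PR's README

# LiteLLM's Azure OpenAI convention: model="azure/<deployment>" plus these env vars
os.environ["AZURE_API_KEY"] = "..."
os.environ["AZURE_API_BASE"] = "https://my-resource.openai.azure.com"
os.environ["AZURE_API_VERSION"] = "2024-02-01"

async def main() -> None:
    result = await zerox(file_path="document.pdf", model="azure/gpt-4o")
    print(result)

asyncio.run(main())
```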
It would be great if the package supported Azure OpenAI models.