I'm the maintainer of LiteLLM. We provide an open-source proxy for load balancing Azure + OpenAI + any LiteLLM-supported LLM.
It can process 500+ requests/second.
From this thread it looks like you're trying to load balance between Azure OpenAI instances - I hope our solution makes it easier for you. (I'd love feedback if you're trying to do this.)
@harishmohanraj
Here's the quick start:
Doc: https://docs.litellm.ai/docs/simple_proxy#load-balancing---multiple-instances-of-1-model
Step 1: Create a config.yaml
Step 2: Start the LiteLLM proxy:
Step 3: Make a request to the LiteLLM proxy:
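A minimal sketch of what the config could look like, assuming two Azure deployments of the same model (the deployment names, API bases, and keys below are placeholders you'd replace with your own):

```yaml
model_list:
  # Two Azure deployments registered under one model_name;
  # the proxy load-balances requests across them.
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/<your-deployment-name-1>
      api_base: https://<your-resource-1>.openai.azure.com/
      api_key: <your-azure-api-key-1>
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/<your-deployment-name-2>
      api_base: https://<your-resource-2>.openai.azure.com/
      api_key: <your-azure-api-key-2>
```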
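Assuming the config file from Step 1 is saved locally, the proxy can be started by pointing the CLI at it:

```shell
# Start the proxy with the config created in Step 1
litellm --config /path/to/config.yaml
```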
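Since the proxy exposes an OpenAI-compatible endpoint, a request can be sketched with curl (the host/port below assume the proxy's local default; adjust to wherever your proxy is running):

```shell
# Send an OpenAI-format chat completion request to the local proxy;
# "gpt-3.5-turbo" matches the model_name set in config.yaml
curl http://0.0.0.0:4000/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello, what model are you?"}]
  }'
```

Because the endpoint is OpenAI-compatible, existing OpenAI SDK clients can also be pointed at the proxy by changing their base URL.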