Open rehn123 opened 4 months ago
In this case you probably want to specify the full deployment URL in the backends, something like:
https://andre-openai.openai.azure.com/openai/deployments/deployment1/
And another backend to the same instance but different deployment:
https://andre-openai.openai.azure.com/openai/deployments/deployment2/
They will then be treated as different “backends”, each with its own throttling status. However, you will need to rework this line:
Otherwise, the final URL that the policy builds will contain the deployment segment twice, such as https://andre-openai.openai.azure.com/openai/deployments/gt35/deployments/gt35/chat/completions...
I haven't worked out the exact code for this, but we need to extract the host part of the backend URL before setting it (there is probably a C# expression that can do that for us).
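To make the idea concrete, here is a rough sketch of the string handling involved. It is written in Python purely for illustration; it is not the policy code, and the policy would need an equivalent C# expression. The function names `backend_host` and `build_target_url` are hypothetical, and the sketch assumes the incoming request path looks like `/deployments/<name>/chat/completions`, as in the duplicated URL above.

```python
from urllib.parse import urlsplit


def backend_host(backend_url: str) -> str:
    """Return only the scheme and host of a backend URL (hypothetical helper)."""
    parts = urlsplit(backend_url)
    return f"{parts.scheme}://{parts.netloc}"


def build_target_url(backend_url: str, request_path: str) -> str:
    """Combine a full deployment backend URL with an incoming request path.

    The deployment segment from the incoming path is dropped, so the
    deployment configured on the backend URL is the one that gets used.
    """
    parts = urlsplit(backend_url)
    # e.g. "/openai/deployments/deployment1" (trailing slash removed)
    backend_path = parts.path.rstrip("/")

    segments = [s for s in request_path.split("/") if s]
    # Assumption: the incoming path may start with "openai"; skip it if so.
    if segments and segments[0] == "openai":
        segments = segments[1:]
    # Drop "deployments/<name>" from the incoming path to avoid duplication.
    if len(segments) >= 2 and segments[0] == "deployments":
        segments = segments[2:]

    return f"{backend_host(backend_url)}{backend_path}/" + "/".join(segments)


if __name__ == "__main__":
    backend = "https://andre-openai.openai.azure.com/openai/deployments/deployment2/"
    print(backend_host(backend))
    print(build_target_url(backend, "/deployments/gt35/chat/completions"))
```

With this approach the deployment name sent by the client is ignored and the one in the selected backend URL wins, which may also matter when deployment names differ between instances.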
If the deployment names differ across instances, this returns a 404 Resource Not Found error. What are the possible solutions to this problem?