Presently, the backends are model-agnostic. That means every model used by the implementer of this code must be deployed on every Azure OpenAI instance defined in the backend list. This can be limiting because the pool is reduced to the lowest common denominator. Take these backends, for example:
Backend 1 supports model A
Backend 2 supports models A & B
Backend 3 supports model A
Backend 4 supports model B
Backend 5 supports models A & B
Today, if both models A and B are in use, the backend pool can only use backends 2 and 5, since they are the only ones that host both models.
If the backend list could take the model into consideration, the following would apply per model:
Model A: backends 1, 2, 3, and 5
Model B: backends 2, 4, and 5
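To make the difference concrete, here is a minimal sketch of the two selection strategies. It is not taken from this project's code: the `BACKENDS` mapping and both function names are hypothetical and only mirror the five example backends above.

```python
# Hypothetical backend list: backend number -> set of models deployed on it.
# This mirrors the example above; it is not the project's actual configuration.
BACKENDS = {
    1: {"A"},
    2: {"A", "B"},
    3: {"A"},
    4: {"B"},
    5: {"A", "B"},
}


def backends_for_all_models(models, backends=BACKENDS):
    """Model-agnostic pool: keep only backends that host every model in use."""
    required = set(models)
    return sorted(b for b, supported in backends.items() if required <= supported)


def backends_for_model(model, backends=BACKENDS):
    """Model-aware pool: keep every backend that hosts the requested model."""
    return sorted(b for b, supported in backends.items() if model in supported)


print(backends_for_all_models(["A", "B"]))  # [2, 5]
print(backends_for_model("A"))              # [1, 2, 3, 5]
print(backends_for_model("B"))              # [2, 4, 5]
```

With a per-model lookup, backends 1, 3, and 4 become usable again for the model they host, rather than being excluded from the pool entirely.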
I am interested to hear whether there is value in being able to specify backends per model or whether this is a potential solution in search of a problem.