Open · asliiNorderia opened 8 months ago
I think this is pretty normal: in ChatGPT as well, using the GPT-4 model is noticeably slower, since it is a heavier model to run inference on.
That's correct, GPT-4 is expected to be slower.
In addition, I assume you're using the pay-as-you-go pricing tier, which doesn't come with any latency guarantees. For latency assurance, Azure recommends provisioned throughput units (PTUs): https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/provisioned-throughput Those can be expensive, however, since you need to pre-reserve a large amount of capacity.
Another approach I've heard of is to use OpenAI's own API (openai.com) instead of Azure OpenAI. That may be slightly faster due to the lack of the content safety filter service and other protections (but then you lose those protections, plus other aspects of Azure reliability and security).
You could also try prompt engineering or few-shot prompting to improve the quality of the responses from gpt-3.5.
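As a rough sketch of what few-shot prompting looks like with the Chat Completions message format: the system message and the example Q/A pairs below are entirely hypothetical placeholders, meant only to show the structure of interleaving worked examples before the real question.

```python
# Illustrative few-shot prompt in the Chat Completions message format.
# The system prompt and example pairs are hypothetical; substitute
# domain-specific examples that demonstrate the answers you want.
few_shot_messages = [
    {"role": "system", "content": "You answer support questions concisely."},
    # Worked examples the model can imitate:
    {"role": "user", "content": "How do I reset my password?"},
    {"role": "assistant", "content": "Open Settings > Account > Reset password."},
    {"role": "user", "content": "How do I change my email address?"},
    {"role": "assistant", "content": "Open Settings > Account > Edit email."},
]

def build_prompt(question: str) -> list:
    """Append the real user question after the few-shot examples."""
    return few_shot_messages + [{"role": "user", "content": question}]
```

The resulting list is what you would pass as the `messages` argument to a chat-completion call; the examples often let the smaller gpt-3.5 model match the tone and format you'd otherwise only get from GPT-4.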
You can also try GPT-4 in different regions, measure the latency you see, and check whether any regions respond faster than others.
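A minimal sketch of how you might compare regions: time the same request against each regional endpoint over several runs and compare the medians. The timing helper below is generic; the commented usage (a `clients` dict of per-region SDK clients) is an assumption about how you'd wire it up.

```python
import statistics
import time

def measure_latency(call, runs=5):
    """Time a zero-argument request function over several runs and
    return the median, which is less noisy than a single sample."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Hypothetical usage: one client per regional Azure OpenAI endpoint.
# for region, client in clients.items():
#     latency = measure_latency(
#         lambda: client.chat.completions.create(
#             model="gpt-4",
#             messages=[{"role": "user", "content": "ping"}],
#         )
#     )
#     print(region, f"{latency:.2f}s")
```

Keep the prompt identical across regions so you are comparing network and service latency rather than differing generation lengths.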
This issue is for a: (mark with an `x`)
Minimal steps to reproduce
Any log messages given by the failure
Expected/desired behavior
OS and Version?
azd version?
Versions
Mention any other details that might be useful
After changing the GPT model from GPT-3.5 to GPT-4, performance decreased considerably.