Sharing the learning along the way we been gathering to enable Azure OpenAI at enterprise scale in a secure manner. GPT-RAG core is a Retrieval-Augmented Generation pattern running in Azure, using Azure Cognitive Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences.
Measuring performance and latency is crucial to ensure a timely and frictionless user experience. It's important to assess these factors to guarantee prompt and smooth delivery of the expected benefit.
Provided guidance/scripts to evaluate the orchestrator's performance and help do performance tuning.
Measuring performance and latency is crucial to ensure a timely and frictionless user experience. It's important to assess these factors to guarantee prompt and smooth delivery of the expected benefit.
Provided guidance/scripts to evaluate the orchestrator's performance and help do performance tuning.
https://github.com/Azure/gpt-rag-orchestrator/issues/53