A Solution Accelerator for the RAG pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences. It includes the most common requirements and best practices.
Ask a question that requires a response drawn from the knowledge base.
Screenshots
If applicable, add screenshots to help explain your problem.
Logs
If applicable, add logs to help the engineer debug the problem.
openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 4096 tokens. However, your messages resulted in 4339 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
Describe the bug
Following #648, the size of the prompt has increased massively and now exceeds the maximum number of tokens allowed by gpt-35-turbo (0613), the default model.
https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#gpt-35-models
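Until the prompt itself is trimmed upstream, one possible workaround is to drop the oldest conversation turns until the request fits the model's context window. A minimal sketch of that idea (the 4096-token limit comes from the error above; `count_tokens` here is a crude word-based stand-in, not the real gpt-35-turbo tokenizer, and `trim_messages` is a hypothetical helper, not part of this repo):

```python
def count_tokens(text: str) -> int:
    # Rough stand-in for a real tokenizer (e.g. tiktoken);
    # actual gpt-35-turbo token counts will differ.
    return len(text.split())


def trim_messages(messages: list[dict], max_tokens: int = 4096) -> list[dict]:
    """Drop the oldest non-system messages until the total
    estimated token count fits within the context window."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    def total(msgs: list[dict]) -> int:
        return sum(count_tokens(m["content"]) for m in msgs)

    while rest and total(system + rest) > max_tokens:
        rest.pop(0)  # discard the oldest turn first
    return system + rest
```

This keeps the system prompt intact and sacrifices older history first; a real fix would count tokens with the model's actual tokenizer and also account for the retrieved documents injected into the prompt.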
Expected behavior
A response is generated by the LLM.
Debugging information
Steps to reproduce