Azure-Samples / chat-with-your-data-solution-accelerator

A Solution Accelerator for the RAG pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences. This includes most common requirements and best practices.

Token error in `/custom` endpoint #790

Closed by cecheta 2 weeks ago

cecheta commented 2 weeks ago

Describe the bug


Following #648, the size of the prompt has increased significantly and now exceeds the maximum number of tokens allowed by gpt-35-turbo (0613), the default model.

https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#gpt-35-models
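One way to avoid this kind of overflow is to trim the oldest conversation turns until the prompt fits the model's context window. A minimal sketch (not the accelerator's implementation; `estimate_tokens` is a rough 4-characters-per-token heuristic standing in for a real tokenizer such as tiktoken):

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)


def trim_history(messages: list[dict], max_tokens: int = 4096) -> list[dict]:
    """Drop the oldest non-system messages until the estimated total fits."""
    # Always keep the system prompt (assumed to be the first message).
    system, rest = messages[:1], messages[1:]
    while rest and sum(
        estimate_tokens(m["content"]) for m in system + rest
    ) > max_tokens:
        rest.pop(0)  # drop the oldest user/assistant turn
    return system + rest
```

With the real prompt this would need an exact tokenizer, since the service counts tokens, not characters; the heuristic only illustrates the shape of the fix.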

Expected behavior

A response is generated by the LLM without error.


Debugging information


Logs


openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 4096 tokens. However, your messages resulted in 4339 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
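A pre-flight check can surface this overflow locally with a clearer message than the service's 400 `context_length_exceeded` response. A hypothetical sketch (the function names and the injected `count_tokens` callable are assumptions; in practice `count_tokens` would be backed by a real tokenizer):

```python
MODEL_CONTEXT_LIMIT = 4096  # gpt-35-turbo (0613)


def check_context(messages, count_tokens, limit=MODEL_CONTEXT_LIMIT):
    """Raise before calling the service if the messages exceed the limit."""
    total = sum(count_tokens(m["content"]) for m in messages)
    if total > limit:
        raise ValueError(
            f"messages total {total} tokens, exceeding the {limit}-token limit"
        )
    return total
```

In the log above, 4339 tokens against a 4096-token limit means at least 243 tokens must be removed from the prompt before the request can succeed.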