Azure-Samples / chat-with-your-data-solution-accelerator

A Solution Accelerator for the RAG pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences. This includes most common requirements and best practices.

Token error in `/custom` endpoint #790

Closed by cecheta 2 weeks ago

cecheta commented 2 weeks ago

Describe the bug


Following #648, the size of the prompt has increased significantly and now exceeds the maximum number of tokens allowed by gpt-35-turbo (0613), the default model.

https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#gpt-35-models
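One way to avoid this kind of overflow is to trim the oldest conversation turns until the prompt fits the model's context window. A minimal sketch (not the accelerator's implementation; `estimate_tokens` is a rough 4-characters-per-token heuristic standing in for a real tokenizer such as tiktoken):

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)


def trim_history(messages: list[dict], max_tokens: int = 4096) -> list[dict]:
    """Drop the oldest non-system messages until the estimated total fits."""
    # Always keep the system prompt (assumed to be the first message).
    system, rest = messages[:1], messages[1:]
    while rest and sum(
        estimate_tokens(m["content"]) for m in system + rest
    ) > max_tokens:
        rest.pop(0)  # drop the oldest user/assistant turn
    return system + rest
```

With the real prompt this would need an exact tokenizer, since the service counts tokens, not characters; the heuristic only illustrates the shape of the fix.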

Expected behavior

A response is generated by the LLM without error.


Debugging information


Logs


openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 4096 tokens. However, your messages resulted in 4339 tokens. Please reduce the length of the messages.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
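A pre-flight check can surface this overflow locally with a clearer message than the service's 400 `context_length_exceeded` response. A hypothetical sketch (the function names and the injected `count_tokens` callable are assumptions; in practice `count_tokens` would be backed by a real tokenizer):

```python
MODEL_CONTEXT_LIMIT = 4096  # gpt-35-turbo (0613)


def check_context(messages, count_tokens, limit=MODEL_CONTEXT_LIMIT):
    """Raise before calling the service if the messages exceed the limit."""
    total = sum(count_tokens(m["content"]) for m in messages)
    if total > limit:
        raise ValueError(
            f"messages total {total} tokens, exceeding the {limit}-token limit"
        )
    return total
```

In the log above, 4339 tokens against a 4096-token limit means at least 243 tokens must be removed from the prompt before the request can succeed.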