MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

The FAQ page says there's a token limit on system message #115811

Closed: iamgyt closed this issue 8 months ago

iamgyt commented 9 months ago

Is there a token limit on the system message?

Is this true? I tested passing more than 3,000 tokens and it worked well; nothing was ignored.



iamgyt commented 9 months ago

I might have misunderstood this. Does this limit only apply to the "Using your data" feature?

AjayBathini-MSFT commented 9 months ago

@iamgyt Thanks for your feedback! We will investigate and update as appropriate.

RamanathanChinnappan-MSFT commented 8 months ago

@iamgyt The token limit for the gpt-35-turbo model is 4096 tokens, which includes the token count from both the prompt and completion. This means that the number of tokens in the prompt combined with the value of the max_tokens parameter must stay under 4096 or you'll receive an error.

Regarding the system message, there is no specific token limit for it, but it is important to keep in mind that the token limit applies to the entire prompt, which includes the system message, examples, message history, and user query. So, if the system message is too long, it reduces the number of tokens available for the rest of the prompt. Regarding your question about the "Using your data" feature, I'm not sure what you're referring to; could you please provide more details?
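
As an illustration of that arithmetic, here is a minimal sketch, assuming the open-source tiktoken package and its cl100k_base encoding (the per-message overhead constant is an approximation, not an official figure), that checks the prompt plus max_tokens stays within the 4096-token window:

```python
# Sketch: verify that prompt tokens + max_tokens fit in the 4096-token context window.
# Assumes the open-source `tiktoken` package; exact per-message framing overhead
# varies by model, so treat the count as an approximation.
import tiktoken

CONTEXT_LIMIT = 4096   # gpt-35-turbo context window (prompt + completion)
MAX_TOKENS = 800       # desired completion budget (the max_tokens parameter)

encoding = tiktoken.get_encoding("cl100k_base")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the Azure OpenAI token limits."},
]

# Rough prompt size: content tokens plus a few tokens of per-message framing.
prompt_tokens = sum(len(encoding.encode(m["content"])) + 4 for m in messages)

if prompt_tokens + MAX_TOKENS > CONTEXT_LIMIT:
    raise ValueError(
        f"Prompt ({prompt_tokens} tokens) + max_tokens ({MAX_TOKENS}) "
        f"exceeds the {CONTEXT_LIMIT}-token limit."
    )
```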

iamgyt commented 8 months ago

Hi @RamanathanChinnappan-MSFT, I know the token limits of each model. What I meant was: check these lines in the FAQ file, articles/ai-services/openai/faq.yml, from line 198 to line 201:

    - question: |
        Is there a token limit on the system message?
      answer:
        Yes, the token limit on the system message is 400. If the system message is more than 400 tokens, the rest of the tokens beyond the first 400 will be ignored.

two points:

  1. Here it says "the token limit on the system message is 400", which I suspect is wrong information. If it IS wrong, you would need to correct it.
  2. If this statement applies only to the "Using your data" feature, you may want to explicitly state that the limit applies only to that feature. Regarding "Using your data", you may refer to line 164 of the FAQ file: articles/ai-services/openai/faq.yml

RamanathanChinnappan-MSFT commented 8 months ago

@iamgyt, Thank you for providing more information. Based on the provided document, it is mentioned that the token limit on the system message is 400. However, it is not clear whether this limit applies to all features or only to a specific feature. As per the FAQ file, the limit of 400 tokens applies to the system message feature. This limit does not apply to other features.

Regarding "Using your data" feature, it is mentioned in line 164 of the FAQ file that the token limit for the input text is 2048 tokens. Therefore, the limit of 400 tokens does not apply to "Using your data" feature.

Please note, the GitHub forum is dedicated to docs-related issues. For any technical queries or clarifications, we encourage you to use the Microsoft Q&A platform; kindly raise your query there.

iamgyt commented 8 months ago

Are you sure there's a token limit on the system message? Per my testing there isn't...

Thank you for providing more information. Based on the provided document, it is mentioned that the token limit on the system message is 400. However, it is not clear whether this limit applies to all features or only to a specific feature. As per the FAQ file, the limit of 400 tokens applies to the system message feature. This limit does not apply to other features.

mrbullwinkle commented 8 months ago

To clarify, this limit applies only to the On Your Data (retrieval augmented generation) feature within Azure OpenAI. This Q&A pair was under the subheading for that feature, but it was not immediately clear that it applies only to that particular feature, so the section has been updated to clarify the distinction. When not using this feature, the system message size is constrained only by the model's input token limit, which applies to the entire messages array containing the system message.
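
For anyone landing here later, here is a minimal sketch of the non-On-Your-Data case using the openai Python SDK's AzureOpenAI client (the endpoint, key, API version, and deployment name are placeholders, not values from this thread). The long system message is just one entry in the messages array, so only the model's overall input token limit constrains it:

```python
# Sketch: a standard chat completion where the system message is one entry in the
# messages array; its size is bounded only by the model's context window, not by
# the 400-token limit that applies to the On Your Data feature.
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com/",  # placeholder
    api_key="<your-api-key>",                                    # placeholder
    api_version="2024-02-01",                                    # example API version
)

# Deliberately long system message (well over 400 tokens).
long_system_message = "You are an assistant. " + "Always follow the style guide. " * 200

response = client.chat.completions.create(
    model="<your-gpt-35-turbo-deployment>",  # deployment name, not the model name
    messages=[
        {"role": "system", "content": long_system_message},
        {"role": "user", "content": "Is any of the system message ignored?"},
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```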