microsoft / teams-ai

SDK focused on building AI based applications and extensions for Microsoft Teams and other Bot Framework channels

[Bug]: max_input_tokens is passed to max_tokens (which is actually the max output tokens) #1800

Open · TylerLu opened this issue 3 months ago

TylerLu commented 3 months ago

Language

C#

Version

latest

Description

I set max_input_tokens to 8000 in config.json of the default prompt:

{
  "schema": 1.1,
  "type": "completion",
  "completion": {
    "model": "gpt-4o",
    "max_input_tokens": 8000,
    "max_tokens": 1024,
    ...
  }
}
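For context, `max_input_tokens` is a client-side budget for the rendered prompt and should never appear on the wire; `max_tokens` is the only token limit that belongs in the request. A sketch of the expected (abridged) chat-completions request body under this config:

```json
{
  "model": "gpt-4o",
  "messages": ["..."],
  "max_tokens": 1024
}
```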

I asked the bot a question and it returned the following error:

This model supports at most 4096 completion tokens, whereas you provided 8000.
Status: 400 (model_error)
Content: 
{
  "error": {
    "message": "max_tokens is too large: 8000. This model supports at most 4096 completion tokens, whereas you provided 8000.",
    "type": null,
    "param": "max_tokens",
    "code": null
  }
}

Below is the request and response captured by Fiddler (screenshot not reproduced here).
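Reconstructed from the error above, the captured request body carried the misrouted value, roughly (abridged):

```json
{
  "model": "gpt-4o",
  "messages": ["..."],
  "max_tokens": 8000
}
```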

According to OpenAI's documentation, `max_tokens` is "the maximum number of tokens that can be generated in the chat completion". So `max_tokens` is actually the maximum number of output tokens, and `max_input_tokens` should not be passed to it.
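Concretely, the fix is a matter of which config field feeds the request's `max_tokens`. A minimal sketch of the correct mapping (type and method names are illustrative, not the actual SDK internals):

```csharp
using System.Collections.Generic;

// Illustrative types only -- not the actual teams-ai internals.
public sealed class CompletionConfig
{
    public int MaxInputTokens { get; init; } // prompt budget, enforced client-side
    public int MaxTokens { get; init; }      // completion cap, sent to the API
}

public static class ChatRequestBuilder
{
    public static Dictionary<string, object> Build(CompletionConfig config, object messages)
    {
        return new Dictionary<string, object>
        {
            ["model"] = "gpt-4o",
            ["messages"] = messages,
            // Bug: the SDK currently forwards config.MaxInputTokens here.
            // Correct: max_tokens is the *output* cap, so it must come from MaxTokens;
            // MaxInputTokens should only be used when rendering/truncating the prompt.
            ["max_tokens"] = config.MaxTokens
        };
    }
}
```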

Related code:

Reproduction Steps

1. Set `max_input_tokens` to 8000 in config.json of the default prompt.
2. Debug the bot.
3. Ask the bot a question.
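Until a fix lands, one possible workaround (assuming the misrouting described above) is to keep `max_input_tokens` at or below the model's completion-token cap, so the misrouted value is still accepted by the API, e.g.:

```json
{
  "schema": 1.1,
  "type": "completion",
  "completion": {
    "model": "gpt-4o",
    "max_input_tokens": 4096,
    "max_tokens": 1024
  }
}
```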
singhk97 commented 2 months ago

Good catch @TylerLu, we'll push a fix soon.

TylerLu commented 2 months ago

@singhk97 Thank you for the response! I appreciate your efforts in fixing this.

hemanthaar commented 2 months ago

We are also facing the same error and traced it to the issue described above. Is there an ETA for the fix? Please let us know. Thanks.