stuartleeks / aoai-simulated-api

An exploration into creating a simulated API implementation for Azure OpenAI (AOAI)
MIT License

Determine the number of tokens to use in generated chat completion responses #15

Open stuartleeks opened 3 months ago

stuartleeks commented 3 months ago

Currently the chat completion generator uses 250 tokens for each generated response.

Is this a reasonable size? Should it be configurable, and does it need to vary?
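If the count should vary rather than stay fixed at 250, one option is to sample it around a target value. A minimal sketch (function and parameter names here are hypothetical, not part of the simulator's actual code):

```python
import random

def choose_completion_tokens(max_tokens: int,
                             mean_fraction: float = 0.5,
                             jitter_fraction: float = 0.1) -> int:
    """Pick a token count for a generated completion.

    Illustrative only: sample from a normal distribution centred on
    mean_fraction * max_tokens, then clamp to [1, max_tokens].
    """
    mean = max_tokens * mean_fraction
    sampled = random.gauss(mean, max_tokens * jitter_fraction)
    return max(1, min(max_tokens, round(sampled)))
```

Clamping keeps the result valid even for extreme samples, so callers never see a count above `max_tokens` or below 1.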

stuartleeks commented 2 months ago

I think it would be good to add configuration to control the number of tokens in generated completion responses, e.g. a mean percentage of max_tokens, or similar.

Should this be a single config value applied to all endpoints, or configured per deployment in the deployment JSON file?
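One way to get both: a per-deployment setting in the deployment JSON file that falls back to a single global default when absent. A sketch, assuming a hypothetical `mean_completion_fraction` key (the key name and default are illustrative, not decided):

```python
# Illustrative global default: half of max_tokens on average.
DEFAULT_MEAN_COMPLETION_FRACTION = 0.5

def resolve_mean_fraction(deployments: dict, deployment_name: str) -> float:
    """Look up the per-deployment setting, falling back to the global default.

    `deployments` mirrors the deployment JSON file: a mapping from
    deployment name to its config dict. Key names are hypothetical.
    """
    entry = deployments.get(deployment_name, {})
    return entry.get("mean_completion_fraction",
                     DEFAULT_MEAN_COMPLETION_FRACTION)
```

This keeps a single-value configuration as the simple case while letting individual deployments override it when their workloads differ.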