Ability to specify differing token limits for inputs and outputs

DNGros / lmwrapper

An object-oriented wrapper around language models (like openai endpoints or huggingface)


Ability to specify differing token limits for inputs and outputs #25

Open DNGros opened 9 months ago

DNGros commented 9 months ago

In ba839a47dfe9 we added support for GPT-4-Turbo. However, GPT-4-Turbo differs from the other models: it has a large input limit (128,000 tokens) but a much smaller output limit (4096 tokens). We currently have no way to represent this, so features such as checking whether a prompt will exceed the limit and prompt trimming will not behave as expected.
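One possible shape for this, as a rough sketch only: the names `TokenLimits`, `prompt_fits`, `clamp_output`, and `trim_prompt` are hypothetical and are not part of lmwrapper's current API. It assumes the input and output limits are tracked as two independent numbers, and that trimming keeps the start of the prompt, which is just one possible policy.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TokenLimits:
    """Hypothetical container for per-model limits (not lmwrapper API)."""
    max_input_tokens: int
    max_output_tokens: int


# Values taken from the issue text above.
GPT_4_TURBO_LIMITS = TokenLimits(max_input_tokens=128_000, max_output_tokens=4096)


def prompt_fits(limits: TokenLimits, prompt_token_count: int) -> bool:
    """Check the prompt against the input limit only."""
    return prompt_token_count <= limits.max_input_tokens


def clamp_output(limits: TokenLimits, requested_tokens: int) -> int:
    """Cap a requested completion length at the output limit."""
    return min(requested_tokens, limits.max_output_tokens)


def trim_prompt(limits: TokenLimits, prompt_tokens: List[int]) -> List[int]:
    """Drop tokens beyond the input limit (assumed policy: keep the start)."""
    return prompt_tokens[: limits.max_input_tokens]
```

With something like this, a 100,000-token prompt would pass the input check, while a request for 8,000 output tokens would be capped at 4096 instead of being judged against the input limit.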