microsoft / azurechat

🤖 💼 Azure Chat Solution Accelerator powered by Azure Open AI Service
MIT License

Bing Search extension exceeds the model's maximum token limit #320

Open sonphnt opened 4 months ago

sonphnt commented 4 months ago

Hi. First of all, this extension works fine on the GPT-4 model, but GPT-4's performance is very slow at the moment. So I switched to GPT-3.5 1106 instead, but I got this error message for the simple question "give me azure latest news":

"This model's maximum context length is 16187 tokens, however you requested 16386 tokens (16038 in your prompt; 348 for the completion). Please reduce your prompt; or completion length."

I am not quite sure why it uses more than 16K tokens for a single search; it probably passes the entire Bing result to the GPT-3.5 model. How can we limit the number of results we request from the Bing API? Also, it would be nice if the extension settings let us specify which fields to include or exclude, since not passing unneeded fields to the model would greatly reduce token consumption.
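The idea above can be sketched as a small post-processing step. This is a hypothetical helper, not azurechat's actual extension code: `trimBingResults` and its defaults are assumptions, and it simply caps the number of results and drops every field except a whitelist before anything is handed to the model. (The Bing Web Search API also accepts a `count` query parameter to limit results server-side.)

```typescript
// Hypothetical sketch: shrink a Bing Web Search response before it reaches
// the model, to stay under the context-window limit.
type BingWebPage = Record<string, unknown>;

function trimBingResults(
  pages: BingWebPage[],
  count: number,
  // Assumed default whitelist; adjust to whatever the prompt actually needs.
  fields: string[] = ["name", "url", "snippet"]
): BingWebPage[] {
  return pages.slice(0, count).map((page) => {
    const slim: BingWebPage = {};
    for (const f of fields) {
      if (f in page) slim[f] = page[f]; // copy only whitelisted fields
    }
    return slim;
  });
}
```

Fewer results and fewer fields per result both cut the prompt size roughly linearly, which is usually enough to fit a 16K-context model.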

Kirchen99 commented 4 months ago

I have also hit the token-limit problem when using gpt-35-turbo 0301 for a simple Document Intelligence query, "Please describe the document".

The document is a normal, human-readable PDF file with 43 pages.
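A 43-page PDF can easily exceed a 16K context window on its own. One common workaround, sketched below, is to split the extracted text into chunks that each fit the model's budget. This is an illustrative sketch, not azurechat's actual retrieval logic: `chunkForContext` is a hypothetical helper, and the ~4 characters-per-token ratio is a rough English-text heuristic, not an exact tokenizer.

```typescript
// Rough sketch: split extracted document text into context-sized chunks,
// using an approximate characters-per-token ratio instead of a real tokenizer.
function chunkForContext(
  text: string,
  maxTokens: number,
  charsPerToken = 4 // rough heuristic for English text; an assumption
): string[] {
  const maxChars = maxTokens * charsPerToken;
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}
```

Each chunk can then be summarized separately and the summaries combined, rather than sending the whole document in one prompt.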