microsoft / azurechat

🤖 💼 Azure Chat Solution Accelerator powered by Azure Open AI Service
MIT License

Bing Search extension exceeds the model's maximum token limit #320

Open sonphnt opened 4 months ago

sonphnt commented 4 months ago

Hi. First of all, this extension works fine on the GPT-4 model, but GPT-4's performance is very slow at the moment. So I switched to GPT-3.5 1106 instead, but I got this error message for the simple question "give me azure latest news":

"This model's maximum context length is 16187 tokens, however you requested 16386 tokens (16038 in your prompt; 348 for the completion). Please reduce your prompt; or completion length."

I am not quite sure why it uses more than 16K tokens for a single search; it probably passes the entire Bing result to the GPT-3.5 model. How can we limit the number of results we request from the Bing API? Also, it would be nice if the extension settings let us specify which fields to include or exclude, since not passing unneeded fields to the model would greatly reduce token consumption.
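The idea above can be sketched as a small post-processing step. This is a hypothetical helper, not azurechat's actual extension code: `trimBingResults` and its defaults are assumptions, and it simply caps the number of results and drops every field except a whitelist before anything is handed to the model. (The Bing Web Search API also accepts a `count` query parameter to limit results server-side.)

```typescript
// Hypothetical sketch: shrink a Bing Web Search response before it reaches
// the model, to stay under the context-window limit.
type BingWebPage = Record<string, unknown>;

function trimBingResults(
  pages: BingWebPage[],
  count: number,
  // Assumed default whitelist; adjust to whatever the prompt actually needs.
  fields: string[] = ["name", "url", "snippet"]
): BingWebPage[] {
  return pages.slice(0, count).map((page) => {
    const slim: BingWebPage = {};
    for (const f of fields) {
      if (f in page) slim[f] = page[f]; // copy only whitelisted fields
    }
    return slim;
  });
}
```

Fewer results and fewer fields per result both cut the prompt size roughly linearly, which is usually enough to fit a 16K-context model.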

Kirchen99 commented 4 months ago

I have also hit the token-limit problem when using gpt-35-turbo 0301 for a simple Document Intelligence query, "Please describe the document".

The document is a normal, human-readable PDF file with 43 pages.
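A 43-page PDF can easily exceed a 16K context window on its own. One common workaround, sketched below, is to split the extracted text into chunks that each fit the model's budget. This is an illustrative sketch, not azurechat's actual retrieval logic: `chunkForContext` is a hypothetical helper, and the ~4 characters-per-token ratio is a rough English-text heuristic, not an exact tokenizer.

```typescript
// Rough sketch: split extracted document text into context-sized chunks,
// using an approximate characters-per-token ratio instead of a real tokenizer.
function chunkForContext(
  text: string,
  maxTokens: number,
  charsPerToken = 4 // rough heuristic for English text; an assumption
): string[] {
  const maxChars = maxTokens * charsPerToken;
  const chunks: string[] = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}
```

Each chunk can then be summarized separately and the summaries combined, rather than sending the whole document in one prompt.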