ItzCrazyKns / Perplexica

Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.
MIT License

How to reduce latency? #464

Open babytdream opened 1 week ago

babytdream commented 1 week ago

Hello, I have reduced the number of web pages to be crawled from 15 to 3, but responses are still very slow, taking about 10 seconds each time. My model is GPT-4, which has low latency. How can I solve this problem? Thank you!

Here are my search settings: the focus mode is set to All, and I am using speed mode rather than embedding-based (vector) search.

Could it be due to web search? Do I need to configure a proxy? My server is located in China.

[Screenshot: search settings]
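
One quick way to check whether the web search step or the model call dominates the 10 seconds is to time each step separately. The sketch below is only illustrative; the callees named in the usage comments are placeholders, not Perplexica's actual API.

```ts
// Minimal timing helper (illustrative sketch): wrap any async step to see
// how long it takes, e.g. the SearxNG request vs. the chat-model call.
async function timed<T>(label: string, step: () => Promise<T>): Promise<T> {
  const start = Date.now();
  try {
    return await step();
  } finally {
    console.log(`${label}: ${Date.now() - start} ms`);
  }
}

// Usage (the callees are placeholders for whatever Perplexica invokes internally):
// const results = await timed('web search', () => searchSearxng(query));
// const answer  = await timed('llm', () => llm.invoke(prompt));
```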

omerarshad commented 1 week ago

It could be due to embedding-based retrieval and context creation.
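
For context, an embedding-based rerank step generally looks something like the sketch below: the query and every candidate page must be sent to the embedding model before anything can be ranked, which is where the extra latency comes in. This is a generic illustration rather than Perplexica's actual code; `embed` stands in for something like a LangChain `embedDocuments` call.

```ts
// Generic embedding-based rerank sketch: every candidate document is
// embedded (extra API round trips) before similarity can be computed.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

async function rerank(
  query: string,
  docs: string[],
  embed: (texts: string[]) => Promise<number[][]>, // e.g. an embedDocuments-style call
  topK = 3,
): Promise<string[]> {
  const [queryVec] = await embed([query]); // one extra network call for the query
  const docVecs = await embed(docs);       // cost grows with the number of pages
  return docVecs
    .map((vec, i) => ({ i, score: cosineSimilarity(queryVec, vec) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(({ i }) => docs[i]);
}
```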

babytdream commented 1 week ago

@omerarshad I am using speed mode; it never uses vector embeddings. [Screenshot: optimization mode set to Speed]

ItzCrazyKns commented 1 week ago

10 seconds seems off; I get results with the same configuration in 5-7 seconds. Can you provide some more details?

babytdream commented 1 week ago

[Screenshot: focus mode options, tested modes highlighted in red]

@ItzCrazyKns I tested the modes highlighted in the red box above with the same question, and those modes return results within 1-2 seconds even though they fetch 15 web pages. However, when I use the All mode, it generally takes around 10 seconds. The only change I made is in src/agents/webSearchAgent.ts, reducing the number of fetched web pages from 15 to 3, i.e. `if (optimizationMode === 'speed') { return docsWithContent.slice(0, 3); }`. My test question is: "What game is YJWJ? Output only the game name, e.g. '英雄联盟' (League of Legends), and nothing else." [screenshot]
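
For reference, the modified branch is roughly the fragment below (sketch only, surrounding function omitted): in speed mode the documents are simply sliced without any embedding-based reranking, so only 3 pages reach the model.

```ts
// Fragment of the modified branch in src/agents/webSearchAgent.ts:
// speed mode skips embedding-based reranking and just takes the first
// few documents, so only 3 pages end up in the model's context.
if (optimizationMode === 'speed') {
  return docsWithContent.slice(0, 3); // originally slice(0, 15)
}
```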
