ItzCrazyKns / Perplexica

Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.
MIT License

How to reduce latency? #464

Open babytdream opened 1 week ago

babytdream commented 1 week ago

Hello, I have reduced the number of web pages to be crawled from 15 to 3, but responses are still very slow, taking about 10 seconds each time. My model is GPT-4, which has low latency. How can I solve this problem? Thank you!

Here are my search settings: the focus mode is set to All, and I am using speed mode rather than embedding-based (vector) search.

Could it be due to web search? Do I need to configure a proxy? My server is located in China.

[Screenshot: search settings]
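
One quick way to check whether the web search step or the model call dominates the 10 seconds is to time each step separately. The sketch below is only illustrative; the callees named in the usage comments are placeholders, not Perplexica's actual API.

```ts
// Minimal timing helper (illustrative sketch): wrap any async step to see
// how long it takes, e.g. the SearxNG request vs. the chat-model call.
async function timed<T>(label: string, step: () => Promise<T>): Promise<T> {
  const start = Date.now();
  try {
    return await step();
  } finally {
    console.log(`${label}: ${Date.now() - start} ms`);
  }
}

// Usage (the callees are placeholders for whatever Perplexica invokes internally):
// const results = await timed('web search', () => searchSearxng(query));
// const answer  = await timed('llm', () => llm.invoke(prompt));
```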

omerarshad commented 1 week ago

It could be due to embedding-based retrieval and context creation.
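
For context, an embedding-based rerank step generally looks something like the sketch below: the query and every candidate page must be sent to the embedding model before anything can be ranked, which is where the extra latency comes in. This is a generic illustration rather than Perplexica's actual code; `embed` stands in for something like a LangChain `embedDocuments` call.

```ts
// Generic embedding-based rerank sketch: every candidate document is
// embedded (extra API round trips) before similarity can be computed.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

async function rerank(
  query: string,
  docs: string[],
  embed: (texts: string[]) => Promise<number[][]>, // e.g. an embedDocuments-style call
  topK = 3,
): Promise<string[]> {
  const [queryVec] = await embed([query]); // one extra network call for the query
  const docVecs = await embed(docs);       // cost grows with the number of pages
  return docVecs
    .map((vec, i) => ({ i, score: cosineSimilarity(queryVec, vec) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(({ i }) => docs[i]);
}
```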

babytdream commented 1 week ago

@omerarshad I am using speed mode; it never uses vector embeddings. [Screenshot: optimization mode set to Speed]

ItzCrazyKns commented 1 week ago

10 seconds seems off; I get results with the same configuration in 5-7 seconds. Can you provide some more details?

babytdream commented 1 week ago

[Screenshot: focus mode options, tested modes highlighted in red]

@ItzCrazyKns I tested the modes highlighted in the red box above with the same question, and those modes return results within 1-2 seconds even though they fetch 15 web pages. However, when I use the All mode, it generally takes around 10 seconds. The only change I made is in src/agents/webSearchAgent.ts, reducing the number of fetched web pages from 15 to 3, i.e. `if (optimizationMode === 'speed') { return docsWithContent.slice(0, 3); }`. My test question is: "What game is YJWJ? Output only the game name, e.g. '英雄联盟' (League of Legends), and nothing else." [screenshot]
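
For reference, the modified branch is roughly the fragment below (sketch only, surrounding function omitted): in speed mode the documents are simply sliced without any embedding-based reranking, so only 3 pages reach the model.

```ts
// Fragment of the modified branch in src/agents/webSearchAgent.ts:
// speed mode skips embedding-based reranking and just takes the first
// few documents, so only 3 pages end up in the model's context.
if (optimizationMode === 'speed') {
  return docsWithContent.slice(0, 3); // originally slice(0, 15)
}
```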
