Investigate mechanisms to reduce latency

ClemensGruber / climart-gptree

MIT License

Investigate mechanisms to reduce latency #24

Open ClemensGruber opened 7 months ago

ClemensGruber commented 7 months ago

Todo

- Check whether any hint in the OpenAI docs > Improving latencies https://platform.openai.com/docs/guides/production-best-practices/improving-latencies fits our problem
- 2
- 31

Already tested, with little effect on the latency of the entire system

- 4
- 30
- 27

Workaround

- 28

ClemensGruber commented 6 months ago

Good catch:

OpenAI API and other LLM APIs response time tracker https://gptforwork.com/tools/openai-api-and-other-llm-apis-response-time-tracker

Generally, GPT-4 seems to have more latency than GPT-3.5, and the most interesting part: response time is around 2 sec with GPT-3.5, while we have to wait 5 sec with GPT-4!

We see waiting times of 20 to 40 seconds with GPTree, so the bigger problem seems to be in our implementation!
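To find out where the remaining time goes, one option is to time each stage of a request separately. The following is only a minimal sketch: the stage names (`input`, `llm_request`, `output`) and the `time.sleep` placeholders are hypothetical and do not reflect the actual GPTree code. They would need to be replaced with the real calls.

```python
import time
from contextlib import contextmanager

# Collected wall-clock durations per stage, in seconds.
timings = {}


@contextmanager
def timed(stage):
    """Add the duration of the wrapped block to the given stage name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[stage] = timings.get(stage, 0.0) + elapsed


def run_pipeline():
    # Hypothetical stages: the sleeps stand in for the real work.
    with timed("input"):
        time.sleep(0.01)
    with timed("llm_request"):
        time.sleep(0.10)
    with timed("output"):
        time.sleep(0.02)


def report(durations):
    """Return (stage, seconds) pairs sorted from slowest to fastest."""
    return sorted(durations.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    run_pipeline()
    for stage, seconds in report(timings):
        print(f"{stage:12s} {seconds:6.2f} s")
```

If the stage wrapping the API call accounts for only a few seconds of the total, the rest of the delay sits in the surrounding stages, which is where further investigation would pay off.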
