langchain-ai / chat-langchain

https://chat.langchain.com
MIT License

Controlling queries per second (QPS) for LLMs on the embedding or chat API #303

Open jayaraj opened 4 months ago

jayaraj commented 4 months ago

Hi,

I am trying to use the chat application with together.ai for creating embeddings and LLM queries. While creating embeddings from the documents, I get:

INFO:httpx:HTTP Request: POST https://api.together.ai/v1/embeddings "HTTP/1.1 429 Too Many Requests"

because my QPS limit is 1 on the free tier.

Can anyone help me configure how many queries per second are sent to these APIs?

Thanks
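
For reference, this is the kind of client-side throttling I have in mind. It is only a minimal sketch using the Python standard library, not something chat-langchain provides: `MinIntervalThrottle`, `embed_in_batches`, and the `embed_fn` callable are hypothetical names introduced for illustration. The idea is to space requests at least 1/QPS seconds apart so the provider never sees more than the allowed rate.

```python
import time


class MinIntervalThrottle:
    """Hypothetical helper: keeps successive calls at least 1/qps seconds apart."""

    def __init__(self, qps, clock=time.monotonic, sleep=time.sleep):
        if qps <= 0:
            raise ValueError("qps must be positive")
        self.min_interval = 1.0 / qps
        self._clock = clock    # injectable so the throttle can be tested without waiting
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed, then reserve the next slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._clock()  # re-read in case the sleep overshot
        self._next_allowed = now + self.min_interval


def embed_in_batches(texts, embed_fn, throttle, batch_size=1):
    """Call embed_fn (assumed: list of str -> list of vectors) once per batch,
    waiting on the throttle before every request."""
    vectors = []
    for start in range(0, len(texts), batch_size):
        throttle.wait()
        vectors.extend(embed_fn(texts[start:start + batch_size]))
    return vectors
```

With a limit of 1 QPS this would be used as `embed_in_batches(docs, embed_fn, MinIntervalThrottle(qps=1.0))`, where `embed_fn` could be something like the `embed_documents` method of whichever embeddings object is configured (an assumption about the setup). Sending larger batches per request reduces the number of requests, provided the provider accepts them.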