Closed gameveloster closed 8 months ago
Perhaps not quite the same scenario, but I'm getting exactly the same error when running the VectorDB Question Answering with Sources example.
Perhaps adding some exponential backoff, as OpenAI recommends?
I ran into rate limits when using `FAISS.from_texts`
on one markdown file with ~800 lines with the Question Answering with Sources sample. I worked around it like this. Posting in case it is useful for other users:

```python
import time

import tqdm

def chunks(lst, n):
    # https://stackoverflow.com/a/312464/18903720
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]

text_chunks = chunks(texts, 20)  # adjust 20 based on your average character count per line
docsearch = None
for index, chunk in tqdm.tqdm(enumerate(text_chunks)):
    if index == 0:
        # Build the index from the first chunk only (not the full `texts`,
        # which would embed everything at once and hit the rate limit again).
        docsearch = FAISS.from_texts(chunk, embeddings)
    else:
        time.sleep(60)  # wait a minute so we don't exceed any rate limits
        docsearch.add_texts(chunk)
```
Didn't work for me. Did OpenAI change something or am I missing something here?
Can you please help me?
Same for me today with the example at https://python.langchain.com/en/latest/use_cases/code/code-analysis-deeplake.html
Is there a way to integrate a solution into the example code to avoid it?
Still having the same issue. I tried something like this:

```python
embeddings = OpenAIEmbeddings()
vector_store = FAISS.from_texts(texts=["example1", "example2"], embedding=embeddings)
```

and

```python
vector_store = Chroma.from_texts(texts=["example1", "example2"], embedding=embeddings)
```

Got:

```
Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised RateLimitError: You exceeded your current quota, please check your plan and billing details..
```
I'm passing a list with a length of 2, and it is still giving me a `RateLimitError`.
I tried two versions of LangChain, 0.0.162 and 0.0.188, and both failed with the same error.
I am running into the same issue when using the function `Chroma.from_texts`.
Did anyone manage to come up with a solution that gets around the rate limit?
Thinking of looping through the texts in a try/except block and sleeping when the rate limit is reached, then retrying.
Any solution?
Is this the same issue you guys are getting?

```
Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised RateLimitError: You exceeded your current quota, please check your plan and billing details..
```
yes
@getsean @ImcLiuQian (and anyone else getting "You exceeded your current quota" in the error message): this has nothing to do with the original question. Please see https://github.com/langchain-ai/langchain/issues/11914 instead.
The solution is to implement exponential backoff, or just a simple 10-second wait: use a try/except block and, when the exception is hit, wait 10 seconds before running the function again.
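A minimal sketch of that backoff idea. Note `embed_with_backoff` and its parameters are hypothetical names, not LangChain or OpenAI API; the except clause matches on the exception's name/message because the exact exception class depends on your openai version:

```python
import random
import time

def embed_with_backoff(fn, *args, max_retries=6, base_delay=10.0, **kwargs):
    """Call fn, retrying with exponential backoff when a rate-limit error is raised."""
    for attempt in range(max_retries):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            # Ideally catch openai.RateLimitError specifically; this sketch
            # matches on the exception name/message instead.
            is_rate_limit = ("rate limit" in str(exc).lower()
                             or "RateLimit" in type(exc).__name__)
            if not is_rate_limit:
                raise
            # 10s, 20s, 40s, ... plus a little jitter
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
    raise RuntimeError(f"still rate limited after {max_retries} retries")
```

You would then wrap the failing call, e.g. `embed_with_backoff(FAISS.from_texts, texts, embeddings)`.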
> I ran into rate limits when using `FAISS.from_texts` on one markdown file with ~800 lines with the Question Answering with Sources sample. I worked around it like this. […]
Is there a way to do the same for `FAISS.from_documents()`?
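One possible way to apply the same batching trick to documents (a sketch, not tested against LangChain; `docs`, `embeddings`, and `FAISS` are assumed to come from your own setup, and `batched` is a small helper, not a LangChain API):

```python
import time

def batched(seq, size):
    """Yield successive size-length slices of seq."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

# Sketch (assumes `docs` is a list of Document objects and `embeddings`
# is an OpenAIEmbeddings instance, as in the examples above):
#
# it = batched(docs, 20)
# store = FAISS.from_documents(next(it), embeddings)  # build index from the first batch
# for batch in it:
#     time.sleep(60)              # pause between batches to stay under the limit
#     store.add_documents(batch)
```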
I tried the method below and it works for me:

```python
vector_store = <your_vector_store>
documents = loader.load()  # any loader that you used
for text in documents:
    vector_store.add_documents([text])
```
I'm getting an OpenAI `RateLimitError` when embedding my chunked texts with `"text-embedding-ada-002"`, which I have rate limited to 8 chunks of <1024 every 15 secs. Every 15 seconds, I'm calling this once. The chunks list `chunked` was created using […].

Why is my request rate exceeding 70/min when I'm only embedding at ~32 chunks/min? Does each chunk take more than 1 request to process?

Any way to better rate limit my embedding queries? Thanks
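For client-side throttling, one option is to wrap each embedding call in a simple sliding-window limiter. This is a sketch, not LangChain API (`RateLimiter` is a hypothetical helper); also note that the client may split one logical call into several HTTP requests and retry internally, which can inflate the measured request rate:

```python
import time
from collections import deque

class RateLimiter:
    """Block so that at most max_calls start within any `period` seconds."""

    def __init__(self, max_calls, period):
        self.max_calls = max_calls
        self.period = period
        self.calls = deque()  # monotonic timestamps of recent calls

    def wait(self):
        now = time.monotonic()
        # Drop timestamps that have fallen out of the window.
        while self.calls and now - self.calls[0] > self.period:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            # Sleep until the oldest call ages out of the window.
            time.sleep(self.period - (now - self.calls[0]))
        self.calls.append(time.monotonic())

# Usage sketch: limiter = RateLimiter(8, 15.0); then limiter.wait()
# immediately before each embedding request.
```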