parthsarthi03 / raptor

The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
https://arxiv.org/abs/2401.18059
MIT License
878 stars · 126 forks

When the text gets longer, the embedding API does not seem to work. #19

Open hippoley opened 6 months ago

parthsarthi03 commented 6 months ago

Are you using the default OpenAI embeddings? Is there an error message or is the code stalling?

hippoley commented 6 months ago

I have noticed that you have already added retry decorators, but the 429 response is still being triggered. Only some open-source embedding methods seem to work.

```python
@retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(6))
def create_embedding(self, text):
    text = text.replace("\n", " ")
    return (
        self.client.embeddings.create(input=[text], model=self.model)
        .data[0]
        .embedding
    )
```
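For reference, one limitation of the decorator above is that it retries on *any* exception, not just rate limits, so genuine failures are also retried. A stdlib-only sketch of the same backoff idea, narrowed to 429-style errors (the `RateLimitError` stub and `flaky_embed` are stand-ins for the real OpenAI client, not repo code):

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for openai.RateLimitError (HTTP 429)."""


def with_backoff(fn, max_attempts=6, base=0.01, cap=0.05):
    """Retry fn() with capped, jittered exponential backoff, only on 429s."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            time.sleep(min(cap, base * 2 ** attempt) * random.random())


calls = {"n": 0}


def flaky_embed():
    """Hypothetical embedding call: fails twice with 429, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return [0.0, 0.0, 0.0]  # dummy embedding vector


vec = with_backoff(flaky_embed)
print(calls["n"], len(vec))  # → 3 3 (third attempt succeeds)
```

Any other exception type propagates immediately instead of being silently retried.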

If I shorten the text, the embedding API works. However, when the text gets longer, the log fills with repeated 429 responses:

```
2024-03-15 07:46:21,911 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,912 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,912 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,913 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,916 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,916 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,917 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,918 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,922 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,925 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,926 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,928 - Retrying request to /embeddings in 20.000000 seconds
2024-03-15 07:46:21,930 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,933 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
2024-03-15 07:46:21,934 - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 429 Too Many Requests"
```

parthsarthi03 commented 6 months ago

It seems like you're running into rate limiting issues due to OpenAI's API. You could request a rate limit increase from OpenAI if they allow it.

Also, currently, we utilize multithreading to build the leaf nodes. You can switch off multithreading, which will make it slower but should help avoid hitting the rate limits.

To make this change, update the following line in raptor/RetrievalAugmentation.py:

https://github.com/parthsarthi03/raptor/blob/2e3e83e5c4aa6a9b5f2d8359f5b71a9159c20845/raptor/RetrievalAugmentation.py#L219

From:

```python
self.tree = self.tree_builder.build_from_text(text=docs)
```

To:

```python
self.tree = self.tree_builder.build_from_text(text=docs, use_multithreading=False)
```

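Another mitigation, beyond disabling multithreading entirely (my suggestion, not something the repo provides), is a client-side throttle that spaces out embedding requests across threads so parallel workers can't burst past the rate limit. A minimal thread-safe sketch, with a dummy return value in place of the real API call:

```python
import threading
import time


class Throttle:
    """Enforce a minimum interval between calls, shared across threads."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self._lock = threading.Lock()
        self._next_ok = 0.0  # monotonic time at which the next call may start

    def wait(self):
        with self._lock:
            now = time.monotonic()
            delay = max(0.0, self._next_ok - now)
            self._next_ok = max(now, self._next_ok) + self.min_interval
        if delay:
            time.sleep(delay)


throttle = Throttle(min_interval=0.02)  # ~50 requests/second; tune to your quota


def create_embedding(text):
    throttle.wait()       # block until a request slot is available
    return [0.0] * 3      # placeholder for the real embeddings call


start = time.monotonic()
for _ in range(5):
    create_embedding("chunk")
elapsed = time.monotonic() - start
print(elapsed >= 0.08)    # → True: 5 calls need at least 4 full intervals
```

Calling `throttle.wait()` at the top of the embedding method keeps multithreaded tree building usable while capping the request rate.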
hippoley commented 6 months ago


Thanks! Following your instructions, I changed that line. It still works for short contexts, but with longer ones the timeouts occur again.

cuichenxu commented 5 months ago

Strange! It works fine when I run the demo the first time, but when I rerun it, an error occurs; the text in the demo seems too long for RAPTOR. I can run it successfully after cutting out part of the content.

EmbedModel, QAModel and SummModel are all custom.

Following the suggestion above, the problem was solved.

Update: with multithreading switched off, building a tree from a story extracted from the NarrativeQA dataset takes far too long. Any ideas on how to fix this? @parthsarthi03