aws-samples / bedrock-claude-chat

AWS-native chatbot using Bedrock + Claude (+Mistral)
MIT No Attribution

[BUG] Slow retrieval of documents #305

Open jeremylatorre opened 1 month ago

jeremylatorre commented 1 month ago

Describe the bug

When using RAG, I noticed that performance was very poor when trying to retrieve information from documents. Any lead on what is causing this? The difference between the RAG bot and the standard bot is huge right now. It might also depend on the number of documents.

statefb commented 1 month ago

Could you provide more detail? Please do not ignore the issue template.

jeremylatorre commented 1 month ago

When I use my bot, it sometimes takes 17 s before it starts to render the content.

The first step [Retrieve Knowledge] looks OK, but after that the cursor stays at the beginning of the line for a very long time before the response starts to appear.

This doesn't happen when using the LLM without documents, and it seems to get worse as the number of documents in the knowledge base grows.

statefb commented 1 month ago

Could you check the items in Postgres? You can query it from the management console, and the table definition can be referred to here.
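As a starting point, a minimal sketch of such a check. The table and column names here (`items`, `botid`) are assumptions for illustration; the real definition is in the linked schema. `psycopg2` and a reachable Aurora endpoint are assumed.

```python
def build_count_query(table: str = "items") -> str:
    """Build a parameterized count query; the table name is an assumption."""
    return f"SELECT count(*) FROM {table} WHERE botid = %s"

def count_embedded_items(dsn: str, bot_id: str) -> int:
    """Count the embedded document chunks stored for one bot."""
    import psycopg2  # imported lazily so the sketch can be read without the driver
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute(build_count_query(), (bot_id,))
            return cur.fetchone()[0]
```

A large count here would support the theory that retrieval latency grows with the number of stored chunks.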

jeremylatorre commented 1 month ago

Could it be related to the instantiation of a Lambda function? The behavior only appears on the first inference of a new conversation.

statefb commented 1 month ago

Have you forked and customized this sample? A large container image causes a longer cold start. You can check whether an invocation is a cold start by referring to the CloudWatch logs:
https://stackoverflow.com/questions/47061146/how-can-i-detect-cold-starts-in-aws-lambda

jeremylatorre commented 3 weeks ago

OK, I have dug a little into the logs and found that the POST call to related-documents is itself very long:

Request POST /conversation/related-documents (15:15:32 - 15:15:44) Duration : 11697 ms

I'm not currently up to date with the codebase, and we will perform a full update soon. I'll keep you posted on this.

statefb commented 3 weeks ago

@jeremylatorre Thank you for the update. The query to Postgres seems to be the source of the long latency. For cost effectiveness, the ACU is configured at the minimum value by default. Choosing a larger value might solve the problem.
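For reference, a hedged sketch of raising the ACU range on the Aurora Serverless v2 cluster via boto3 (the cluster identifier below is hypothetical; Aurora Serverless v2 allows a minimum of 0.5 ACU):

```python
def serverless_v2_scaling(min_acu: float, max_acu: float) -> dict:
    """Build the ServerlessV2ScalingConfiguration structure for modify_db_cluster."""
    if not (0.5 <= min_acu <= max_acu):
        raise ValueError("min ACU must be >= 0.5 and <= max ACU")
    return {"MinCapacity": min_acu, "MaxCapacity": max_acu}

# Sketch only -- the cluster identifier is a placeholder for your deployment:
# import boto3
# rds = boto3.client("rds")
# rds.modify_db_cluster(
#     DBClusterIdentifier="my-vectorstore-cluster",
#     ServerlessV2ScalingConfiguration=serverless_v2_scaling(1.0, 4.0),
#     ApplyImmediately=True,
# )
```

If the deployment is managed through CDK, the equivalent change belongs in the stack's scaling configuration rather than an out-of-band API call, so it is not overwritten on the next deploy.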