viraptor opened this issue 4 months ago (status: Open)
Hi viraptor,
Perhaps it took a while to perform OCR on your document. How much compute / memory is available to your Docker container?
The PDF you shared is rather large, so I'd advise splitting it into smaller pieces to allow for more parallelization (e.g. so that OCR doesn't become a bottleneck).
We are working on a more efficient OCR step, but that will take a few weeks to months.
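As a rough illustration of the splitting suggested above, here is a minimal sketch that cuts a large PDF into fixed-size page chunks. It assumes the third-party pypdf package is installed. The function names, the 50-page default, and the output file naming are hypothetical choices for this example and are not part of R2R.

```python
from pathlib import Path


def page_ranges(total_pages: int, chunk_size: int) -> list[tuple[int, int]]:
    """Return half-open (start, end) page index ranges covering total_pages."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [
        (start, min(start + chunk_size, total_pages))
        for start in range(0, total_pages, chunk_size)
    ]


def split_pdf(source: Path, out_dir: Path, chunk_size: int = 50) -> list[Path]:
    """Write source as several smaller PDFs and return the new paths."""
    # Assumption: pypdf is available (pip install pypdf). It is imported
    # lazily so that page_ranges can be used without it.
    from pypdf import PdfReader, PdfWriter

    reader = PdfReader(str(source))
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for start, end in page_ranges(len(reader.pages), chunk_size):
        writer = PdfWriter()
        for index in range(start, end):
            writer.add_page(reader.pages[index])
        # Hypothetical naming scheme: <stem>_<first page>-<last page>.pdf
        target = out_dir / f"{source.stem}_{start + 1:05d}-{end:05d}.pdf"
        with target.open("wb") as handle:
            writer.write(handle)
        written.append(target)
    return written
```

The resulting files can then be ingested the same way as the original document. Whether smaller chunks actually shorten the idle period depends on where the time is being spent, which this thread has not yet established.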
Describe the bug
I'm using the following config:
When I ran
r2r ingest-files
on the EC2 documentation, the app appeared stuck for a long time without doing any work (all CPUs idle, no ollama requests visible in the logs). After more than 2 minutes of waiting, it processed the file in about 30 seconds (I saw a lot of ollama embedding requests coming through).
Using commit 2f6f18c66858b4cf15d29accd19d7ef8016e98d4