run-llama / llama_parse

Parse files for optimal RAG
https://www.llamaindex.ai
MIT License
2.72k stars, 263 forks

processing time #175

Closed gideononyewuenyi closed 1 month ago

gideononyewuenyi commented 4 months ago

Why does parsing take so long? It has been 30 minutes and I'm still waiting.

rwclayton commented 4 months ago

Hey, @gideononyewuenyi - I've been having issues all day. Will you let me know if you get a server error?

gideononyewuenyi commented 4 months ago

Started parsing the file under job_id 50001bc4-ca6c-4c7b-842b-ef9942ebf5df
Error while parsing the PDF file '/Users/mac/Project-Hillda-AI/Gideon-Onyewuenyi-Resume.pdf': Server disconnected without sending a response.
Failed to load file /Users/mac/Project-Hillda-AI/Gideon-Onyewuenyi-Resume.pdf with error: Server disconnected without sending a response.. Skipping...
[]

@rwclayton

rwclayton commented 4 months ago

Have had the same issue all day, @gideononyewuenyi. Must be an issue with the API. Maybe the server is down? Hopefully we'll have answers soon.

gideononyewuenyi commented 4 months ago

I think it works now, thank you.

qboy21 commented 4 months ago

This is a single PDF document (48 MB) and it has been running for over an hour. A nice start, but too slow.


RemoteProtocolError: Server disconnected without sending a response.
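
For transient disconnects like the one above, one possible client-side workaround is to retry the call with exponential backoff. The following is a minimal sketch using only the Python standard library, not part of llama_parse; `retry_with_backoff` and its parameters are hypothetical names introduced here, and the exception types worth retrying depend on the HTTP client in use (here `ConnectionError` is only a placeholder assumption).

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),  # placeholder assumption
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Wait base_delay, 2 * base_delay, 4 * base_delay, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

To use it, wrap whatever call starts the parse in a zero-argument function (for example a `lambda`) and pass the exception class your client actually raises via `retry_on`. Retrying only helps with short-lived disconnects; it will not speed up a document that is slow to process on the server.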

logan-markewich commented 4 months ago

@qboy21 can you share the PDF? Some documents will be slower if they have many images that need to be OCR'd

qboy21 commented 4 months ago

> @qboy21 can you share the PDF? Some documents will be slower if they have many images that need to be OCR'd

Hi. Unfortunately, it's proprietary data that I cannot share. I can try with a public file of similar size and then get back to you.

BinaryBrain commented 1 month ago

Closing this one since we have improved quite a lot on this point. If you see any performance issues again, please reopen!