infiniflow / ragflow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.
https://ragflow.io
Apache License 2.0

[Bug]: Parsing a PDF file gets stuck after uploading in a new deployment #736

Open shaoyie opened 4 months ago

shaoyie commented 4 months ago

Branch name

main

Commit ID

bef1bbdf3e16e5163bc563407bd7fd8f7da97d7a

Other environment information

No response

Actual behavior

On a newly deployed system, create a knowledge base with the General parse method, upload a PDF file, then click start. Progress stays at around 0.03% with the status "Task is dispatched", and nothing changes even after a long wait. After cancelling the task and starting it again, parsing actually begins very quickly.

Expected behavior

Parsing should complete on the first attempt.

Steps to reproduce

1. Deploy a new system.
2. Create a knowledge base with the General parse method.
3. Upload a PDF file.
4. Click start parsing.
5. Observe that the status gets stuck at "Task is dispatched".
6. Cancel the task and start it again; the parsing task then completes quickly.
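
The manual workaround in steps 5 and 6 (wait, notice the stall, cancel, start again) could be automated with a small watchdog. The following is only a minimal sketch under stated assumptions: `start`, `cancel`, and `get_progress` are hypothetical callables standing in for whatever a given deployment exposes, and none of them is a RAGFlow API. Progress is assumed to be a float between 0.0 and 1.0.

```python
import time
from typing import Callable


def parse_with_restart(
    start: Callable[[], None],          # hypothetical: kicks off parsing
    cancel: Callable[[], None],         # hypothetical: cancels the running task
    get_progress: Callable[[], float],  # hypothetical: returns progress in [0.0, 1.0]
    stall_timeout: float = 60.0,
    poll_interval: float = 2.0,
    max_restarts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Start a task; cancel and restart it whenever progress stalls.

    Returns True once progress reaches 1.0, or False if the task is still
    stalled after the initial attempt plus `max_restarts` restarts.
    """
    for _attempt in range(max_restarts + 1):
        start()
        last_progress = get_progress()
        last_change = clock()
        while True:
            if last_progress >= 1.0:
                return True
            sleep(poll_interval)
            progress = get_progress()
            now = clock()
            if progress != last_progress:
                # Progress moved, so reset the stall timer.
                last_progress, last_change = progress, now
            elif now - last_change >= stall_timeout:
                # No movement for too long: cancel and try again.
                cancel()
                break
    return False
```

The `sleep` and `clock` parameters are injectable so the stall logic can be exercised without real waiting. This only papers over the symptom; it does not address why the first dispatched task is never picked up.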

Additional information

No response

KevinHuSh commented 4 months ago

We have seen this too, but we are not sure of the cause yet.

shaoyie commented 4 months ago

> We have seen this too, but we are not sure of the cause yet.

What I know is that the version from Apr 17 works fine. So maybe check the recent changes related to the task executor?

KevinHuSh commented 4 months ago

Try upgrading to the dev version. We fixed this there.

shaoyie commented 4 months ago

> Try upgrading to the dev version. We fixed this there.

I tried the latest code and still hit the same blocking on the first run: [image]

Subsequent requests work fine, even after recreating the container. It should be fine to go with this for now, if only the first run needs a manual restart.

timdonovanuk commented 4 months ago

This should remain open, as it's still an issue. Thanks.

shaoyie commented 4 months ago

Yes, and it seems this issue still happens sometimes, so I am reopening it.

jasinliu commented 1 month ago

Cancelling the task and restarting it doesn't work for me. It's terrible.

jasinliu commented 1 month ago

This may be related to #1383. I hope the cause can be found and fixed as soon as possible.