Open sujee opened 1 week ago
the error is quite obvious:
FileNotFoundError: [Errno 2] No such file or directory: '/root/.EasyOCR//model/temp.zip'
Either the file does not exist or the location is wrong.
Yes, the error is quite obvious 🤣 My suspicion is that it's caused by a race condition between workers trying to clean up downloaded artifacts.
Adding:
I see this consistently on Google Colab, because each notebook gets its own sandbox, so the downloaded artifacts are never cached between sessions.
To reproduce it locally, please delete the cache directory of downloaded artifacts (I am not sure where this is, probably managed by docling?)
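For anyone trying the local repro, here is a minimal sketch for clearing the cache. The location is an assumption: the traceback above points at `/root/.EasyOCR//model/temp.zip`, so this guesses the cache is `~/.EasyOCR` for the current user. `clear_model_cache` is a hypothetical helper, not part of docling or EasyOCR, and there may be other cache directories involved that this does not touch.

```python
import shutil
from pathlib import Path


def clear_model_cache(cache_dir: Path = Path.home() / ".EasyOCR") -> bool:
    """Delete the model cache directory.

    Assumption: the default path is inferred from the traceback in this
    issue; pass a different path if your cache lives elsewhere.
    Returns True if the directory existed before the call.
    """
    existed = cache_dir.exists()
    # ignore_errors=True keeps this safe to call when nothing is cached
    shutil.rmtree(cache_dir, ignore_errors=True)
    return existed


# Usage (destructive, removes the cached models):
#   clear_model_cache()
```

After clearing the cache, rerunning with more than one worker should force a fresh download and, if the race theory is right, trigger the error again.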
Related: #583
Yeah, we know exactly why. It's up to the guys to decide what to do.
Search before asking
Component
Tools/ingest2parquet
What happened + What you expected to happen
Happens when running the Ray version with NUM_WORKERS > 1. Reliably reproducible in Google Colab. Running the cell again works, but it is a negative user experience.
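The behavior described here (fails with several workers, works on the second run once the files are cached) fits the suspected race on a shared `temp.zip`. As an illustration only, one common way to avoid that kind of race is to serialize the download behind a lock file and give each process its own temp name. This is a stdlib-only sketch, not the actual EasyOCR or docling code; `exclusive_lock`, `ensure_model`, `fetch`, `model.bin`, and `.download.lock` are all hypothetical names.

```python
import os
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def exclusive_lock(lock_path: Path, timeout: float = 60.0, poll: float = 0.1):
    """Cross-process lock based on atomic exclusive file creation."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if another worker holds the lock
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)


def ensure_model(model_dir: Path, fetch) -> Path:
    """Download the model once; later callers reuse the cached file.

    `fetch(tmp_path)` is a hypothetical callable that writes the
    downloaded artifact to `tmp_path`.
    """
    model_dir.mkdir(parents=True, exist_ok=True)
    target = model_dir / "model.bin"
    with exclusive_lock(model_dir / ".download.lock"):
        if not target.exists():  # re-check while holding the lock
            # per-process temp name, so no worker deletes another's file
            tmp = model_dir / f"temp-{os.getpid()}.zip"
            fetch(tmp)
            os.replace(tmp, target)  # atomic rename on the same filesystem
    return target
```

One caveat with this design: a worker that crashes while holding the lock leaves a stale lock file behind, so a real fix would need some recovery for that case. A simpler workaround along the same lines would be to trigger the download once in the driver before starting the workers.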
Reproduction script
https://github.com/sujee/data-prep-kit-examples/blob/main/dpk-intro/dpk_intro_1_ray.ipynb
Use the open-in-Colab link: https://colab.research.google.com/github/sujee/data-prep-kit-examples/blob/main/dpk-intro/dpk_intro_1_ray.ipynb
Anything else
No response
OS
Other
Python
3.11.x
Are you willing to submit a PR?