adithya-s-k / omniparse

Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks
https://docs.cognitivelab.in
GNU General Public License v3.0
5.13k stars, 430 forks

Why is the OCR model loading so slowly? Please check the screenshot. #19

Closed seasoncool closed 3 months ago

seasoncool commented 3 months ago

My command is: docker run --gpus all -e HF_ENDPOINT=https://hf-mirror.com -p 8855:8000 savatar101/omniparse:0.1

[screenshot]

chong-w commented 3 months ago

The OCR is unable to recognize Chinese and outputs many garbled characters.

adithya-s-k commented 3 months ago

@chong-w We currently use SuryaOCR and Marker to parse documents, which may have some limitations with respect to Chinese.

Please refer to the following repositories for further insights: Marker and Surya.

adithya-s-k commented 3 months ago

@seasoncool Generally, loading the OCR and detection models shouldn't take that much time.

Please try re-running the Docker container.

Let me know if the issue persists.

seasoncool commented 3 months ago

Thanks, all. After re-running the container, we can see the page. Closing this issue.