CatchTheTornado / pdf-extract-api

Document (PDF) extraction and parsing API using state-of-the-art OCR engines and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown.
https://demo.doctractor.com
GNU General Public License v3.0
1.38k stars, 92 forks

OCR task failed. #28

Closed: PoleGeogry closed this issue 2 weeks ago

PoleGeogry commented 2 weeks ago

Hello, I tried to run this locally on an Apple device, but got the following error:

{'state': 'FAILURE', 'status': 'Server disconnected without sending a response.'} OCR task failed.

The command I ran:

python client/cli.py ocr --file examples/example-mri.pdf --ocr_cache --prompt_file=examples/example-mri-remove-pii.txt

How can I solve this problem?

PoleGeogry commented 2 weeks ago

Solved: turning off the VPN fixed it.

pkarw commented 1 week ago

Thanks @PoleGeogry! This might be useful for other users as well. Was the VPN problem caused by the OCR trying to download weights from the internet, or by accessing local server ports?
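
For anyone hitting the same error, one way to narrow down the question above is to check whether the local ports are reachable while the VPN is on. The snippet below is only a rough sketch, not part of this project: the helper names are made up for illustration, and the port list is an assumption (11434 and 6379 are the Ollama and Redis defaults, 8000 is assumed for the API server), so adjust it to your own setup.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_local_services(services: dict[str, int], host: str = "127.0.0.1") -> dict[str, bool]:
    """Probe each named port on host and report which ones accept connections."""
    return {name: port_is_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    # Assumed ports: change these to match your configuration.
    services = {"API server": 8000, "Ollama": 11434, "Redis": 6379}
    for name, reachable in check_local_services(services).items():
        print(f"{name}: {'reachable' if reachable else 'NOT reachable'}")
```

Run it once with the VPN on and once with it off. If a port is reachable only with the VPN off, the VPN is probably interfering with local traffic; if all ports are reachable in both cases, the failure is more likely on the download side.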