CatchTheTornado / pdf-extract-api

Document (PDF) extraction and parsing API using state-of-the-art OCRs and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown.
GNU General Public License v3.0

ocr curl #11

Open Marcelas751 opened 4 days ago

Marcelas751 commented 4 days ago
C:\Users\user>curl -X POST "http://localhost:8000/ocr" -F "file=C:\Users\user\Downloads\Telegram Desktop\0a6ce636600e1723a71ff95348cd07abdc035118.pdf" -F "strategy=marker" -F "ocr_cache=true"
{"detail":[{"type":"missing","loc":["body","prompt"],"msg":"Field required","input":null},{"type":"missing","loc":["body","model"],"msg":"Field required","input":null},{"type":"value_error","loc":["body","file"],"msg":"Value error, Expected UploadFile, received: <class 'str'>","input":"C:\\Users\\user\\Downloads\\Telegram Desktop\\0a6ce636600e1723a71ff95348cd07abdc035118.pdf","ctx":{"error":{}}}]}
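The response above lists three separate problems: no `prompt` field, no `model` field, and a `file` field that arrived as a plain string. The last one is consistent with how curl treats `-F "file=C:\..."`: without a leading `@`, curl sends the path as text instead of uploading the file. The following is a minimal sketch, not the project's own tooling. It summarizes the error body and builds a curl argument list that uses `file=@<path>`. The helper names and the `prompt` and `model` values are hypothetical placeholders, and this thread does not say whether those two fields are still required after the fix.

```python
from __future__ import annotations

import json

# Error body copied from the report above.
ERROR_BODY = r'''{"detail":[{"type":"missing","loc":["body","prompt"],"msg":"Field required","input":null},{"type":"missing","loc":["body","model"],"msg":"Field required","input":null},{"type":"value_error","loc":["body","file"],"msg":"Value error, Expected UploadFile, received: <class 'str'>","input":"C:\\Users\\user\\Downloads\\Telegram Desktop\\0a6ce636600e1723a71ff95348cd07abdc035118.pdf","ctx":{"error":{}}}]}'''


def summarize_validation_errors(body: str) -> list[str]:
    """Turn each entry of the 'detail' list into a 'location: message' line."""
    detail = json.loads(body)["detail"]
    return [
        ".".join(str(part) for part in entry["loc"]) + ": " + entry["msg"]
        for entry in detail
    ]


def build_ocr_curl_args(url: str, pdf_path: str, fields: dict[str, str]) -> list[str]:
    """Build a curl argument list (hypothetical helper) that uploads pdf_path.

    The '@' prefix makes curl read and upload the file's contents;
    without it, the path itself is sent as an ordinary text field.
    """
    args = ["curl", "-X", "POST", url, "-F", f"file=@{pdf_path}"]
    for name, value in fields.items():
        args += ["-F", f"{name}={value}"]
    return args


if __name__ == "__main__":
    for line in summarize_validation_errors(ERROR_BODY):
        print(line)

    # 'prompt' and 'model' values below are placeholders, not values from the thread.
    print(build_ocr_curl_args(
        "http://localhost:8000/ocr",
        r"C:\Users\user\Downloads\Telegram Desktop\0a6ce636600e1723a71ff95348cd07abdc035118.pdf",
        {
            "strategy": "marker",
            "ocr_cache": "true",
            "prompt": "<your prompt>",
            "model": "<an Ollama model name>",
        },
    ))
```

Passing the resulting list to `subprocess.run` avoids shell quoting problems with the space in `Telegram Desktop`. When typing the command by hand, keep the whole `file=@...` value inside one pair of double quotes.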
pkarw commented 1 day ago

OK, on it

pkarw commented 1 day ago

I think it's fixed with https://github.com/CatchTheTornado/pdf-extract-api/pull/17. @Marcelas751, please check whether it works for you now.

pkarw commented 1 day ago

Please check the code from the master branch.