CatchTheTornado / pdf-extract-api

Document (PDF) extraction and parsing API using state-of-the-art OCRs and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown.
https://demo.doctractor.com
GNU General Public License v3.0
1.38k stars 92 forks

[feat] Test `pixtral` as an OCR strategy #41

Open pkarw opened 1 week ago

pkarw commented 1 week ago

https://mistral.ai/news/pixtral-12b/