[feat] Test and add Llama 3.2-vision as OCR strategy

CatchTheTornado / pdf-extract-api

Document (PDF) extraction and parse API using state of the art modern OCRs + Ollama supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown

https://demo.doctractor.com

GNU General Public License v3.0

1.33k stars 86 forks source link

[feat] Test and add Llama 3.2-vision as OCR strategy #27

Closed pkarw closed 4 days ago

pkarw commented 2 weeks ago

https://ollama.com/library/llama3.2-vision

pkarw commented 2 weeks ago

In that case we need to convert PDF to images first - before sending it to llama3.2

pkarw commented 2 weeks ago

Tested. Works really great: