Open bbrowning opened 4 hours ago
This will likely also mean we need to adjust our docling entry in requirements.txt to pull in docling[tesserocr] instead of docling. The tesserocr variant pulls in both tesserocr and easyocr, allowing us to switch between the two with a single dependency.
Docling defaults to using easyocr for optical character recognition, but we have some downstream consumers that will prefer to use Docling's tesserocr support for OCR. We need to expose a way for users to choose which engine we use, because swapping the OCR engine requires code changes in our Docling integration.
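
One possible shape for this, as a rough sketch rather than a proposed final design: read the engine name from a user-facing setting and map it to Docling's OCR options. The environment variable DOCLING_OCR_ENGINE and the helpers resolve_ocr_engine and build_pipeline_options are hypothetical names made up for illustration. The Docling classes used (PdfPipelineOptions, EasyOcrOptions, TesseractOcrOptions) are assumed to be available in the Docling version we pin, which should be verified.

```python
import os

# Hypothetical names: this environment variable and the helpers below are
# illustrative only and do not exist in our integration today.
OCR_ENGINE_ENV_VAR = "DOCLING_OCR_ENGINE"
DEFAULT_OCR_ENGINE = "easyocr"  # mirrors Docling's default described above
SUPPORTED_OCR_ENGINES = ("easyocr", "tesserocr")


def resolve_ocr_engine(name=None):
    """Pick the OCR engine from an explicit argument, the env var, or the default."""
    chosen = name or os.environ.get(OCR_ENGINE_ENV_VAR) or DEFAULT_OCR_ENGINE
    chosen = chosen.strip().lower()
    if chosen not in SUPPORTED_OCR_ENGINES:
        raise ValueError(
            f"Unsupported OCR engine {chosen!r}; expected one of {SUPPORTED_OCR_ENGINES}"
        )
    return chosen


def build_pipeline_options(engine):
    """Build Docling PDF pipeline options for the selected OCR engine."""
    # Imported lazily so the selection logic works without Docling installed.
    # Assumption: these classes exist in the pinned Docling version.
    from docling.datamodel.pipeline_options import (
        EasyOcrOptions,
        PdfPipelineOptions,
        TesseractOcrOptions,
    )

    options = PdfPipelineOptions()
    options.do_ocr = True
    if engine == "tesserocr":
        options.ocr_options = TesseractOcrOptions()
    else:
        options.ocr_options = EasyOcrOptions()
    return options
```

With something like this in place, a consumer could set DOCLING_OCR_ENGINE=tesserocr (or pass the name through whatever CLI or config option we settle on) and the integration would hand the resulting options to the Docling converter. Validating the name up front keeps a typo from silently falling back to easyocr.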