-
Docling defaults to `easyocr` for optical character recognition, but some of our downstream consumers would prefer to use Docling's `tesserocr` for OCR. We need to expose a way for users to…
-
I have worked a bit with [label-studio](https://github.com/heartexlabs/label-studio) and I would like to integrate it further as a trial way of graphical annotation of various digital assets. Here is …
-
### Question Validation
- [X] I have searched both the documentation and Discord for an answer.
### Question
How can I fix the garbled characters that images cause when a doc is read?
-
Could you tell me whether there are plans to support pretraining for qwen2-vl?
-
I suppose this is a feature request, unless it's assumed the feature is already present, in which case it's a bug report.
Please add support for building the leptonica library targeting Universal W…
-
**Is your feature request related to a problem? Please describe.**
Often, users encounter images contai…
-
Dump the TOC of PDFs and build a learning plan.
Track progress.
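
A minimal sketch of the idea, assuming the TOC has already been dumped as `(level, title, page)` entries (PyMuPDF's `Document.get_toc()` returns rows of that shape, but no PDF library is used here). The names `PlanItem`, `build_plan`, `mark_done`, and `progress` and the sample TOC are hypothetical, made up for illustration only.

```python
from dataclasses import dataclass


@dataclass
class PlanItem:
    """One TOC entry turned into a unit of study."""
    level: int
    title: str
    page: int
    done: bool = False


def build_plan(toc):
    """Turn (level, title, page) TOC entries into plan items, in TOC order."""
    return [PlanItem(level, title, page) for level, title, page in toc]


def mark_done(plan, title):
    """Mark the first item with a matching title as done; return True if found."""
    for item in plan:
        if item.title == title:
            item.done = True
            return True
    return False


def progress(plan):
    """Fraction of plan items completed (0.0 for an empty plan)."""
    if not plan:
        return 0.0
    return sum(item.done for item in plan) / len(plan)


# Hypothetical TOC, standing in for one dumped from a real PDF.
sample_toc = [
    (1, "Introduction", 1),
    (1, "Methods", 5),
    (2, "Data", 6),
    (2, "Model", 9),
]

plan = build_plan(sample_toc)
mark_done(plan, "Introduction")
print(f"{progress(plan):.0%} complete")
```

Progress here counts every TOC entry equally; weighting by page span or by heading level would be a reasonable alternative.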
-
On loading applicants, extract the text from the candidate pack and
1. Pull out specific information from the cover sheet, and store in appropriate fields
- First Name + Last Name can replace th…
-
Some base knowledge is required before a computer can process a given image and understand the elements within it.
We should look into how to set up the environment as well!
-
I'm new to using AI, and I'm looking for guidance on how to extract invoice details from PDF files, similar to how it's done for images. Can you provide some suggestions or steps to achieve this?
Tha…