-
Hello,
I have successfully applied the OCR tool to my PDFs and enabled the "Save output as a PDF with text layer" option in the settings. However, when opening the OCR'd PDF output file, it was mentioned …
-
### Description of the bug | 错误描述
Support Arabic
### How to reproduce the bug | 如何复现
The letters are always **extracted in English**, or the Arabic text is **not recognized** and is **cut out as an …
-
Run OCR on PDFs that contain no text.
-
### Simple sanity checks
- [X] This is an issue with an app that uses OCRmyPDF for OCR
- [X] I am using a recent version of the third party app
- [ ] I will include a file that reproduces the issue
…
-
**Describe the bug**
Try to parse a PDF with `OCR_AGENT=unstructured.partition.utils.ocr_models.google_vision_ocr.OCRAgentGoogleVision`.
**To Reproduce**
Provide a code snippet that reproduces…
-
Hi there
I tested docling on a bunch of complex PDFs and got great results. However, I am curious about what is going on under the hood, and the documentation is a bit thin on this matter :)
On fi…
-
**ISSUE:** As noted in the yml file, this is only part 2.
**RESOLUTION:** I've found the full text (as a PDF through Noor Lib). What is the best solution: should I OCR it myself or add it to the OCR pipeline?
-
Currently _zotero-ocr_ requires additional installation steps for `pdftoppm` and `tesseract`.
Both could be replaced by pure JavaScript implementations which could be included in _zotero-ocr_ to si…
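
For context, the current external dependencies can be checked with a few lines of Python. This is only a minimal sketch, not part of _zotero-ocr_: the helper name `missing_tools` is hypothetical, and it assumes both tools are expected on `PATH` under the names `pdftoppm` and `tesseract`.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# External command-line tools named in the text above.
REQUIRED_TOOLS = ("pdftoppm", "tesseract")


def missing_tools(
    names: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools from `names` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests;
    by default it is the standard library's shutil.which.
    """
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing external tools:", ", ".join(missing))
    else:
        print("All external tools found.")
```

Bundling pure JavaScript replacements, as proposed, would make a check like this unnecessary.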
-
### Issues
- [X] I have browsed through the Issues and confirmed that this is not a duplicate. 我已浏览过Issues,确定没有重复提问。
### Umi-OCR version 程序版本
2.1.4
### Windows version 系统版本
win10
### OCR plugins Used 使用的OCR插件
PaddleOCR
### Reproduction…
-
I just ran into something odd, so that things wouldn't be "standard"...
[https://www.hlidacstatu.cz/Detail/12149868?qs=ico%3A03083543](https://www.hlidacstatu.cz/Detail/12149868?qs=ico%3A03083543)
Smlou…