hiroi-sora / Umi-OCR

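OCR software, free and offline. Open-source, free offline OCR software that supports screenshot and batch image import, PDF document recognition, excluding watermarks and headers/footers, and scanning/generating QR codes. Ships with built-in libraries for multiple languages.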
MIT License
23.05k stars 2.35k forks

Add an option to generate one layer PDF file output #487

Closed tonyho closed 1 month ago

tonyho commented 2 months ago

Issues

Expected behavior

Besides the two-layer PDF, could you kindly add a one-layer PDF export option? Alternatively, can I extract the OCRed text layer of the PDF using other free tools?

Approximate reference (optional)

None

hiroi-sora commented 2 months ago

I've just added this feature; you can pull the main branch to test it. (Alternatively, you can copy the UmiOCR-data directory from the main branch and use it to overwrite the same directory in Release v2.1.1.)

In the Batch Documents OCR tab, check the Advanced option in the settings. Output file format then offers an additional option: .pdf One-layer plain text document. It outputs a single-layer PDF containing only the OCR results, without the original image layer.
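To make "single-layer" concrete, here is a minimal sketch of a PDF whose page holds only text operators and no image objects. It is not Umi-OCR's implementation, which this thread does not show. The function name `build_text_only_pdf` and its parameters are hypothetical, and the sketch assumes ASCII text only (other scripts, such as Chinese, would need an embedded font).

```python
def build_text_only_pdf(lines, page_width=595, page_height=842, font_size=12):
    """Return the bytes of a one-page PDF that contains only text.

    Hypothetical helper for illustration; ASCII text only.
    """

    def escape(text):
        # Backslash and parentheses are special inside PDF literal strings.
        return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")

    # Content stream: begin text, pick the font, set the line spacing,
    # move to the top-left margin, then show one string per line.
    leading = font_size + 2
    ops = ["BT", f"/F1 {font_size} Tf", f"{leading} TL", f"50 {page_height - 50} Td"]
    for line in lines:
        ops.append(f"({escape(line)}) Tj T*")
    ops.append("ET")
    # Non-ASCII characters are replaced with "?" in this sketch.
    stream = "\n".join(ops).encode("ascii", errors="replace")

    # Five objects: catalog, page tree, page, font, content stream.
    # Note that the page resources list a font only, with no image XObject.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        (
            f"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 {page_width} {page_height}] "
            f"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>"
        ).encode("ascii"),
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Cross-reference table: every entry must be exactly 20 bytes long.
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

Writing the returned bytes to a file opened in binary mode should give a PDF with selectable text on a blank page. A two-layer OCR PDF would additionally reference a page image in its resources, which is the part the one-layer output leaves out.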
