ethereal-developers / OpenScan

A privacy-friendly Document Scanner app
BSD 3-Clause "New" or "Revised" License

[Feature Request] Add text layer to PDF with OCR #32

Open klawdhfzasjhaa opened 3 years ago

klawdhfzasjhaa commented 3 years ago

Most desktop scanners include an OCR engine so that the output files are searchable. Adding this would make the app perfect for most users. Maybe something like Tesseract could be integrated, or even a third-party app such as OCR (Tesseract), which offers optical character recognition based on Tesseract via Intents: https://f-droid.org/packages/org.totschnig.ocr.tesseract

"This app bundles OCR functionality (based on Tesseract) that can be called from other apps via Intents. It listens for Intents with action "org.totschnig.ocr.action.RECOGNIZE" and expects an Uri pointing to a JPEG file as data. The recognized text is passed back in the extra "result" as an object of a parcelable data class Text, that must be copied into the client app."

Seems perfect for this project.
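
For illustration, here is a rough Kotlin sketch of what the client side of that Intent call might look like from native Android code. Only the action string and the "result" extra key come from the description quoted above. Everything else is an assumption: that the OCR app is started as an activity that returns a result, that passing the Uri as plain Intent data is enough, and the names `OcrDemoActivity`, `requestOcr`, and `REQUEST_OCR`, which are hypothetical. It has not been tested against the OCR app.

```kotlin
import android.app.Activity
import android.content.ActivityNotFoundException
import android.content.Intent
import android.net.Uri

// Action name taken from the OCR app's description quoted above.
private const val ACTION_RECOGNIZE = "org.totschnig.ocr.action.RECOGNIZE"

// Extra key taken from the same description.
private const val EXTRA_RESULT = "result"

// Hypothetical request code, chosen only for this sketch.
private const val REQUEST_OCR = 1001

// Hypothetical activity; a real app would hook this into its own scan flow.
class OcrDemoActivity : Activity() {

    // Sends the scanned page (a content:// Uri pointing to a JPEG) to the OCR app.
    // Returns false if no installed app handles the action.
    fun requestOcr(jpegUri: Uri): Boolean {
        val intent = Intent(ACTION_RECOGNIZE).apply {
            // The description says the Uri is expected "as data".
            data = jpegUri
            // Lets the receiving app read a Uri served by our own provider.
            addFlags(Intent.FLAG_GRANT_READ_URI_PERMISSION)
        }
        return try {
            // Assumption: the OCR app answers via the activity-result mechanism.
            startActivityForResult(intent, REQUEST_OCR)
            true
        } catch (e: ActivityNotFoundException) {
            // The OCR app is not installed; the caller can suggest installing it.
            false
        }
    }

    override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
        super.onActivityResult(requestCode, resultCode, data)
        if (requestCode != REQUEST_OCR || resultCode != RESULT_OK || data == null) return
        // The recognized text arrives in the extra named EXTRA_RESULT as a
        // parcelable Text object. Reading it requires copying that data class
        // from the OCR app into this project, as the description notes, so the
        // unpacking step is left out of this sketch.
    }
}
```

The recognized text would still have to be positioned as an invisible layer over the page image in the PDF, which is a separate step not covered here.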

natrius commented 2 years ago

Just to add another one: https://f-droid.org/de/packages/io.github.subhamtyagi.ocr/ - it uses Tesseract 5 and seems to be in active development.