Add ability to extract text from scanned pdfs

skvark / Textractor

OCR application for Sailfish OS. Based on the Tesseract OCR engine and the Leptonica image processing library.

MIT License

20 stars, 6 forks

Add ability to extract text from scanned pdfs #9

Tsippaduida closed this issue 9 years ago

Tsippaduida commented 9 years ago

Scanners often create PDF documents when you use them for copying. It would be really handy if the text in those PDFs could be extracted.

skvark commented 9 years ago

This will require a conversion library that converts PDF pages to images. I'll look into it.

skvark commented 9 years ago

This has now been implemented. Some PDF files might not work. However, that can't be fixed on Textractor's side.

Tsippaduida commented 9 years ago

Thanks,

This is great news. Textractor is one of the best OCR engines available.

19.09.2015, 18:17, Olli-Pekka Heinisuo wrote:

This has now been implemented. Some PDF files might not work. However, that can't be fixed on Textractor's side.
