Closed: ericmoret closed this issue 6 years ago
Hi Eric, thanks for the message. Please post the original PDF / image that shows this behavior. Please also tell us your operating system, Python version, and PDF reader. Does the "-w" flag generate a correct TXT file?
Hello Leo,
I cannot post the original PDF file. However, here is what I observed on macOS High Sierra 10.13.2 with Python 3.4.7. When I use the native Preview.app Version 10.0 (944.4), I see missing spaces after copy/paste. However, when I use Adobe Acrobat Reader DC 2018.0009.20050, I see the expected spaces after copy/paste. The -w option also shows proper spacing in the output TXT file.
Hi Eric, thanks! Please try to run with "-e native" / "-e tesseract" and with or without "-p" flag. I can copy / paste from Preview.app correctly in some (rare) cases. Let me know the results! Leo
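For anyone repeating this test, the four engine / "-p" combinations suggested above can be listed with a small helper. This is only a sketch: the script name `pdf2pdfocr.py`, the `-i` input flag, and the file name `scan.pdf` are assumptions for illustration; only "-e native", "-e tesseract" and "-p" come from this thread. It prints the command lines and does not run anything.

```python
from itertools import product

# Assumptions for illustration: the script name and the "-i" input flag.
# "-e native", "-e tesseract" and "-p" are the options mentioned in this thread.
SCRIPT = "pdf2pdfocr.py"
INPUT_FLAG = "-i"


def build_commands(pdf_path):
    """Return one argument list per engine / "-p" combination (four in total)."""
    commands = []
    for engine, use_p in product(("native", "tesseract"), (False, True)):
        cmd = [SCRIPT, INPUT_FLAG, pdf_path, "-e", engine]
        if use_p:
            cmd.append("-p")
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # "scan.pdf" is a placeholder file name.
    for cmd in build_commands("scan.pdf"):
        print(" ".join(cmd))
```

Running each printed command and then copy/pasting from the resulting PDF in Preview.app should show which combination, if any, keeps the spaces.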
Closing this as it seems to be a "Preview.app" issue and we have a workaround.
When I use pdf2pdfocr, the generated text has no spaces between the recognized words. As a result, when I copy/paste the resulting text, it is difficult to use because I have to manually reintroduce all the missing spaces.