Closed dibu28 closed 2 years ago
@osnofas You'll have to be careful to ensure that you install ocrmypdf to the same native Windows Python distribution that you want to run it. The directory listing of "Windows proper" shows ocrmypdf was installed to a different distribution.
You may need to create a virtual environment and install ocrmypdf to there.
Trying to get this working. unpaper is not found, and choco does not appear to have a package for it.
Does anyone have tips for installing a 64-bit Windows binary of unpaper to support the --clean option?
FWIW: I found a pre-built 6.2 binary, but that didn't work. It runs (and shows me a version), but ocrmypdf dumps a stack-trace when it tries to use it. (I put it in my PATH)
Hey! I know this is an old thread but I just installed Python 3.8, Tesseract 5.0.0, Ghostscript, pngquant, and ocrmypdf. When I execute ocrmypdf --help, I get the following:
File "c:\users\javegaa\appdata\local\programs\python\python38-32\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "c:\users\javegaa\appdata\local\programs\python\python38-32\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\javegaa\AppData\Local\Programs\Python\Python38-32\Scripts\ocrmypdf.exe\__main__.py", line 5, in <module>
File "c:\users\javegaa\appdata\local\programs\python\python38-32\lib\site-packages\ocrmypdf\__init__.py", line 18, in <module>
from . import helpers, hocrtransform, leptonica, pdfa, pdfinfo
File "c:\users\javegaa\appdata\local\programs\python\python38-32\lib\site-packages\ocrmypdf\leptonica.py", line 70, in <module>
lept = ffi.dlopen(_libpath)
OSError: cannot load library 'C:\Program Files\Tesseract-OCR\liblept-5.dll': error 0xc1
Thanks for the help, really appreciate this.
You installed 32-bit Python and 64-bit Tesseract, and these can't interface.
Use 64-bit Python instead.
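A quick way to confirm which kind of interpreter is actually running is to ask Python for its pointer size. This is only a minimal sketch for diagnosing the mismatch described above; the helper name `python_bitness` is made up for illustration. Windows error 0xc1 (193) is the loader's "not a valid Win32 application" code, which is what you typically see when a 32-bit process tries to load a 64-bit DLL, and the `python38-32` directory in the traceback points the same way.

```python
import platform
import struct
import sys


def python_bitness() -> int:
    """Return 32 or 64 depending on the pointer size of this interpreter."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    # Print the interpreter that is really on PATH, plus its word size.
    print(f"Python {platform.python_version()} at {sys.executable}")
    print(f"Interpreter is {python_bitness()}-bit")
```

If this reports 32-bit while Tesseract was installed from the w64 installer, reinstalling a 64-bit Python (and reinstalling ocrmypdf into it) should be the fix.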
import ocrmypdf
Traceback (most recent call last):
File "
File "C:\Users\22252\AppData\Roaming\Python\Python38\site-packages\ocrmypdf\__init__.py", line 10, in
File "C:\Users\22252\AppData\Roaming\Python\Python38\site-packages\ocrmypdf\leptonica.py", line 62, in
Please let me know how to fix this.
When Tesseract is installed from the conda package, Leptonica is installed too, but on Windows the DLL is named leptonica-x.x.x.dll, which is not the name leptonica.py looks for.
Rather than listing every possible spelling of the Leptonica library name, is there another way to detect it? I don't know.
As it stands, I cannot use ocrmypdf on Windows in a conda or mamba environment.
https://anaconda.org/conda-forge/leptonica https://github.com/conda-forge/leptonica-feedstock
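One possible alternative to hard-coding names is to glob for the DLL in the directories on PATH. The sketch below is not what OCRmyPDF does; `find_dlls` and the two patterns are assumptions chosen to cover the names mentioned in this thread (liblept-5.dll and leptonica-x.x.x.dll).

```python
import os
from pathlib import Path
from typing import Iterable, List


def find_dlls(patterns: Iterable[str], search_dirs: Iterable[str]) -> List[Path]:
    """Return files matching any of the glob patterns in the given directories."""
    patterns = list(patterns)  # allow the patterns to be reused per directory
    matches: List[Path] = []
    for directory in search_dirs:
        base = Path(directory)
        if not base.is_dir():
            continue  # PATH often contains stale entries
        for pattern in patterns:
            matches.extend(sorted(base.glob(pattern)))
    return matches


if __name__ == "__main__":
    path_dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    # Patterns are illustrative guesses based on the names seen in this thread.
    for dll in find_dlls(["liblept*.dll", "leptonica*.dll"], path_dirs):
        print(dll)
```

Whatever path this finds could then be handed to the loader directly, instead of relying on a fixed library name.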
leptonica is no longer a dependency - this should resolve the remaining Windows issues.
Hi
Describe the issue
I've managed to run OCRmyPDF.exe on Windows 10 without WSL.
To Reproduce
I've made a fork and added some quick fixes in this commit: https://github.com/dibu28/OCRmyPDF/commit/543088e79e8649e968d02d8fd268123255607dc1
Fixes are:
1) In leptonica.py, the library name is liblept-5 instead of lept.
2) In ghostscript.py:
2.1) The executable name is gswin64c.exe instead of gs.
2.2) NamedTemporaryFile doesn't work properly: gs could not modify the temp file and failed with an access denied error. As a temporary workaround I'm appending "_1" to the temp file name and then removing the file. There may be a better way.
3) In _pipeline.py and helpers.py, symlinking into the temp folder on Windows requires Admin privileges, so instead of symlinking I'm just copying the files.
4) In _sync.py, os.path.samefile raises an error: "OSError: [WinError 1] Incorrect function: 'nul'"
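Fixes 2.2 and 3 could be written more generally, roughly as below. This is a sketch under assumptions, not the code in the linked commit: `link_or_copy` and `make_temp_path` are hypothetical helper names. The temp-file helper relies on the fact that on Windows a file held open by NamedTemporaryFile usually cannot be opened again by another process, so it creates the file, closes the handle, and leaves deletion to the caller.

```python
import os
import shutil
import tempfile


def link_or_copy(src: str, dst: str) -> str:
    """Try to symlink dst -> src; on any OSError fall back to copying the file.

    Returns "symlink" or "copy" so the caller can see which path was taken.
    """
    try:
        os.symlink(src, dst)
        return "symlink"
    except (OSError, NotImplementedError):
        # On Windows, os.symlink commonly fails without elevated privileges.
        shutil.copy2(src, dst)
        return "copy"


def make_temp_path(suffix: str = "") -> str:
    """Create an empty temporary file, close it, and return its path.

    The handle is closed so an external program such as Ghostscript can open
    the file; the caller is responsible for deleting it with os.unlink.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    return path
```

Because the fallback only runs when symlinking fails, the same code keeps using symlinks on Linux and macOS.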
After those changes and installing the dependencies, it started to work from the command line like this: OCRmyPDF.exe input.pdf output.pdf
Dependencies and binaries I'm using:
https://www.python.org/ftp/python/3.7.5/python-3.7.5-amd64.exe
https://digi.bib.uni-mannheim.de/tesseract/tesseract-ocr-w64-setup-v5.0.0-alpha.20191030.exe
https://github.com/ArtifexSoftware/ghostpdl-downloads/releases/download/gs950/gs950w64.exe
https://github.com/qpdf/qpdf/releases/download/release-qpdf-9.0.2/qpdf-9.0.2-bin-msvc64.zip
Add paths to the PATH variable:
set PATH=%PATH%;C:\Program Files\Tesseract-OCR;
set PATH=%PATH%;C:\Program Files\gs\gs9.50\bin\;
set PATH=%PATH%;C:\qpdf\qpdf-9.0.2-bin-msvc64\qpdf-9.0.2\bin\;
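To check that the PATH changes took effect, something like the following can be run from the same command prompt. It is a small sketch using only the standard library; `locate_tools` is a made-up helper, and the tool names are the ones mentioned in this issue.

```python
import shutil
from typing import Dict, Iterable, Optional


def locate_tools(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each executable name to its resolved path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # On Windows, shutil.which also tries the extensions listed in PATHEXT (.exe etc.).
    for name, path in locate_tools(["tesseract", "gswin64c", "qpdf"]).items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

Any tool reported as NOT FOUND means the corresponding directory is still missing from PATH in that session.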
Expected behavior
Can we add some workarounds using conditions based on OS type?
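As a rough idea of what such conditions might look like, here is a sketch keyed on `os.name`. Both function names are hypothetical and this is not OCRmyPDF's actual code. The second helper compares against `os.devnull` by name, which is one assumed way to avoid calling os.path.samefile on 'nul'.

```python
import os


def ghostscript_executable(os_name: str = os.name) -> str:
    """Pick the Ghostscript command name for the given os.name value."""
    # "nt" is the value of os.name on native Windows Python.
    return "gswin64c" if os_name == "nt" else "gs"


def is_null_device(path: str) -> bool:
    """Return True if path names the platform null device.

    This compares names only and never touches the filesystem, so it does not
    trigger the WinError 1 that os.path.samefile produced for 'nul'.
    """
    return os.path.normcase(path) == os.path.normcase(os.devnull)
```

Passing `os_name` as a parameter keeps the choice testable on any platform, for example `ghostscript_executable("nt")`.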
System:
Additional context