ocrmypdf / OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
http://ocrmypdf.readthedocs.io/
Mozilla Public License 2.0

ocrmypdf keyword 'allow_abbrev' problem #740

Closed frohro closed 2 years ago

frohro commented 3 years ago

I have this problem on Ubuntu 20.10 using a recently pip3 installed ocrmypdf (ocrmypdf-11.4.5), with python 3.8.6.

$ ocrmypdf Hambley\ 12.4-12.5.pdf Hambley12.4-12.5.pdf
Traceback (most recent call last):
  File "/home/frohro/.local/bin/ocrmypdf", line 5, in <module>
    from ocrmypdf.__main__ import run
  File "/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__init__.py", line 13, in <module>
    from ocrmypdf.api import Verbosity, configure_logging, ocr
  File "/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/api.py", line 19, in <module>
    from ocrmypdf._plugin_manager import get_plugin_manager
  File "/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_plugin_manager.py", line 20, in <module>
    from ocrmypdf.cli import get_parser, plugins_only_parser
  File "/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/cli.py", line 476, in <module>
    plugins_only_parser = ArgumentParser(
  File "/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/cli.py", line 40, in __init__
    super().__init__(*args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'allow_abbrev'

I suspect it is some problem with my installation, but I'm not sure.

Thanks,

Rob

jbarlow83 commented 3 years ago

Try python -m ocrmypdf ... to see if it works like that, and python -c 'import argparse; print(argparse.__file__)' to see if argparse is coming from an official-looking location in /usr/lib rather than from your /home.

If you're not familiar with user mode installs on Ubuntu:

You could also try, for sanity's sake, creating a new Python virtual environment and installing ocrmypdf into that.
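
The one-liner above prints where argparse is imported from. A minimal sketch of the same check as a script, assuming that a path inside the interpreter's stdlib directory (and outside any site-packages or dist-packages folder) counts as "official-looking". The helper names `module_origin`, `looks_like_stdlib` and `check_argparse` are made up for illustration:

```python
import importlib.util
import sysconfig
from pathlib import Path
from typing import Optional


def module_origin(name: str) -> Optional[Path]:
    """Return the file a module would be loaded from, or None if it has no file."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin in (None, "built-in", "frozen"):
        return None
    return Path(spec.origin)


def looks_like_stdlib(origin: Path, stdlib_dir: Path) -> bool:
    """True if origin is inside stdlib_dir but not in a site/dist-packages folder."""
    if {"site-packages", "dist-packages"} & set(origin.parts):
        return False
    try:
        origin.relative_to(stdlib_dir)
    except ValueError:
        return False
    return True


def check_argparse() -> str:
    """Describe where argparse comes from in the running interpreter."""
    stdlib_dir = Path(sysconfig.get_paths()["stdlib"])
    origin = module_origin("argparse")
    if origin is None:
        return "argparse: no file-based origin found"
    if looks_like_stdlib(origin, stdlib_dir):
        verdict = "standard library"
    else:
        verdict = "NOT the standard library"
    return f"argparse loaded from {origin} ({verdict})"


print(check_argparse())
```

Run it with the same interpreter that fails (for example `python3 check_argparse.py`, where the file name is arbitrary) so that the same search path applies.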

frohro commented 3 years ago

Thanks for the things to try.  On Ubuntu 20.10 pip3 defaults to user installs, so mine is installed in my local directory.  Here is what I get:

$ python3 -m ocrmypdf
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 185, in _run_module_as_main
    mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
  File "/usr/lib/python3.8/runpy.py", line 144, in _get_module_details
    return _get_module_details(pkg_main_name, error)
  File "/usr/lib/python3.8/runpy.py", line 111, in _get_module_details
    __import__(pkg_name)
  File "/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__init__.py", line 13, in <module>
    from ocrmypdf.api import Verbosity, configure_logging, ocr
  File "/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/api.py", line 19, in <module>
    from ocrmypdf._plugin_manager import get_plugin_manager
  File "/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_plugin_manager.py", line 20, in <module>
    from ocrmypdf.cli import get_parser, plugins_only_parser
  File "/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/cli.py", line 476, in <module>
    plugins_only_parser = ArgumentParser(
  File "/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/cli.py", line 40, in __init__
    super().__init__(*args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'allow_abbrev'
frohro@frohro-260:~$ python3 -c 'import argparse; print(argparse.__file__)'
/home/frohro/.local/lib/python3.8/site-packages/argparse.py

This isn't the same computer, but a similar one with the same problem.

I'm not a python expert, so your help is appreciated!

Thanks,

Rob
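
The last command above reports argparse being imported from the user site-packages directory rather than from /usr/lib/python3.8. The standard-library argparse has accepted `allow_abbrev` since Python 3.5, so a copy that rejects it is presumably an older, separately installed module. That is an assumption: the thread does not establish where the file came from. A small sketch for checking whether the `ArgumentParser` actually being imported accepts the keyword; `accepts_keyword` and `OldStyleParser` are hypothetical names used only for illustration:

```python
import argparse
import inspect


def accepts_keyword(cls: type, keyword: str) -> bool:
    """True if cls.__init__ names `keyword` or takes arbitrary **kwargs."""
    params = inspect.signature(cls.__init__).parameters
    if keyword in params:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class OldStyleParser:
    """Hypothetical stand-in for a parser class that predates allow_abbrev."""

    def __init__(self, prog=None, description=None):
        self.prog = prog
        self.description = description


# Report which argparse file is in use and whether it knows the keyword.
print("argparse file:", argparse.__file__)
print("ArgumentParser accepts allow_abbrev:",
      accepts_keyword(argparse.ArgumentParser, "allow_abbrev"))
print("OldStyleParser accepts allow_abbrev:",
      accepts_keyword(OldStyleParser, "allow_abbrev"))
```

If the second line of output reports False for the real `ArgumentParser`, the traceback is explained by the argparse module being loaded rather than by ocrmypdf itself.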

jbarlow83 commented 3 years ago

I tried to reproduce this on Docker Ubuntu 20.10 but could not, either as root or as an unprivileged user calling pip.

Maybe you have an old version of pip or something other than Ubuntu pip installed?

You could try a virtual environment:

python -m venv venv
source venv/bin/activate  # assuming bash
pip install -U pip wheel setuptools  # upgrade pip+setuptools+wheel to latest
pip install ocrmypdf

You will then be able to use $PWD/venv/bin/ocrmypdf

frohro commented 3 years ago

Thanks, I tried it, but there was no ocrmypdf in the bin/ directory. Here is how it went.

$ pip3 install -U pip wheel setuptools
Requirement already satisfied: pip in /home/frohro/.local/lib/python3.8/site-packages (21.0.1)
Requirement already satisfied: wheel in /home/frohro/.local/lib/python3.8/site-packages (0.36.2)
Requirement already satisfied: setuptools in /home/frohro/.local/lib/python3.8/site-packages (53.0.0)
Collecting setuptools
  Downloading setuptools-54.0.0-py3-none-any.whl (784 kB)
     |████████████████████████████████| 784 kB 2.2 MB/s 
Installing collected packages: setuptools
  Attempting uninstall: setuptools
    Found existing installation: setuptools 53.0.0
    Not uninstalling setuptools at /home/frohro/.local/lib/python3.8/site-packages, outside environment /home/frohro/Downloads/venv
    Can't uninstall 'setuptools'. No files were found to uninstall.
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
west 0.8.0 requires colorama, which is not installed.
west 0.8.0 requires PyYAML>=5.1, which is not installed.
spyder 4.2.1 requires chardet>=2.0.0, which is not installed.
spyder 4.2.1 requires keyring>=17.0.0, which is not installed.
spyder 4.2.1 requires pexpect>=4.4.0, which is not installed.
spyder 4.2.1 requires pyxdg>=0.26; platform_system == "Linux", which is not installed.
sphinx 3.2.1 requires requests>=2.5.0, which is not installed.
scikit-build 0.11.1 requires distro, which is not installed.
jsonschema 3.2.0 requires six>=1.11.0, which is not installed.
ipython 7.19.0 requires pexpect>4.3; sys_platform != "win32", which is not installed.
breathe 4.23.0 requires six>=1.9, which is not installed.
Successfully installed setuptools-54.0.0
(venv) frohro@frohro-A475:~/Downloads$ pip3 install ocrmypdf
Requirement already satisfied: ocrmypdf in /home/frohro/.local/lib/python3.8/site-packages (11.6.2)
Collecting tqdm>=4
  Using cached tqdm-4.58.0-py2.py3-none-any.whl (73 kB)
Collecting pluggy<1.0,>=0.13.0
  Using cached pluggy-0.13.1-py2.py3-none-any.whl (18 kB)
Collecting img2pdf<0.5,>=0.3.0
  Using cached img2pdf-0.4.0-py3-none-any.whl
Collecting pdfminer.six!=20200720,<=20201018,>=20191110
  Using cached pdfminer.six-20201018-py3-none-any.whl (5.6 MB)
Collecting Pillow>=7.0.0
  Using cached Pillow-8.1.0-cp38-cp38-manylinux1_x86_64.whl (2.2 MB)
Collecting pikepdf<3,>=1.14.0
  Downloading pikepdf-2.8.0-cp38-cp38-manylinux2010_x86_64.whl (2.3 MB)
     |████████████████████████████████| 2.3 MB 2.3 MB/s 
Requirement already satisfied: cffi>=1.9.1 in /home/frohro/.local/lib/python3.8/site-packages (from ocrmypdf) (1.14.4)
Collecting reportlab>=3.3.0
  Using cached reportlab-3.5.59-cp38-cp38-manylinux2010_x86_64.whl (2.6 MB)
Requirement already satisfied: coloredlogs>=14.0 in /home/frohro/.local/lib/python3.8/site-packages (from ocrmypdf) (15.0)
Requirement already satisfied: pycparser in /home/frohro/.local/lib/python3.8/site-packages (from cffi>=1.9.1->ocrmypdf) (2.20)
Requirement already satisfied: humanfriendly>=9.1 in /home/frohro/.local/lib/python3.8/site-packages (from coloredlogs>=14.0->ocrmypdf) (9.1)
Collecting chardet
  Using cached chardet-4.0.0-py2.py3-none-any.whl (178 kB)
Collecting sortedcontainers
  Using cached sortedcontainers-2.3.0-py2.py3-none-any.whl (29 kB)
Collecting cryptography
  Using cached cryptography-3.4.6-cp36-abi3-manylinux2014_x86_64.whl (3.2 MB)
Collecting lxml>=4.0
  Using cached lxml-4.6.2-cp38-cp38-manylinux1_x86_64.whl (5.4 MB)
Installing collected packages: Pillow, lxml, sortedcontainers, pikepdf, cryptography, chardet, tqdm, reportlab, pluggy, pdfminer.six, img2pdf
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
spyder 4.2.1 requires keyring>=17.0.0, which is not installed.
spyder 4.2.1 requires pexpect>=4.4.0, which is not installed.
spyder 4.2.1 requires pyxdg>=0.26; platform_system == "Linux", which is not installed.
matplotlib 3.3.3 requires python-dateutil>=2.1, which is not installed.
manimlib 0.1.11 requires pycairo, which is not installed.
ipyvolume 0.6.0a6 requires requests, which is not installed.
Successfully installed Pillow-8.1.0 chardet-4.0.0 cryptography-3.4.6 img2pdf-0.4.0 lxml-4.6.2 pdfminer.six-20201018 pikepdf-2.8.0 pluggy-0.13.1 reportlab-3.5.59 sortedcontainers-2.3.0 tqdm-4.58.0
(venv) frohro@frohro-A475:~/Downloads$ which ocrmypdf
(venv) frohro@frohro-A475:~/Downloads$ cd bin
bash: cd: bin: No such file or directory
(venv) frohro@frohro-A475:~/Downloads$ cd venv/bin
(venv) frohro@frohro-A475:~/Downloads/venv/bin$ ls
activate       chardetect        img2pdf      pip3         python3
activate.csh   dumppdf.py        img2pdf-gui  pip3.8       tqdm
activate.fish  easy_install      pdf2txt.py   __pycache__
Activate.ps1   easy_install-3.8  pip          python
(venv) frohro@frohro-A475:~/Downloads/venv/bin$ sudo updatedb
[sudo] password for frohro: 
/usr/bin/find: '/run/user/1000/gvfs': Permission denied
(venv) frohro@frohro-A475:~/Downloads/venv/bin$ locate ocrmypdf
/home/frohro/.local/bin/ocrmypdf
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf-11.6.2.dist-info
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf-11.6.2.dist-info/INSTALLER
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf-11.6.2.dist-info/LICENSE
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf-11.6.2.dist-info/METADATA
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf-11.6.2.dist-info/RECORD
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf-11.6.2.dist-info/WHEEL
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf-11.6.2.dist-info/entry_points.txt
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf-11.6.2.dist-info/top_level.txt
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__init__.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__main__.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/__init__.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/__main__.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/_concurrent.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/_graft.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/_jobcontext.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/_logging.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/_pipeline.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/_plugin_manager.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/_sync.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/_unicodefun.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/_validation.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/_version.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/api.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/cli.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/exceptions.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/helpers.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/hocrtransform.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/leptonica.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/optimize.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/pdfa.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/pluginspec.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/__pycache__/quality.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_concurrent.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_exec
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_exec/__init__.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_exec/__pycache__
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_exec/__pycache__/__init__.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_exec/__pycache__/ghostscript.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_exec/__pycache__/jbig2enc.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_exec/__pycache__/pngquant.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_exec/__pycache__/tesseract.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_exec/__pycache__/unpaper.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_exec/ghostscript.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_exec/jbig2enc.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_exec/pngquant.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_exec/tesseract.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_exec/unpaper.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_graft.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_jobcontext.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_logging.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_pipeline.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_plugin_manager.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_sync.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_unicodefun.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_validation.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/_version.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/api.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/builtin_plugins
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/builtin_plugins/__init__.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/builtin_plugins/__pycache__
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/builtin_plugins/__pycache__/__init__.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/builtin_plugins/__pycache__/ghostscript.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/builtin_plugins/__pycache__/tesseract_ocr.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/builtin_plugins/ghostscript.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/builtin_plugins/tesseract_ocr.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/cli.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/data
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/data/sRGB.icc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/exceptions.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/helpers.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/hocrtransform.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/leptonica.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/lib
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/lib/__init__.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/lib/__pycache__
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/lib/__pycache__/__init__.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/lib/__pycache__/_leptonica.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/lib/__pycache__/compile_leptonica.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/lib/_leptonica.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/lib/compile_leptonica.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/optimize.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/pdfa.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/pdfinfo
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/pdfinfo/__init__.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/pdfinfo/__pycache__
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/pdfinfo/__pycache__/__init__.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/pdfinfo/__pycache__/info.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/pdfinfo/__pycache__/layout.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/pdfinfo/info.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/pdfinfo/layout.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/pluginspec.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/py.typed
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/quality.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/subprocess
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/subprocess/__init__.py
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/subprocess/__pycache__
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/subprocess/__pycache__/__init__.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/subprocess/__pycache__/_windows.cpython-38.pyc
/home/frohro/.local/lib/python3.8/site-packages/ocrmypdf/subprocess/_windows.py
(venv) frohro@frohro-A475:~/Downloads/venv/bin$

jbarlow83 commented 3 years ago

Here's a problem:

(venv) frohro@frohro-A475:~/Downloads$ pip3 install ocrmypdf
Requirement already satisfied: ocrmypdf in /home/frohro/.local/lib/python3.8/site-packages (11.6.2)

That means the venv was configured to reference your user packages, and because ocrmypdf was already installed in your user packages, pip didn't install it into the venv. So the venv didn't give you an isolated environment.

But the site-packages listing is interesting.

(I think you may have created a --system-site-packages venv. Read here for general background on what's going on with system packages, user packages and venvs: https://opensource.com/article/19/4/managing-python-packages)

You could also skip pip entirely and use apt install ocrmypdf. It's an older version but maybe it's enough.
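
Whether the active interpreter is really isolated can be inspected from Python itself. The sketch below assumes the venv was created by the standard `venv` module, which writes a `pyvenv.cfg` file containing an `include-system-site-packages` key. The function names are made up for illustration:

```python
import site
import sys
from pathlib import Path


def parse_pyvenv_cfg(text: str) -> dict:
    """Parse the simple 'key = value' lines of a pyvenv.cfg file."""
    settings = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            settings[key.strip().lower()] = value.strip()
    return settings


def describe_environment() -> dict:
    """Collect the facts that decide which site-packages directories are visible."""
    cfg_path = Path(sys.prefix) / "pyvenv.cfg"
    cfg = parse_pyvenv_cfg(cfg_path.read_text()) if cfg_path.is_file() else {}
    return {
        # In a venv, sys.prefix points at the venv and differs from sys.base_prefix.
        "in_venv": sys.prefix != sys.base_prefix,
        # "true" here means system (and user) packages leak into the venv.
        "include_system_site_packages": cfg.get("include-system-site-packages"),
        # Whether the per-user site-packages directory is enabled for imports.
        "user_site_enabled": site.ENABLE_USER_SITE,
        "user_site": site.getusersitepackages(),
    }


for key, value in describe_environment().items():
    print(f"{key}: {value}")
```

In an isolated venv one would expect `in_venv` to be True and `include_system_site_packages` to read `false`. Any other combination suggests that packages under ~/.local can still be seen.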

frohro commented 3 years ago

Here is what that file has in it:

#!/usr/bin/python3
# -*- coding: utf-8 -*-
import re
import sys
from ocrmypdf.__main__ import run
if __name__ == '__main__':
    sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0])
    sys.exit(run())

Note that the coding: utf-8 line is being formatted in a strange way by GitHub's editor here. If I install ocrmypdf using apt install, it has the very same problem as the pip3-installed version.

jbarlow83 commented 3 years ago

Sorry this continues to be trouble. While researching a different topic today, I happened to find the reasons behind the trouble you're having (and that many others are having, I think): https://gist.github.com/tiran/2dec9e03c6f901814f6d1e8dad09528e

In short, Debian and Ubuntu customize Python in ways that are not standard, break core assumptions, and lead to a poor user experience. The comments have some suggestions, if you want to give Ubuntu Python another chance.

Perhaps try using the ocrmypdf Docker image? https://ocrmypdf.readthedocs.io/en/latest/docker.html

jbarlow83 commented 3 years ago

As an aside, if you surround a block of code, or anything else you want treated literally, with triple backticks on GitHub, formatting is turned off for that block. https://docs.github.com/en/github/writing-on-github/basic-writing-and-formatting-syntax#quoting-code

jbarlow83 commented 2 years ago

Closing due to old version