py-pdf / pypdf_table_extraction

A Python library to extract tabular data from PDFs
https://camelot-py.readthedocs.io
MIT License

ModuleNotFoundError: No module named 'ghostscript', but ghostscript installed #14

Closed by halloleo 1 week ago

halloleo commented 3 months ago

When doing a lattice conversion I get the following error:

  File "/my/venv/path/lib/python3.8/site-packages/camelot/backends/ghostscript_backend.py", line 34, in convert
    import ghostscript
ModuleNotFoundError: No module named 'ghostscript'

Steps to reproduce the bug

Steps used to install camelot

  1. brew install ghostscript
  2. git clone https://github.com/py-pdf/pypdf_table_extraction.git
  3. pip install ".[base]"
  4. pip install opencv-python

Steps to be used to reproduce behavior:

  1. Run camelot -f csv -o out.csv lattice in.pdf

Expected behavior

A table in out.csv

Environment

Additional context

I checked in the Python REPL of the pypdf-table-extraction venv:

>>> from ctypes.util import find_library
>>> find_library("gs")
'/opt/homebrew/lib/libgs.dylib'

but

>>> import ghostscript
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'ghostscript'

foarsitter commented 3 months ago

pip install ghostscript will do the trick. As far as I know, [tool.poetry.group.*.dependencies] are not compatible with pip extras.
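
The two checks from the report look at different things: find_library("gs") locates the Ghostscript shared library that brew installed, while import ghostscript needs the separate Python bindings from PyPI. A minimal sketch that reports both at once, using only the standard library, might look like this. The function name ghostscript_status and its parameters are illustrative assumptions, not part of camelot or pypdf_table_extraction.

```python
from ctypes.util import find_library
from importlib.util import find_spec


def ghostscript_status(module_name="ghostscript", library_name="gs"):
    """Report the Python bindings and the shared library separately.

    Hypothetical helper for illustration; not part of any library.
    """
    return {
        # True if the module can be imported from the current environment
        "python_module": find_spec(module_name) is not None,
        # Path or name of the shared library, or None if it was not found
        "shared_library": find_library(library_name),
    }


if __name__ == "__main__":
    for name, value in ghostscript_status().items():
        print(f"{name}: {value}")
```

If python_module is False while shared_library is set, the situation matches this issue, and installing the bindings into the same venv with pip install ghostscript should resolve it.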

halloleo commented 2 months ago

This worked. Thanks.