atlanhq / camelot

Camelot: PDF Table Extraction for Humans
https://camelot-py.readthedocs.io

Getting PyPDF2 error while using the camelot library #493

Open shivambaldha1 opened 1 year ago

shivambaldha1 commented 1 year ago

PyPDF2 was recently updated, and now I get an error when I use Camelot's read_pdf function.

Please fix this issue.

LiuJeremy commented 1 year ago

pip uninstall PyPDF2===1.26.0. Camelot requires PyPDF2>=1.26.0, as mentioned in requirements.txt.

shivambaldha1 commented 1 year ago

Yes, I know, but PyPDF2 has been updated, and the update changed some of its functions.

ig3 commented 1 year ago

I just installed camelot using pip, following the installation instructions, and I get the same error when I use the camelot CLI:

$ camelot --output test.out -f json lattice cs1.pdf
Traceback (most recent call last):
  File "/home/ian/.local/bin/camelot", line 8, in <module>
    sys.exit(cli())
  File "/home/ian/.local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/ian/.local/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/ian/.local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ian/.local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ian/.local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/ian/.local/lib/python3.9/site-packages/click/decorators.py", line 84, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
  File "/home/ian/.local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/ian/.local/lib/python3.9/site-packages/camelot/cli.py", line 204, in lattice
    tables = read_pdf(
  File "/home/ian/.local/lib/python3.9/site-packages/camelot/io.py", line 113, in read_pdf
    tables = p.parse(
  File "/home/ian/.local/lib/python3.9/site-packages/camelot/handlers.py", line 172, in parse
    self._save_page(self.filepath, p, tempdir)
  File "/home/ian/.local/lib/python3.9/site-packages/camelot/handlers.py", line 111, in _save_page
    infile = PdfFileReader(fileobj, strict=False)
  File "/home/ian/.local/lib/python3.9/site-packages/PyPDF2/_reader.py", line 1974, in __init__
    deprecation_with_replacement("PdfFileReader", "PdfReader", "3.0.0")
  File "/home/ian/.local/lib/python3.9/site-packages/PyPDF2/_utils.py", line 369, in deprecation_with_replacement
    deprecation(DEPR_MSG_HAPPENED.format(old_name, removed_in, new_name))
  File "/home/ian/.local/lib/python3.9/site-packages/PyPDF2/_utils.py", line 351, in deprecation
    raise DeprecationError(msg)
PyPDF2.errors.DeprecationError: PdfFileReader is deprecated and was removed in PyPDF2 3.0.0. Use PdfReader instead.

ig3 commented 1 year ago

I solved the problem with:

$ pip uninstall PyPDF2
$ pip install PyPDF2==2.12.1

This rolls PyPDF2 back to version 2.12.1, the last release before 3.0.0. According to the PyPDF2 Change Log, 3.0.0 removed many deprecated features, including PdfFileReader.

drubanov commented 1 year ago

It doesn't look like camelot has had any commits in 5 years. I wouldn't count on this being fixed unless someone does it in a fork. The current version should have its requirements pinned to PyPDF2>=1.26.0,<=2.12.1
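
Until the requirement is pinned upstream, a script could check the installed PyPDF2 version itself before calling camelot. The following is a minimal sketch of that idea, assuming the range suggested above (PyPDF2>=1.26.0,<=2.12.1). The helper names parse_version, pypdf2_is_supported and installed_pypdf2_version are made up for illustration, and the parser only handles plain three-part numeric versions such as 2.12.1.

```python
from importlib import metadata

# Range suggested in this thread; it is not taken from camelot's own files.
MIN_SUPPORTED = (1, 26, 0)
MAX_SUPPORTED = (2, 12, 1)


def parse_version(text):
    """Turn a plain dotted version such as '2.12.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def pypdf2_is_supported(version_text):
    """Return True if the version lies inside the suggested range."""
    return MIN_SUPPORTED <= parse_version(version_text) <= MAX_SUPPORTED


def installed_pypdf2_version():
    """Return the installed PyPDF2 version string, or None if it is absent."""
    try:
        return metadata.version("PyPDF2")
    except metadata.PackageNotFoundError:
        return None
```

A caller could pass the result of installed_pypdf2_version() to pypdf2_is_supported() and print a hint to downgrade when the check fails, rather than running into the DeprecationError deep inside camelot.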

davimmilhome commented 9 months ago

Same issue, I'll try pip uninstall PyPDF2===1.26.0

deusdevok commented 7 months ago

You can always work with a virtual environment with PyPDF2==2.12.1 for example.

No need to uninstall the current version from your system.
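
For anyone who prefers to script this, here is a rough sketch of the virtual-environment approach using Python's standard venv module. The function names env_python and create_pinned_env are hypothetical, and the default pin is the version mentioned in this thread. Nothing here is specific to camelot.

```python
import subprocess
import sys
import venv
from pathlib import Path


def env_python(env_dir):
    """Return the path of the Python interpreter inside the environment."""
    env_dir = Path(env_dir)
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def create_pinned_env(env_dir, requirement="PyPDF2==2.12.1"):
    """Create a virtual environment and install the pinned requirement in it."""
    venv.create(env_dir, with_pip=True)
    # Run pip through the environment's own interpreter so the system-wide
    # installation is left untouched.
    subprocess.run(
        [str(env_python(env_dir)), "-m", "pip", "install", requirement],
        check=True,
    )
```

Calling create_pinned_env("camelot-env") (the directory name is only an example) would build the environment and install the pinned PyPDF2 into it. Camelot would then be installed into, and run from, that same environment.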