atlanhq / camelot

Camelot: PDF Table Extraction for Humans
https://camelot-py.readthedocs.io

segmentation fault(core dumped) Linux python #391

Open vinayGummadavelly opened 4 years ago

vinayGummadavelly commented 4 years ago

import camelot

camelot.read_pdf('file_name.pdf', pages='all', flavor='lattice')

When I run the code above on a Linux machine, it fails with a 'Segmentation fault' (SIGSEGV) error. Could someone help me out?

vinayGummadavelly commented 4 years ago

@vinayak-mehta could you please have a look if you get a chance?

scheiblr commented 4 years ago

Hey guys, thanks for this promising tool. Unfortunately, I'm not able to test it, as I'm also getting a seg fault on Linux (Manjaro). I just tried this:

import camelot

tables = camelot.read_pdf('test.pdf')

I attached test.pdf. I'm using Anaconda with Python 3.7 and installed camelot 0.7.3.

nadolsw commented 4 years ago

I'm also hitting the same issue when running a Python script from the command line. Attempts to execute it within Jupyter result in the kernel dying.

I have however been successful using the CLI as documented here. Ex: camelot --format csv --output foo.csv lattice input.pdf
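For anyone who wants to use the CLI workaround from inside a script, here is a minimal sketch that shells out to the same command shown above. The helper names `camelot_cli_command` and `export_tables` are made up for illustration; only the command line itself comes from this thread.

```python
import subprocess


def camelot_cli_command(pdf_path, output_path, flavor="lattice", fmt="csv"):
    """Build the argument list for the CLI call quoted in this thread."""
    return ["camelot", "--format", fmt, "--output", output_path, flavor, pdf_path]


def export_tables(pdf_path, output_path, flavor="lattice"):
    """Run the camelot CLI in a separate process (hypothetical helper)."""
    # check=True raises CalledProcessError if the CLI exits with an error
    # or is killed by a signal, instead of taking this interpreter down.
    subprocess.run(camelot_cli_command(pdf_path, output_path, flavor), check=True)
```

Calling `export_tables("input.pdf", "foo.csv")` should be equivalent to the command above, assuming the `camelot` executable is on the PATH.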

Still hoping that someone can assist with the segmentation fault though.

UPDATE: Here are the segmentation fault details:

>>> tables = camelot.read_pdf(pdf)
Fatal Python error: Segmentation fault

Current thread 0x00007f45e0b46740 (most recent call first):
  File "/opt/anaconda3/lib/python3.7/site-packages/camelot/ext/ghostscript/_gsprint.py", line 171 in init_with_args
  File "/opt/anaconda3/lib/python3.7/site-packages/camelot/ext/ghostscript/__init__.py", line 39 in __init__
  File "/opt/anaconda3/lib/python3.7/site-packages/camelot/ext/ghostscript/__init__.py", line 95 in Ghostscript
  File "/opt/anaconda3/lib/python3.7/site-packages/camelot/parsers/lattice.py", line 220 in _generate_image
  File "/opt/anaconda3/lib/python3.7/site-packages/camelot/parsers/lattice.py", line 403 in extract_tables
  File "/opt/anaconda3/lib/python3.7/site-packages/camelot/handlers.py", line 172 in parse
  File "/opt/anaconda3/lib/python3.7/site-packages/camelot/io.py", line 117 in read_pdf
  File "", line 1 in
Segmentation fault
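If you only see a bare "Segmentation fault" and want a Python-level trace like the one above, the standard-library `faulthandler` module can print one when native code crashes. This is a sketch; the helper names `enable_crash_traces` and `read_tables` are hypothetical, and the script only helps locate the crash, it does not fix it.

```python
import faulthandler
import sys


def enable_crash_traces(stream=None):
    """Print the Python stack to `stream` if the process gets a fatal signal."""
    # faulthandler needs a real file descriptor, so default to the original stderr.
    faulthandler.enable(file=stream if stream is not None else sys.__stderr__,
                        all_threads=True)


def read_tables(pdf_path, flavor="lattice"):
    """Read tables with crash traces switched on first (hypothetical helper)."""
    enable_crash_traces()
    import camelot  # imported after the handler is installed
    return camelot.read_pdf(pdf_path, flavor=flavor)
```

The same effect is available without code changes by running `python -X faulthandler your_script.py` or by setting the `PYTHONFAULTHANDLER` environment variable.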

nadolsw commented 4 years ago

Success! I was able to resolve the segmentation error by specifying which 'flavor' to use when reading the table (lattice vs. stream).

As an example: tables = camelot.read_pdf(pdf, flavor='stream')
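Note that a segmentation fault kills the whole interpreter, so it cannot be caught with try/except. If you want to try lattice first and fall back to stream only when lattice crashes, one option is to probe lattice in a child process. This is a rough sketch for POSIX systems; `crashes_in_child` and `pick_flavor` are hypothetical names, the PDF gets parsed twice when lattice works, and stream may give different results from lattice on tables drawn with ruling lines.

```python
import subprocess
import sys

# Child script: try the lattice parser on the PDF path passed as argv[1].
_PROBE = "import sys, camelot; camelot.read_pdf(sys.argv[1], flavor='lattice')"


def crashes_in_child(code, *args):
    """Run `code` in a fresh interpreter; True if it was killed by a signal."""
    proc = subprocess.run(
        [sys.executable, "-c", code, *args],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # On POSIX, a negative return code means the child died from a signal
    # (for example -11 for SIGSEGV).
    return proc.returncode < 0


def pick_flavor(pdf_path):
    """Return 'stream' if the lattice probe crashes the child, else 'lattice'."""
    return "stream" if crashes_in_child(_PROBE, pdf_path) else "lattice"
```

Usage would look like `tables = camelot.read_pdf(pdf, flavor=pick_flavor(pdf))`.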

Binhao-Wang commented 4 years ago

Thank you @nadolsw! It really works!

Binhao-Wang commented 4 years ago

But it seems flavor='lattice' still doesn't work. Can anyone help?

raniphore commented 2 years ago

Any idea how to fix this? I need the lattice flavor for parsing the PDF.

optimizasean commented 2 years ago

I see this problem on Ubuntu 20.04 as well. The lattice flavor always results in a seg fault, although with much less info than the above: I just get Segmentation fault (core dumped) when I use lattice, whether through cmd, the Python shell, or a script.

Cassieyy commented 6 days ago

"Any idea how to fix this? I need the lattice flavor for parsing the PDF."

Reinstalling camelot with conda install -c conda-forge camelot-py works for me!
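When comparing a working install with a crashing one, it can help to post which interpreter and which Ghostscript shared library are in play, since the traceback earlier in the thread ends inside camelot's Ghostscript wrapper. The sketch below assumes the library is discoverable under the name "gs"; that is an assumption, not something confirmed in this thread, and `describe_environment` is a hypothetical helper.

```python
import ctypes.util
import sys


def describe_environment():
    """Collect basic details worth including in a crash report."""
    return {
        "python": sys.version.split()[0],
        "executable": sys.executable,
        # None means no library named "gs" was found on the default search path.
        "libgs": ctypes.util.find_library("gs"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```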