atlanhq / camelot

Camelot: PDF Table Extraction for Humans
https://camelot-py.readthedocs.io

ZeroDivisionError: float division by zero #474

Open juthaip opened 2 years ago

juthaip commented 2 years ago

I am trying to read a PDF file written in Thai with Camelot, but I get the following error:

ZeroDivisionError                         Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_10952/542098112.py in <module>
----> 1 tables = camelot.read_pdf('../comsci61.pdf',flavor='stream',table_regions=["234.35,48.59,435.24,560.16"])
      2 tables

~\anaconda3\envs\openD\lib\site-packages\camelot\io.py in read_pdf(filepath, pages, password, flavor, suppress_stdout, layout_kwargs, **kwargs)
    111         p = PDFHandler(filepath, pages=pages, password=password)
    112         kwargs = remove_extra(kwargs, flavor=flavor)
--> 113         tables = p.parse(
    114             flavor=flavor,
    115             suppress_stdout=suppress_stdout,

~\anaconda3\envs\openD\lib\site-packages\camelot\handlers.py in parse(self, flavor, suppress_stdout, layout_kwargs, **kwargs)
    174             parser = Lattice(**kwargs) if flavor == "lattice" else Stream(**kwargs)
    175             for p in pages:
--> 176                 t = parser.extract_tables(
    177                     p, suppress_stdout=suppress_stdout, layout_kwargs=layout_kwargs
    178                 )

~\anaconda3\envs\openD\lib\site-packages\camelot\parsers\stream.py in extract_tables(self, filename, suppress_stdout, layout_kwargs)
    454             return []
    455
--> 456         self._generate_table_bbox()
    457
    458         _tables = []

~\anaconda3\envs\openD\lib\site-packages\camelot\parsers\stream.py in _generate_table_bbox(self)
    308                 hor_text.extend(region_text)
    309             # find tables based on nurminen's detection algorithm
--> 310             table_bbox = self._nurminen_table_detection(hor_text)
    311         else:
    312             table_bbox = {}

~\anaconda3\envs\openD\lib\site-packages\camelot\parsers\stream.py in _nurminen_table_detection(self, textlines)
    285         self.textedges.extend(relevant_textedges)
    286         # guess table areas using textlines and relevant edges
--> 287         table_bbox = textedges.get_table_areas(textlines, relevant_textedges)
    288         # treat whole page as table area if no table areas found
    289         if not len(table_bbox):

~\anaconda3\envs\openD\lib\site-packages\camelot\core.py in get_table_areas(self, textlines, relevant_textedges)
    219                 )
    220                 table_areas[updated_area] = None
--> 221         average_textline_height = sum_textline_height / float(len(textlines))
    222
    223         # add some padding to table areas

ZeroDivisionError: float division by zero

How should I solve the problem?
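
For reference, the last frame divides by `float(len(textlines))`, so the error implies that no text lines were found inside the requested region. The sketch below is not Camelot's code: `average_textline_height`, `safe_average_textline_height`, and `normalize_region` are hypothetical names used only to reproduce that arithmetic in isolation. It also assumes that `table_regions` expects `x1,y1,x2,y2` with (x1, y1) as the left-top corner and (x2, y2) as the right-bottom corner in PDF coordinate space, where the origin is at the bottom-left and the top edge therefore has the larger y value. If that assumption holds, the region in the call above has its y values in the opposite order, which may be why it matches no text.

```python
def average_textline_height(heights):
    """Same arithmetic as the failing line: total height / number of text lines."""
    return sum(heights) / float(len(heights))


def safe_average_textline_height(heights, default=0.0):
    """Hypothetical guard: fall back to `default` when no text lines were found."""
    if not heights:
        return default
    return sum(heights) / float(len(heights))


def normalize_region(region):
    """Hypothetical helper: reorder "x1,y1,x2,y2" so that (x1, y1) is left-top
    and (x2, y2) is right-bottom, assuming a bottom-left PDF origin."""
    x1, y1, x2, y2 = (float(v) for v in region.split(","))
    left, right = min(x1, x2), max(x1, x2)
    top, bottom = max(y1, y2), min(y1, y2)
    return f"{left},{top},{right},{bottom}"


if __name__ == "__main__":
    # An empty list of text lines reproduces the error from the traceback.
    try:
        average_textline_height([])
    except ZeroDivisionError as exc:
        print("reproduced:", exc)

    # The guarded version returns the fallback instead of raising.
    print(safe_average_textline_height([]))

    # Region string from the call above, reordered under the stated assumption.
    print(normalize_region("234.35,48.59,435.24,560.16"))
```

Under the same assumption, passing the reordered string to `table_regions` would be one thing to try; whether it resolves the error for this particular PDF is not confirmed here.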