johnlinp / pdf-to-markdown

Convert PDF files into markdown files
BSD 3-Clause "New" or "Revised" License

IndexError: list index out of range (reopening of #15) #20

Open: aleksas opened this issue 5 years ago

aleksas commented 5 years ago

Getting IndexError: list index out of range (see below) when converting this PDF (Anbinderis_2010.pdf). Reopening of #15.

Parsing Anbinderis_2010.pdf
Traceback (most recent call last):
  File "/usr/local/bin/pdf2md", line 32, in <module>
    main(sys.argv)
  File "/usr/local/bin/pdf2md", line 27, in main
    writer.write(piles)
  File "/usr/local/lib/python2.7/dist-packages/pdf2md/writer.py", line 27, in write
    self._write_simple(piles)
  File "/usr/local/lib/python2.7/dist-packages/pdf2md/writer.py", line 50, in _write_simple
    markdown = pile.gen_markdown(self._syntax)
  File "/usr/local/lib/python2.7/dist-packages/pdf2md/pile.py", line 76, in gen_markdown
    return self._gen_table_markdown(syntax)
  File "/usr/local/lib/python2.7/dist-packages/pdf2md/pile.py", line 290, in _gen_table_markdown
    intermediate = self._gen_table_intermediate()
  File "/usr/local/lib/python2.7/dist-packages/pdf2md/pile.py", line 319, in _gen_table_intermediate
    bottom, rowspan = self._find_exist_coor(left, right, row_idx, horizontal_coor, 'horizontal')
  File "/usr/local/lib/python2.7/dist-packages/pdf2md/pile.py", line 357, in _find_exist_coor
    coor = line_coor[start_idx + span]
IndexError: list index out of range
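
For anyone triaging this: the last frame indexes `line_coor` at `start_idx + span`, so the error means that sum reached or passed `len(line_coor)`. Below is a minimal sketch of that failure and of a bounds-checked lookup. It is not the pdf2md source. `safe_coor` and the sample coordinates are hypothetical, and returning `None` is only one possible way to handle the missing line; the right fix in `_find_exist_coor` may well be different.

```python
def safe_coor(line_coor, start_idx, span):
    """Return line_coor[start_idx + span], or None when that index is out of range.

    Hypothetical helper for illustration only; not part of pdf2md.
    """
    idx = start_idx + span
    if 0 <= idx < len(line_coor):
        return line_coor[idx]
    return None


# Hypothetical coordinates of three horizontal table lines (made-up values).
horizontal_coor = [10.0, 20.0, 30.0]

# Unguarded lookup with the same shape as the failing line in the traceback:
# index 2 + 1 = 3 is past the end of a 3-element list.
try:
    horizontal_coor[2 + 1]
    raised = False
except IndexError:
    raised = True

# Guarded lookups: one inside the list, one past its end.
in_range = safe_coor(horizontal_coor, 1, 1)
out_of_range = safe_coor(horizontal_coor, 2, 1)
```

If the table in this PDF has a cell whose border line was never detected, a guard like this would let the caller fall back to the last known line instead of crashing, but that is an assumption about the cause rather than a confirmed diagnosis.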