johnlinp / pdf-to-markdown

Convert PDF files into markdown files
BSD 3-Clause "New" or "Revised" License

Exception on LTLine and LTChar #3

Closed: bmaggi closed this issue 8 years ago

bmaggi commented 9 years ago

Parsing fails with the following two exceptions, one for LTLine and one for LTChar:

Traceback (most recent call last):
  File "main.py", line 30, in <module>
    main(sys.argv)
  File "main.py", line 17, in main
    piles = parser.parse()
  File "/Users/bma/git/pdf-to-markdown/pdf2md/parser.py", line 36, in parse
    piles += self._parse_page(page)
  File "/Users/bma/git/pdf-to-markdown/pdf2md/parser.py", line 60, in _parse_page
    pile.parse_layout(page)
  File "/Users/bma/git/pdf-to-markdown/pdf2md/pile.py", line 55, in parse_layout
    assert False, "Unrecognized type: %s" % type(obj)
AssertionError: Unrecognized type: <class 'pdfminer.layout.LTLine'>

Traceback (most recent call last):
  File "main.py", line 30, in <module>
    main(sys.argv)
  File "main.py", line 17, in main
    piles = parser.parse()
  File "/Users/bma/git/pdf-to-markdown/pdf2md/parser.py", line 36, in parse
    piles += self._parse_page(page)
  File "/Users/bma/git/pdf-to-markdown/pdf2md/parser.py", line 60, in _parse_page
    pile.parse_layout(page)
  File "/Users/bma/git/pdf-to-markdown/pdf2md/pile.py", line 52, in parse_layout
    assert False, "Unrecognized type: %s" % type(obj)
AssertionError: Unrecognized type: <class 'pdfminer.layout.LTChar'>
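
For illustration, one way to avoid the hard failure might be to skip layout objects the converter does not handle instead of asserting. The sketch below is hypothetical and is not the project's actual `pile.py`: `partition_layout`, `HANDLED`, and `IGNORABLE` are made-up names, and the three classes are stand-ins named after the `pdfminer.layout` classes so the example runs without pdfminer installed.

```python
import warnings

# Stand-ins named after pdfminer.layout classes (assumption: the real code
# would receive the actual pdfminer objects instead).
class LTTextBox: ...
class LTLine: ...
class LTChar: ...

HANDLED = {"LTTextBox"}            # assumption: types the converter can process
IGNORABLE = {"LTLine", "LTChar"}   # the two types reported in this issue

def partition_layout(objects):
    """Split layout objects into (handled, skipped_type_names) without asserting."""
    handled, skipped = [], []
    for obj in objects:
        name = type(obj).__name__
        if name in HANDLED:
            handled.append(obj)
        else:
            # Types outside both sets still get a warning, but do not abort the run.
            if name not in IGNORABLE:
                warnings.warn("Unrecognized type: %s" % name)
            skipped.append(name)
    return handled, skipped

handled, skipped = partition_layout([LTTextBox(), LTLine(), LTChar()])
```

Whether lines and stray characters should be silently dropped or turned into Markdown output (for example a horizontal rule) is a design decision for the maintainer; the sketch only shows the non-fatal dispatch.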