aaronsw / html2text

Convert HTML to Markdown-formatted text.
http://www.aaronsw.com/2002/html2text/
GNU General Public License v3.0

bad end tag error #24

eadmaster closed this issue 12 years ago

eadmaster commented 12 years ago
> python html2text.py http://www.hardcoregaming101.net/
Traceback (most recent call last):
  File "C:\SharedPrograms\html2text\html2text.py", line 491, in <module>
    wrapwrite(html2text(data, baseurl))
  File "C:\SharedPrograms\html2text\html2text.py", line 450, in html2text
    return optwrap(html2text_file(html, None, baseurl))
  File "C:\SharedPrograms\html2text\html2text.py", line 445, in html2text_file
    h.feed(html)
  File "C:\PortableApps\Python\Python26\lib\HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "C:\PortableApps\Python\Python26\lib\HTMLParser.py", line 150, in goahead
    k = self.parse_endtag(i)
  File "C:\PortableApps\Python\Python26\lib\HTMLParser.py", line 314, in parse_endtag
    self.error("bad end tag: %r" % (rawdata[i:j],))
  File "C:\PortableApps\Python\Python26\lib\HTMLParser.py", line 115, in error
    raise HTMLParseError(message, self.getpos())
HTMLParser.HTMLParseError: bad end tag: u'</a</p>', at line 416, column 113

aaronsw commented 12 years ago

This is a bug in Python's HTMLParser. You should probably file it upstream instead:

http://bugs.python.org/
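
Until the parser itself is more forgiving, one possible workaround is to repair the unterminated end tag before the page is handed to the parser. The sketch below is not part of html2text and was not proposed in this thread. The names `repair_end_tags` and `TagCollector` are hypothetical, and it uses the Python 3 module name `html.parser` (on Python 2.6 the module is `HTMLParser`). It assumes the only defect is an end tag missing its closing `>`, as in the `</a</p>` from the traceback.

```python
import re
from html.parser import HTMLParser

# Matches an end tag that runs straight into the next tag without its
# closing '>', e.g. the '</a' in '</a</p>'. The lookahead keeps the
# following '<' out of the match.
_UNTERMINATED_END_TAG = re.compile(r"</([A-Za-z][A-Za-z0-9]*)\s*(?=<)")


def repair_end_tags(html):
    """Insert the missing '>' into end tags such as '</a</p>' (hypothetical helper)."""
    return _UNTERMINATED_END_TAG.sub(r"</\1>", html)


class TagCollector(HTMLParser):
    """Records end tags in the order the parser reports them (illustration only)."""

    def __init__(self):
        super().__init__()
        self.end_tags = []

    def handle_endtag(self, tag):
        self.end_tags.append(tag)


# Same malformed pattern as in the reported page.
broken = '<p><a href="x">link</a</p>'
fixed = repair_end_tags(broken)

collector = TagCollector()
collector.feed(fixed)
collector.close()
```

In html2text the repaired string would be what gets passed to `h.feed(html)`. Well-formed end tags are left untouched, because the pattern only matches when a `<` follows the tag name directly.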