jjlee / mechanize

Stateful programmatic web browsing in Python, after Andy Lester's Perl module WWW::Mechanize.
http://wwwsearch.sourceforge.net/mechanize/

Traceback on unknown encoding #30

Closed jjlee closed 13 years ago

jjlee commented 13 years ago

To reproduce:

import mechanize
import mechanize._response

response = mechanize._response.test_response(
    "&lt;",
    headers=[("Content-type", "text/html; charset=\"bogus\"")])
browser = mechanize.Browser()
browser.set_response(response)
browser.forms()

Expect: no traceback (falls back to default encoding)

Got:

Traceback (most recent call last):
  File "/home/john/dev/tst.py", line 93, in <module>
    browser.forms()
  File "/home/john/dev/mechanize/mechanize/_mechanize.py", line 420, in forms
    return self._factory.forms()
  File "/home/john/dev/mechanize/mechanize/_html.py", line 549, in forms
    self._forms_factory.forms())
  File "/home/john/dev/mechanize/mechanize/_html.py", line 229, in forms
    _urlunparse=_rfc3986.urlunsplit,
  File "/home/john/dev/mechanize/mechanize/_form.py", line 844, in ParseResponseEx
    _urlunparse=_urlunparse,
  File "/home/john/dev/mechanize/mechanize/_form.py", line 981, in _ParseFileEx
    fp.feed(data)
  File "/home/john/dev/mechanize/mechanize/_form.py", line 758, in feed
    _sgmllib_copy.SGMLParser.feed(self, data)
  File "/home/john/dev/mechanize/mechanize/_sgmllib_copy.py", line 110, in feed
    self.goahead(0)
  File "/home/john/dev/mechanize/mechanize/_sgmllib_copy.py", line 199, in goahead
    self.handle_entityref(name)
  File "/home/john/dev/mechanize/mechanize/_form.py", line 650, in handle_entityref
    '&%s;' % name, self._entitydefs, self._encoding))
  File "/home/john/dev/mechanize/mechanize/_form.py", line 143, in unescape
    return re.sub(r"&#?[A-Za-z0-9]+?;", replace_entities, data)
  File "/usr/lib/python2.6/re.py", line 151, in sub
    return _compile(pattern, 0).sub(repl, string, count)
  File "/home/john/dev/mechanize/mechanize/_form.py", line 135, in replace_entities
    repl = repl.encode(encoding)
LookupError: unknown encoding: bogus

jjlee commented 13 years ago

Fall back to another encoding if an unknown one is declared
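
For illustration only, here is a rough sketch of what such a fallback might look like. It is not the code from the closing commit. The helper names `resolve_encoding` and `encode_entity` are hypothetical, and the `"latin-1"` default is an assumption standing in for whatever default the library actually uses. The idea is to ask `codecs.lookup` whether the declared charset is known before calling `.encode()`, so that a bogus charset never reaches the call that raised `LookupError` in the traceback.

```python
import codecs

# Assumption: a stand-in default encoding, not necessarily mechanize's own.
DEFAULT_ENCODING = "latin-1"


def resolve_encoding(declared, default=DEFAULT_ENCODING):
    """Return `declared` if Python has a codec for it, otherwise `default`.

    Hypothetical helper; codecs.lookup raises LookupError for unknown names.
    """
    if not declared:
        return default
    try:
        codecs.lookup(declared)
    except LookupError:
        return default
    return declared


def encode_entity(text, declared):
    """Encode replacement text with the resolved (possibly fallback) encoding."""
    return text.encode(resolve_encoding(declared))


if __name__ == "__main__":
    # "bogus" is the charset from the report; no codec exists for it.
    print(resolve_encoding("bogus"))
    print(resolve_encoding("utf-8"))
    print(encode_entity(u"<", "bogus"))
```

Checking the codec once, when the charset is first read from the Content-type header, would probably be cheaper than checking on every entity, but the per-call version above keeps the sketch short.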

Closed by 261c3ebaa98590cf58f7e62ddc9b7264e318b558