jjlee / mechanize

Stateful programmatic web browsing in Python, after Andy Lester's Perl module WWW::Mechanize.
http://wwwsearch.sourceforge.net/mechanize/

ParseError on unexpected name tokens #81

Open malexmave opened 11 years ago

malexmave commented 11 years ago

Hello,

If I parse a site whose form has an input with a somewhat strange name token (don't look at me, the people who wrote the page are pretty horrible), I get a parse error:

Python 2.7.3 (default, Sep 26 2012, 21:51:14) 
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import mechanize
>>> br = mechanize.Browser()
>>> br.open("https://www.stine.uni-hamburg.de/")
<response_seek_wrapper at 0x28644d0 whose wrapped object = <closeable_response at 0x28641b8 whose fp = <socket._fileobject object at 0x7ff8f912fc50>>>
>>> br.select_form(name="loginform")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/mechanize-0.2.5-py2.7.egg/mechanize/_mechanize.py", line 499, in select_form
    global_form = self._factory.global_form
  File "/usr/local/lib/python2.7/dist-packages/mechanize-0.2.5-py2.7.egg/mechanize/_html.py", line 544, in __getattr__
    self.forms()
  File "/usr/local/lib/python2.7/dist-packages/mechanize-0.2.5-py2.7.egg/mechanize/_html.py", line 557, in forms
    self._forms_factory.forms())
  File "/usr/local/lib/python2.7/dist-packages/mechanize-0.2.5-py2.7.egg/mechanize/_html.py", line 237, in forms
    _urlunparse=_rfc3986.urlunsplit,
  File "/usr/local/lib/python2.7/dist-packages/mechanize-0.2.5-py2.7.egg/mechanize/_form.py", line 844, in ParseResponseEx
    _urlunparse=_urlunparse,
  File "/usr/local/lib/python2.7/dist-packages/mechanize-0.2.5-py2.7.egg/mechanize/_form.py", line 981, in _ParseFileEx
    fp.feed(data)
  File "/usr/local/lib/python2.7/dist-packages/mechanize-0.2.5-py2.7.egg/mechanize/_form.py", line 760, in feed
    raise ParseError(exc)
mechanize._form.ParseError: expected name token at '<!$MG_SESSIONNO>" />'

This is Python 2.7.3 on Linux Mint 14 x64, although the platform most likely makes no difference.

This bug may be related to #63 and/or #71.
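A possible workaround, as a minimal sketch only: strip the malformed `<!...>` declaration from the HTML before the form parser sees it. The helper name `strip_bogus_declarations`, the regular expression, and the sample markup are all assumptions made for illustration; the real page's markup is only known from the fragment in the error message.

```python
import re

# Matches "<!" followed by something that cannot start a valid declaration.
# Comments ("<!--"), "<!DOCTYPE ...>" and "<![CDATA[" are deliberately left alone.
_BOGUS_DECL = re.compile(r"<!(?![A-Za-z\[]|--)[^>]*>")


def strip_bogus_declarations(html):
    """Remove markup such as '<!$MG_SESSIONNO>' that trips strict parsers.

    Hypothetical helper, not part of mechanize. Works on text (str).
    """
    return _BOGUS_DECL.sub("", html)


# Hypothetical sample: only the tail '<!$MG_SESSIONNO>" />' is taken from
# the traceback; the rest of the form is invented for the example.
sample = (
    '<form name="loginform">'
    '<input name="sessionno" value="<!$MG_SESSIONNO>" />'
    '</form>'
)
cleaned = strip_bogus_declarations(sample)
print(cleaned)
```

With mechanize itself, the cleaned text could presumably be fed back by reading the body with `br.response().get_data()`, calling `set_data()` on that response object, and passing it to `br.set_response()` before `select_form()`. I have not verified this against the site above.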

Sincerely, malexmave