jjlee / mechanize

Stateful programmatic web browsing in Python, after Andy Lester's Perl module WWW::Mechanize.
http://wwwsearch.sourceforge.net/mechanize/

some simple error recovering support + image support #43

Open albertz opened 13 years ago

albertz commented 13 years ago

This failed earlier:

In [6]: br = mechanize.Browser()

In [8]: br.open("http://9-eyes.com")
Out[8]: <response_seek_wrapper at 0x101c165a8 whose wrapped object = <closeable_response at 0x101c19440 whose fp = <socket._fileobject object at 0x101c07d70>>>

In [9]: br.title()
---------------------------------------------------------------------------
ParseError                                Traceback (most recent call last)

/Users/az/Programmierung/9eyes-fetcher/<ipython console> in <module>()

/Users/az/Programmierung/mechanize/mechanize/_mechanize.pyc in title(self)
    458         if not self.viewing_html():
    459             raise BrowserStateError("not viewing HTML")
--> 460         return self._factory.title
    461 
    462     def select_form(self, name=None, predicate=None, nr=None):

/Users/az/Programmierung/mechanize/mechanize/_html.pyc in __getattr__(self, name)
    537         elif name == "title":
    538             if self.is_html:
--> 539                 self.title = self._title_factory.title()
    540             else:
    541                 self.title = None

/Users/az/Programmierung/mechanize/mechanize/_html.pyc in title(self)
    285                 return self._get_title_text(p)
    286         except sgmllib.SGMLParseError, exc:
--> 287             raise _form.ParseError(exc)
    288 
    289 

ParseError: expected name token at '<!<!DOCTYPE html PUB'

Now it works:

In [5]: br.title()
parser exception: expected name token at '<!<!DOCTYPE html PUB'
Out[5]: 'Jon Rafman'

The "parser exception" debug print shown here is commented out in the commit.
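
For illustration only, here is a minimal sketch of the same idea (getting a title out of a page whose markup starts with a doubled `<!`). It is not the code from this commit and does not use mechanize's sgmllib-based parser: it assumes Python 3 and the standard library's `html.parser`, which treats the malformed declaration as a bogus comment instead of raising. `TitleParser`, `extract_title`, and the sample markup are hypothetical names and data.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text of the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is of interest.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._chunks.append(data)

    @property
    def title(self):
        # None when no title text was seen at all.
        return "".join(self._chunks).strip() if self._chunks else None


def extract_title(markup):
    """Return the page title, or None, without raising on odd declarations."""
    parser = TitleParser()
    parser.feed(markup)
    parser.close()
    return parser.title


# Made-up document with a doubled "<!" prefix, similar to the one in the traceback.
SAMPLE = "<!<!DOCTYPE html><html><head><title>Jon Rafman</title></head><body></body></html>"

if __name__ == "__main__":
    print(extract_title(SAMPLE))
```

The design choice here is to tolerate the bad declaration and keep parsing, rather than to abort the whole title lookup on the first parse error.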


Also, I added image support, i.e. you can iterate over all img tags via Browser.images.
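
As a rough illustration of what iterating over img tags involves, the sketch below collects image URLs from a page with Python 3's standard `html.parser`. It is not the `Browser.images` implementation from this PR, whose exact interface is not shown in this thread; `ImageCollector`, `iter_images`, and the example URLs are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Record (absolute_url, alt_text) for every <img> tag (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag != "img":
            return
        attr_map = dict(attrs)
        src = attr_map.get("src")
        if src:
            # Resolve relative paths against the page URL.
            absolute = urljoin(self.base_url, src)
            self.images.append((absolute, attr_map.get("alt")))


def iter_images(markup, base_url):
    """Iterate over the images found in markup, in document order."""
    collector = ImageCollector(base_url)
    collector.feed(markup)
    collector.close()
    return iter(collector.images)


if __name__ == "__main__":
    page = '<p><img src="a.png" alt="A"><img src="/img/b.jpg"/></p>'
    for url, alt in iter_images(page, "http://example.com/gallery/"):
        print(url, alt)
```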

albertz commented 12 years ago

Ping. What about it?

jamesbroadhead commented 7 years ago

Thank you for your contribution to mechanize!

Following the process in #117, future work on mechanize will be occurring here: https://github.com/python-mechanize/mechanize.

Please re-file your PR there, where it will get attention and hopefully be merged.