jjlee / mechanize

Stateful programmatic web browsing in Python, after Andy Lester's Perl module WWW::Mechanize.
http://wwwsearch.sourceforge.net/mechanize/

Fix HTTPGzipProcessor #51

Open jasonkotenko opened 13 years ago

jasonkotenko commented 13 years ago

Hi, we've been using mechanize and noticed that when a page is gzipped, mechanize chokes and spits out a stack trace. I found the HTTPGzipProcessor, but it appeared to have bit-rotted and did not work correctly with HTTPEquivProcessor. I fixed it up to correctly subclass addinfourl and to discard the Content-Encoding: gzip header after the content has been decompressed (otherwise clients try to decompress something that is already decompressed).

The version in this pull request works for both gzipped and non-gzipped pages, provided you include the HTTPGzipProcessor in your OpenerDirector (in our shop we have our own custom OpenerDirector).
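For readers who want to see the shape of the idea, here is a minimal sketch written against the Python 3 standard library (`urllib.request`), not against mechanize itself. It is not the code from this pull request. The class name `GzipResponseProcessor` is hypothetical, and the sketch only assumes what the description above states: decompress the body, wrap it in an `addinfourl`, and drop the `Content-Encoding` header so nothing downstream tries to decompress it a second time.

```python
import gzip
import io
import urllib.request
from urllib.response import addinfourl


class GzipResponseProcessor(urllib.request.BaseHandler):
    """Hypothetical handler; illustrates the approach, not mechanize's code."""

    def http_request(self, request):
        # Tell the server that a gzipped body is acceptable.
        request.add_header("Accept-Encoding", "gzip")
        return request

    def http_response(self, request, response):
        encoding = response.headers.get("Content-Encoding", "")
        if encoding.lower() != "gzip":
            # Non-gzipped pages pass through untouched.
            return response

        body = gzip.decompress(response.read())

        headers = response.headers
        # Drop the header so later consumers do not decompress again.
        del headers["Content-Encoding"]
        if "Content-Length" in headers:
            headers.replace_header("Content-Length", str(len(body)))

        # Re-wrap the decompressed bytes in a file-like response object.
        new_response = addinfourl(
            io.BytesIO(body), headers, response.url, response.code
        )
        new_response.msg = getattr(response, "msg", "")
        return new_response

    # Apply the same logic to HTTPS traffic.
    https_request = http_request
    https_response = http_response


# The processor only takes effect once it is installed in an opener.
opener = urllib.request.build_opener(GzipResponseProcessor())
```

With a custom OpenerDirector, the equivalent step would be adding the processor to that director's handler list. Where it sits relative to other response processors (such as one that parses http-equiv tags out of the body) matters, since those need to see decompressed content.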

Let me know if you have any concerns about the changes, or suggestions for how to do it better. I'll be writing up a blog post about this in the next few days at http://jasonkotenko.com.

Thanks, Jason Kotenko

jamesbroadhead commented 7 years ago

Thank you for your contribution to mechanize!

Following the process in #117, future work on mechanize will be occurring here: https://github.com/python-mechanize/mechanize.

Please re-file your PR there, where it will get attention and hopefully be merged.