libo26 / feedparser

Automatically exported from code.google.com/p/feedparser

python 3 ignores Content-Encoding header and doesn't decompress content #260

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
feedparser.parse('http://feedparser.org/docs/examples/atom10.xml') returns a
result with the bozo flag set:
'bozo_exception': SAXParseException('not well-formed (invalid token)',)

feedparser.parse(urllib.request.urlopen('http://feedparser.org/docs/examples/atom10.xml'))
works as expected.

I'm on Microsoft Windows Vista Home.
Python version: 3.2
Feedparser version: 5.0.1

Original issue reported on code.google.com by james31415@gmail.com on 5 Mar 2011 at 11:45

GoogleCodeExporter commented 9 years ago
Fixed in r378.

Python 2 normalizes HTTP header names to lowercase and Python 3 doesn't, so the
Content-Encoding lookup missed under Python 3 and the compressed body was passed
to the parser as-is. Because there weren't any tests to exercise the content
decompression code (gzip and zlib), the problem didn't show up in the unit
tests. I was able to reproduce the problem and it's now fixed for me, but please
download the very latest code [1] and try again. If this is still a problem,
please revisit this report, and if you run into another bug, don't hesitate to
open a new report!

[1]: https://feedparser.googlecode.com/svn/trunk/feedparser/feedparser.py
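
For anyone hitting the same class of bug in their own code, here is a minimal sketch of a case-insensitive Content-Encoding check followed by gzip/zlib decompression. It is not the code from r378. The helper names `get_header` and `decompress_body` are made up for illustration, and it assumes the response headers have already been copied into a plain dict whose keys may be in any case.

```python
import gzip
import io
import zlib


def get_header(headers, name, default=None):
    """Look up a header by name, ignoring case (hypothetical helper)."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return default


def decompress_body(headers, body):
    """Return body decompressed per its Content-Encoding (hypothetical helper)."""
    encoding = (get_header(headers, 'Content-Encoding', '') or '').strip().lower()
    if encoding == 'gzip':
        return gzip.GzipFile(fileobj=io.BytesIO(body)).read()
    if encoding == 'deflate':
        try:
            # Standard case: deflate data wrapped in a zlib header.
            return zlib.decompress(body)
        except zlib.error:
            # Some servers send raw deflate data without the zlib wrapper.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    # No recognized encoding: hand the body back untouched.
    return body
```

With this, `{'Content-Encoding': 'gzip'}` and `{'content-encoding': 'gzip'}` are treated the same, so the result no longer depends on whether the HTTP library lowercases header names.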

Original comment by kurtmckee on 5 Mar 2011 at 6:24