gesomax / httplib2

Automatically exported from code.google.com/p/httplib2

Error -3 while decompressing data: incorrect header check #69

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. Subscribe to http://tomayko.com/feed with Venus

What is the expected output? What do you see instead?

Expected: the feed. Instead, an exception:

>> 1252517351.650019 ERROR Error processing http://tomayko.com/feed
>> 1252517351.651100 ERROR error: Error -3 while decompressing data: incorrect header check
>> 1252517351.651241 ERROR   File "/home/rubys/bzr/venus/planet/spider.py", line 316, in httpThread
>>     (resp, content) = h.request(idna, 'GET', headers=headers)
>> 1252517351.651324 ERROR   File "/home/rubys/bzr/venus/planet/vendor/httplib2/__init__.py", line 1071, in request
>>     (response, new_content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
>> 1252517351.651405 ERROR   File "/home/rubys/bzr/venus/planet/vendor/httplib2/__init__.py", line 887, in _request
>>     (response, content) = self._conn_request(conn, request_uri, method, body, headers)
>> 1252517351.651485 ERROR   File "/home/rubys/bzr/venus/planet/vendor/httplib2/__init__.py", line 873, in _conn_request
>>     content = _decompressContent(response, content)
>> 1252517351.651564 ERROR   File "/home/rubys/bzr/venus/planet/vendor/httplib2/__init__.py", line 351, in _decompressContent
>>     content = zlib.decompress(content)

Original issue reported on code.google.com by sa3ruby@gmail.com on 9 Sep 2009 at 9:50

GoogleCodeExporter commented 8 years ago
I personally think this should throw a httplib2.FailedToDecompressContent instead.
I'm attaching a patch that fixes this.

Original comment by zelo.z...@gmail.com on 10 Sep 2009 at 7:32

GoogleCodeExporter commented 8 years ago
Considering that the only headers that I pass are If-None-Match and If-Modified-Since, I'm unhappy with "ERROR HttpLib2Error: Content purported to be compressed with deflate but failed to decompress" being the result.

Original comment by sa3ruby@gmail.com on 10 Sep 2009 at 11:48

GoogleCodeExporter commented 8 years ago
While you only pass If-None-Match and If-Modified-Since, httplib2 also automatically adds "Accept-Encoding: deflate, gzip" if you don't set Accept-Encoding yourself; see http://code.google.com/p/httplib2/source/browse/httplib2/__init__.py#1002

Obviously, the response carries the 'Content-Encoding: deflate' header if you get this error.
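
The defaulting behaviour described here can be sketched roughly as follows. This is an illustration only, not httplib2's actual source: `with_default_accept_encoding` and its return shape are hypothetical names made up for the example. Besides adding the header, it reports whether the header was added automatically, which is the distinction the rest of this thread turns on.

```python
def with_default_accept_encoding(headers=None, default="deflate, gzip"):
    """Return (headers, auto_added) with header names lower-cased.

    Hypothetical helper: adds Accept-Encoding only when the caller
    did not supply one, and reports whether it did so.
    """
    normalized = {name.lower(): value for name, value in (headers or {}).items()}
    auto_added = "accept-encoding" not in normalized
    if auto_added:
        normalized["accept-encoding"] = default
    return normalized, auto_added


# The caller passes only conditional-request headers, as in the report above.
headers, auto_added = with_default_accept_encoding(
    {"If-None-Match": '"abc"', "If-Modified-Since": "Wed, 09 Sep 2009 00:00:00 GMT"}
)
print(headers["accept-encoding"], auto_added)
```

A caller who sets Accept-Encoding explicitly (for example to `identity`) keeps their own value and gets `auto_added == False`.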

Original comment by zelo.z...@gmail.com on 10 Sep 2009 at 12:17

GoogleCodeExporter commented 8 years ago
I'm suggesting that if httplib2 automatically adds the Accept-Encoding, it shouldn't raise an exception if the request would have succeeded but for this addition.

Original comment by sa3ruby@gmail.com on 10 Sep 2009 at 1:11

GoogleCodeExporter commented 8 years ago
I agree, I think the FailedToDecompressContent should be handled internally by httplib2.

However, this was already discussed in issue #49 which is now closed wontfix.

Original comment by zelo.z...@gmail.com on 10 Sep 2009 at 1:26

GoogleCodeExporter commented 8 years ago
httplib2 v0.4 used 'compress, gzip' as its default encoding, whereas v0.5 changed this to 'deflate, gzip'. I did not see this error in the past, but now about five of 500 URLs are mangled (the first two characters of the content are missing).
Changing the order to 'gzip, deflate' is an easy workaround (which effectively avoids deflate-encoded data on the broken servers in my list).
I think this would be a nice default?
Further, I also think that this error should raise FailedToDecompressContent, and it should be left up to the application to re-request or fail.

Original comment by benjamin...@gmail.com on 14 Sep 2009 at 8:55

GoogleCodeExporter commented 8 years ago
As a quick first fix I've reversed the order of gzip and deflate in the request header, which does fix the reported problem.

The longer term fix should be to do-the-right-thing, and redo the request with the Accept-Encoding header dropped if the decompression fails. But maybe that should only happen if httplib2 added the Accept-Encoding header itself, but raise an exception if it was a user added Accept-Encoding header?
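
A minimal sketch of that retry policy, under stated assumptions: `request_with_fallback`, `do_request`, and `fake_broken_server` are hypothetical names, and `do_request` stands in for whatever performs the HTTP request and decoding. None of this is httplib2 code. The point is only the control flow: retry without Accept-Encoding when the library added the header itself, and re-raise when the caller asked for compression.

```python
import zlib


def request_with_fallback(do_request, headers=None):
    """Call do_request(headers); retry without Accept-Encoding if decoding fails.

    do_request is a hypothetical callable that returns the decoded body
    and raises zlib.error when the compressed payload cannot be decoded.
    """
    headers = {name.lower(): value for name, value in (headers or {}).items()}
    auto_added = "accept-encoding" not in headers
    if auto_added:
        headers["accept-encoding"] = "gzip, deflate"
    try:
        return do_request(headers)
    except zlib.error:
        if not auto_added:
            # The caller requested compression explicitly: surface the error.
            raise
        # We added the header ourselves, so retry with it dropped.
        retry_headers = {k: v for k, v in headers.items() if k != "accept-encoding"}
        return do_request(retry_headers)


def fake_broken_server(headers):
    # Stand-in for a server whose compressed responses cannot be decoded.
    if "accept-encoding" in headers:
        return zlib.decompress(b"not a zlib stream")
    return b"<feed/>"


print(request_with_fallback(fake_broken_server, {"If-None-Match": '"abc"'}))
```

The cost of this design is a second round trip for every broken server, which is why it only makes sense as a fallback and not as the default path.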

Leaving open for now until a final fix is in place.

Original comment by joe.gregorio@gmail.com on 29 Sep 2009 at 9:47

GoogleCodeExporter commented 8 years ago
This bug is still present in httplib2-0.6; changing the Accept-Encoding order still works. I assume this to be a bug in the web server or load balancer. For instance, "Sandpiper Footprint http load balancer 4.5" at www.zdf.de is affected.

Original comment by benjamin...@gmail.com on 27 Jan 2010 at 12:30

GoogleCodeExporter commented 8 years ago
Issue 95 has been merged into this issue.

Original comment by joe.gregorio@gmail.com on 14 May 2010 at 4:02

GoogleCodeExporter commented 8 years ago
This bug is most likely due to server errors. The HTTP spec for 
Content-Encoding is unclear on raw "deflate" vs "zlib" format. MSIE<6 had a bug 
where it only accepted raw deflate format, instead of the actually mandated 
zlib stream. Hence, many servers also implement it wrong.

Therefore, when decoding "deflate" you have to try both, zlib and deflate 
format, like so:

# deflate support
import zlib

def deflate(data):
    # zlib.decompress() expects the zlib format by default, not raw deflate,
    # so try raw deflate first (negative wbits) and fall back to the zlib format.
    try:
        return zlib.decompress(data, -zlib.MAX_WBITS)
    except zlib.error:
        return zlib.decompress(data)
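
To see the two formats side by side, here is a small self-contained check using only the standard `zlib` module. The payload and the helper name `inflate_either` are made up for the example; the helper tries the zlib format first and raw deflate second, which is one possible ordering and not necessarily what httplib2 does.

```python
import zlib

payload = b"hello feed " * 20

# zlib-wrapped stream: 2-byte header plus Adler-32 trailer around deflate data.
zlib_stream = zlib.compress(payload)

# Raw deflate stream: no header or trailer (negative wbits selects this).
compressor = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
raw_stream = compressor.compress(payload) + compressor.flush()


def inflate_either(data):
    """Hypothetical helper: try the zlib format first, then raw deflate."""
    try:
        return zlib.decompress(data)
    except zlib.error:
        return zlib.decompress(data, -zlib.MAX_WBITS)


# A plain zlib.decompress() call rejects the raw stream ...
try:
    zlib.decompress(raw_stream)
    plain_call_failed = False
except zlib.error:
    plain_call_failed = True

# ... while the fallback helper decodes both encodings.
assert plain_call_failed
assert inflate_either(zlib_stream) == payload
assert inflate_either(raw_stream) == payload
```

The plain `zlib.decompress(raw_stream)` call fails with an "Error -3" `zlib.error`, the same error code as in the tracebacks in this thread.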

Original comment by xmi...@gmail.com on 4 Aug 2010 at 12:37

GoogleCodeExporter commented 8 years ago
I also receive this error when trying a jobsearch page on Monster:
       http://jobsearch.monster.co.uk/Search.aspx?q=marketing&cy=uk&lid=193

    response, content = h.request(url)
  File "C:\Python27\lib\site-packages\httplib2\__init__.py", line 1071, in request
    (response, new_content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "C:\Python27\lib\site-packages\httplib2\__init__.py", line 887, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "C:\Python27\lib\site-packages\httplib2\__init__.py", line 873, in _conn_request
    content = _decompressContent(response, content)
  File "C:\Python27\lib\site-packages\httplib2\__init__.py", line 351, in _decompressContent
    content = zlib.decompress(content)
zlib.error: Error -3 while decompressing data: unknown compression method

It does work with a try/except that falls back to:
   response, content = h.request(url, headers={'accept-encoding': 'identity'})

Original comment by simo...@gmail.com on 30 Oct 2010 at 2:24

GoogleCodeExporter commented 8 years ago
Issue 82 has been merged into this issue.

Original comment by joe.gregorio@gmail.com on 15 Feb 2011 at 3:16

GoogleCodeExporter commented 8 years ago
I also faced the same issue while trying out the httplib2 package for Python.
Initially I tried `'accept-encoding':'gzip, deflate'`. I even tried changing the order of `gzip` and `deflate`. It did not work for me. Then I removed `deflate` and it worked like a charm.
So I used `'accept-encoding': 'gzip'`, that is all. This worked!

Original comment by aashish....@gmail.com on 6 Jun 2012 at 12:35