Closed outoftime closed 8 years ago
I have a similar problem while working with https://www.googleapis.com/
Is it possible to disable compression support in http-em-client?
@outoftime you're advertising "compressed" encoding, which em-http does not support and hence fails to decode when it receives it. Update your header to "Accept-Encoding"=>"gzip" and I believe that should fix it.
@ayanko yes, you can pass in :decoding => false to disable auto-decoding. See: https://github.com/igrigorik/em-http-request/blob/7b8d47c40191758f7df565625703727982fcd635/spec/client_spec.rb#L434
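With auto-decoding disabled, the body should arrive as the raw compressed bytes, so decompression becomes the caller's job. Here is a minimal sketch of that step using only Ruby's standard library. `gunzip_body` is a hypothetical helper name, and the em-http-request call shown in the comment is an assumption about how it would be wired in, not something run here.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: decompress a complete gzip body in one go.
# Assumes the whole body has been buffered before it is called.
def gunzip_body(raw)
  Zlib::GzipReader.new(StringIO.new(raw)).read
end

# Assumed usage with em-http-request (not executed in this sketch):
#   http = EventMachine::HttpRequest.new(url).get(
#     :decoding => false, :head => { 'Accept-Encoding' => 'gzip' })
#   http.callback { puts gunzip_body(http.response) }

# Stand-in for a raw body received with :decoding => false.
raw = Zlib.gzip('Not Found')
puts gunzip_body(raw)
```

Buffering the whole body first sidesteps any question of how the bytes were chunked on the wire, at the cost of holding the compressed response in memory.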
Thx!
So line 69 of the faraday adapter should be corrected at least:
options[:head]['accept-encoding'] = 'gzip'
Closing, please feel free to reopen if there is more to do here. It'd be nice to figure out why we get the error, but we need a reproducible test case.
This should be reopened. I tried the code snippet in the first comment with a single encoding at a time ("Accept-Encoding"=>"gzip", "Accept-Encoding"=>"compressed" and "Accept-Encoding"=>"deflate"), and there's definitely a problem with gzip.
The error is not always the same: Zlib::DataError: invalid code lengths set, Zlib::DataError: invalid stored block lengths, Zlib::DataError: invalid distance too far back, etc.
Also, em-http-request does support the compressed encoding.
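Errors of this family are what a raw inflater reports when it starts reading at the wrong byte offset. The sketch below is only an illustration of that failure mode, not a confirmed diagnosis of this bug. It uses a hand-written raw deflate encoding of "Not Found" and a hypothetical `raw_inflate` helper, then prepends two stray zero bytes of the kind that end a gzip header.

```ruby
require 'zlib'

# Raw deflate encoding of "Not Found" (no gzip header or trailer).
DEFLATE = "\xF3\xCB/Qp\xCB/\xCDK\x01\x00".b

# Hypothetical helper: inflate a raw deflate stream (negative window bits
# tell zlib not to expect a zlib or gzip wrapper).
def raw_inflate(bytes)
  inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
  inflater.inflate(bytes)
ensure
  inflater.close if inflater
end

# Correctly aligned: the inflater starts on the first deflate byte.
puts raw_inflate(DEFLATE)

# Misaligned: two leftover header bytes are handed to the inflater
# as if they were compressed data.
begin
  raw_inflate("\x00\x00".b + DEFLATE)
rescue Zlib::DataError => e
  puts "#{e.class}: #{e.message}"
end
```

The leading zero byte is read as the start of a stored block, and the bytes that follow do not form a valid length pair, so zlib should reject the stream with one of the Zlib::DataError messages listed above. Which message appears depends on the bytes that happen to sit at the misaligned position, which would be consistent with the error varying from request to request.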
I have encountered the same problem.
Echoing the concerns above. In my case, no additional headers are needed to see this behavior. Just a standard request to https://www.google.com/ triggers this problem with the error Zlib::DataError: invalid block type
Hi! 👋
I also have this issue, and used git bisect to find that the problem was introduced with this commit (https://github.com/igrigorik/em-http-request/commit/6270932b9bbcd88f2ffd8472be26f79117779695).
Here's my reproduction script (based on @outoftime's):
require 'eventmachine'
require_relative 'lib/em-http-request'

100.times do
  EventMachine.run do
    request = EventMachine::HttpRequest.new('https://www.googleapis.com/')
      .get(head: { 'Accept-Encoding' => 'gzip' })
    request.callback { EventMachine.stop }
    request.errback do
      exit! if request.error == 'Content-decoder error'
      EventMachine.stop
    end
  end
end
Here's my git bisect command:
$ git bisect start HEAD v1.0.2
$ git bisect run $SHELL -c "rm -rf vendor .bundle Gemfile.lock && bundle install --without development && bundle exec ruby elgoog.rb"
Here's the output:
6270932b9bbcd88f2ffd8472be26f79117779695 is the first bad commit
commit 6270932b9bbcd88f2ffd8472be26f79117779695
Author: Martin Ottenwaelter <martin.ottenwaelter@gmail.com>
Date:   Fri Nov 16 19:05:36 2012 +0100

    manually extract the stream from gzip files to fix #204

:040000 040000 8e43d196aaac13327a39571a9564bc9925b7d153 ac5ee32f99b6cbe71ff2d3b540bf913eef483e6b M lib
:040000 040000 331b78ef3247f15ae362596dc40e877c8964a4fe a3c8ec13c1fb77aefa7dfcbab48f2af5019e7994 M spec
Hope that helps! I'll do some more investigation myself - but thought sharing my findings may help somebody else to diagnose the problem further.
I've investigated this a little further and found that it seems to be related to the way that the stream of bytes is delivered to the gzip decoder.
To find this I captured the chunks of data from the gzipped response body of a GET request to https://www.googleapis.com/ ("Not Found") in both the successful and error cases.
Here's a script that demonstrates the issue:
require 'eventmachine'
require_relative 'lib/em-http/decoders'
GOOD = ["\x1F", "\x8B", "\b", "\x00", "\x00", "\x00", "\x00", "\x00", "\x00", "\x00", "\xF3", "\xCB", "/", "Q", "p\xCB/\xCDK\x01\x00M\x8Ct\xB1\t\x00\x00\x00"]
BAD = ["\x1F", "\x8B", "\b", "\x00", "\x00", "\x00", "\x00", "\x00\x00\x00\xF3\xCB/Qp\xCB/\xCDK\x01\x00M\x8Ct\xB1\t\x00\x00\x00"]
def decode(stream)
  decoder = EventMachine::HttpDecoders::GZip.new {}
  stream.each { |bytes| decoder << bytes }
end
puts 'Good stream - will succeed'
decode GOOD
puts 'Bad stream - will error'
decode BAD
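For comparison, here is a minimal sketch of a decoder that should not care where the chunk boundaries fall. It uses only Ruby's standard library and lets zlib parse the gzip header itself, by adding 16 to the window bits. `stream_gunzip` is a hypothetical helper, not part of em-http-request, and the sample body is generated locally rather than taken from the capture above.

```ruby
require 'zlib'

# Hypothetical helper: feed a gzip stream to zlib chunk by chunk.
# MAX_WBITS + 16 asks zlib to handle the gzip header and trailer,
# so no manual header parsing is needed.
def stream_gunzip(chunks)
  inflater = Zlib::Inflate.new(Zlib::MAX_WBITS + 16)
  out = String.new
  chunks.each { |chunk| out << inflater.inflate(chunk) }
  out
ensure
  inflater.close if inflater
end

body = Zlib.gzip('Not Found')

# Mimic the BAD chunking: seven single-byte chunks, then the last three
# header bytes delivered together with the compressed data.
bad_like = body[0, 7].chars + [body[7..-1]]

puts stream_gunzip(bad_like)    # header ends in the middle of a chunk
puts stream_gunzip(body.chars)  # one byte per chunk
puts stream_gunzip([body])      # everything in a single chunk
```

All three chunkings should print the same decoded body. If so, that suggests the failure lies in how the header is separated from the data across chunk boundaries rather than in zlib itself.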
We're observing an intermittent error (~40% of requests) when making a request to https://www.google.com with compressed responses accepted. Here's a reduction:
Note that the User-Agent header is needed to reproduce the problem; without it, the error is not in evidence. When making requests to Google with curl, using the same headers, I haven't observed the problem, so I think the problem is specific to em-http-request or one of its dependencies.