zendframework / zend-http

Http component from Zend Framework
BSD 3-Clause "New" or "Revised" License

Incorrect gzip handler with Zend client adapter #89

Open pedigree opened 8 years ago

pedigree commented 8 years ago

I could be wrong, so please let me know if this is "as intended", but it seems a bit strange to me.

Reported from PHP 5.6 with curl, and Zend Framework 3.0.0

If I use this test code to pull content from a site (e.g. example.com):

        $config = ['adapter' => 'Zend\Http\Client\Adapter\Curl',];
        $client = new \Zend\Http\Client($uri, $config);
        $client->send();
        $response = $client->getResponse();
        $this->assertContains("Domain", $response->getContent());

then the client sends the request header

        Accept-Encoding: gzip, deflate

but the assertion fails with

        Failed asserting that Binary String: 0x1f8b08003b81055200038d5441afd ...

If you look at the pcap for the response, the body is the raw gzip stream (it starts at offset 0x01bb):


        0x0030:  1324 208b 4854 5450 2f31 2e31 2032 3030  .$..HTTP/1.1.200
        0x0040:  204f 4b0d 0a43 6f6e 7465 6e74 2d45 6e63  .OK..Content-Enc
        0x0050:  6f64 696e 673a 2067 7a69 700d 0a41 6363  oding:.gzip..Acc
        0x0060:  6570 742d 5261 6e67 6573 3a20 6279 7465  ept-Ranges:.byte
        0x0070:  730d 0a43 6163 6865 2d43 6f6e 7472 6f6c  s..Cache-Control
        0x0080:  3a20 6d61 782d 6167 653d 3630 3438 3030  :.max-age=604800
        0x0090:  0d0a 436f 6e74 656e 742d 5479 7065 3a20  ..Content-Type:.
        0x00a0:  7465 7874 2f68 746d 6c0d 0a44 6174 653a  text/html..Date:
        0x00b0:  2053 756e 2c20 3138 2053 6570 2032 3031  .Sun,.18.Sep.201
        0x00c0:  3620 3139 3a30 343a 3139 2047 4d54 0d0a  6.19:04:19.GMT..
        0x00d0:  4574 6167 3a20 2233 3539 3637 3036 3531  Etag:."359670651
        0x00e0:  2b67 7a69 7022 0d0a 4578 7069 7265 733a  +gzip"..Expires:
        0x00f0:  2053 756e 2c20 3235 2053 6570 2032 3031  .Sun,.25.Sep.201
        0x0100:  3620 3139 3a30 343a 3139 2047 4d54 0d0a  6.19:04:19.GMT..
        0x0110:  4c61 7374 2d4d 6f64 6966 6965 643a 2046  Last-Modified:.F
        0x0120:  7269 2c20 3039 2041 7567 2032 3031 3320  ri,.09.Aug.2013.
        0x0130:  3233 3a35 343a 3335 2047 4d54 0d0a 5365  23:54:35.GMT..Se
        0x0140:  7276 6572 3a20 4543 5320 2865 7772 2f31  rver:.ECS.(ewr/1
        0x0150:  3542 4429 0d0a 5661 7279 3a20 4163 6365  5BD)..Vary:.Acce
        0x0160:  7074 2d45 6e63 6f64 696e 670d 0a58 2d43  pt-Encoding..X-C
        0x0170:  6163 6865 3a20 4849 540d 0a78 2d65 632d  ache:.HIT..x-ec-
        0x0180:  6375 7374 6f6d 2d65 7272 6f72 3a20 310d  custom-error:.1.
        0x0190:  0a43 6f6e 7465 6e74 2d4c 656e 6774 683a  .Content-Length:
        0x01a0:  2036 3036 0d0a 436f 6e6e 6563 7469 6f6e  .606..Connection
        0x01b0:  3a20 636c 6f73 650d 0a0d 0a1f 8b08 003b  :.close........;
        0x01c0:  8105 5200 038d 5441 afd3 300c beef 5798  ..R...TA..0...W.
        0x01d0:  7201 695d f780 0753 d756 2040 e202 1ce0  r.i]...S.V.@....
        0x01e0:  c231 6bdc d55a 9394 24ed 36a1 f7df 71db  .1k..Z..$.6...q.
        0x01f0:  bdae e5ed 402b b58e 1d7f fe6c c749 9e49  ....@+.....l.I.I
        0x0200:  93fb 738d 507a 5565 8be4 f187 4266 0be0  ..s.PzUe....Bf..
        0x0210:  27f1 e42b cc3e 9f84 aa2b 844f 4609 d249  '..+.>...+.OF..I
        0x0220:  3468 17c3 1685 5e40 5e0a ebd0 a741 e38b  4h....^@^....A..
        0x0230:  7013 4094 4d8c a5f7 7588 bf1b 6ad3 e0a3  p.@.M...u...j...
        0x0240:  d11e b50f bbb0 01e4 c32a 0d3c 9e7c d485  .........*.<.|..
        0x0250:  df8e 50b7 90b4 5098 062d e1b1 36d6 4ffc  ..P...P..-..6.O.
        0x0260:  8f24 7d99 4a6c 29c7 b05f 2c81 3479 1255  .$}.Jl).._,.4y.U
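
The two bytes at offset 0x01bb in the dump, `1f 8b`, are the gzip magic number, and the following `08` is the deflate compression method, which is why the assertion sees binary data rather than HTML. A minimal sketch of that observation, using only PHP's zlib extension (the sample string is made up for illustration and is not the real example.com body):

```php
<?php
// Hypothetical sample body; any string works here.
$original = '<html><body>Example Domain</body></html>';

// gzencode() produces a gzip (RFC 1952) stream, the same format a server
// sends with "Content-Encoding: gzip".
$compressed = gzencode($original);

// The first three bytes are the gzip magic number (1f 8b) and the
// compression method (08 = deflate), matching the start of the dump.
$magic = bin2hex(substr($compressed, 0, 3));

// gzdecode() reverses the encoding and gives back the readable body.
$decoded = gzdecode($compressed);

echo $magic, PHP_EOL;   // 1f8b08
echo $decoded, PHP_EOL;
```

An assertion such as `assertContains("Domain", ...)` can only pass against `$decoded`, not against `$compressed`.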

I believe that if you send an Accept-Encoding header that includes gzip and you get gzipped data back, then getContent() should handle it and return the decoded content, without having to override the adapter configuration with:

        $config = [
            'adapter'     => 'Zend\Http\Client\Adapter\Curl',
            'curloptions' => [
                CURLOPT_ENCODING => '',
            ],
        ];
        $client = new \Zend\Http\Client($uri, $config);

thomasvargiu commented 5 years ago

Removing the automatic accept-encoding header would be a BC Break.

You can do this:


```php
$client = new \Zend\Http\Client($uri, $config);
$client->setHeaders([
    'accept-encoding' => 'identity',
]);
```

michalbundyra commented 5 years ago

@pedigree There is no issue imho. We have two methods for getting the response content:

What's wrong with this approach?
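
For context: the list of methods did not survive in this copy of the thread. In zend-http, `Response::getContent()` returns the body as received, while `Response::getBody()` decodes it according to the response headers, so those are presumably the two methods meant. The following is a rough, self-contained sketch of the idea of decoding by `Content-Encoding`. The function `decodeBody()` is a hypothetical helper written for illustration, not the library's implementation:

```php
<?php
/**
 * Hypothetical helper (not part of zend-http): return the body decoded
 * according to the Content-Encoding header, or unchanged if there is none.
 */
function decodeBody(array $headers, $raw)
{
    // Header names are case-insensitive, so normalise the keys first.
    $headers  = array_change_key_case($headers, CASE_LOWER);
    $encoding = isset($headers['content-encoding'])
        ? strtolower(trim($headers['content-encoding']))
        : 'identity';

    switch ($encoding) {
        case 'gzip':
            $decoded = gzdecode($raw);
            break;
        case 'deflate':
            // zlib_decode() auto-detects zlib, gzip and raw deflate streams.
            $decoded = zlib_decode($raw);
            break;
        default:
            // identity or an unknown encoding: hand back the raw bytes.
            return $raw;
    }

    if ($decoded === false) {
        throw new RuntimeException("Could not decode {$encoding} body");
    }

    return $decoded;
}

// Usage with a made-up gzipped body:
$raw = gzencode('Example Domain');
echo decodeBody(['Content-Encoding' => 'gzip'], $raw), PHP_EOL;
```

Keeping the raw and the decoded accessors separate lets callers that need the exact bytes on the wire (for caching or proxying, say) still get them.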

weierophinney commented 4 years ago

This repository has been closed and moved to laminas/laminas-http; a new issue has been opened at https://github.com/laminas/laminas-http/issues/8.