aerogo / aero

:bullettrain_side: High-performance web server for Go (2016). New alpha (2024) with even better performance is currently in development at https://git.akyoto.dev/go/web
MIT License

gzip encoding forced even if request doesn't accept such encodings #2

Closed: lrstanley closed this issue 6 years ago

lrstanley commented 6 years ago

It looks as though when gzip is enabled in the configuration options, it is applied blindly to every response without checking the Accept-Encoding request header.

Since many applications and utilities do not fully implement even the most common compression algorithms, I strongly urge adding a check that only compresses when the client declares it can accept the encoding. If this library uses other compression algorithms, I would recommend checking those too. I do not use this library myself, so I have no first-hand experience with what aero supports.
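To illustrate, here is a rough sketch of the kind of check I have in mind, written against the standard net/http package rather than aero's own API (which, as noted, I have not used), so every name below is just a placeholder:

package main

import (
	"compress/gzip"
	"net/http"
	"strings"
)

// gzipResponseWriter forwards writes through the gzip writer.
type gzipResponseWriter struct {
	http.ResponseWriter
	writer *gzip.Writer
}

func (g *gzipResponseWriter) Write(b []byte) (int, error) {
	return g.writer.Write(b)
}

// gzipIfAccepted compresses the response only when the client
// explicitly lists gzip in its Accept-Encoding request header.
func gzipIfAccepted(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !strings.Contains(r.Header.Get("Accept-Encoding"), "gzip") {
			// The client never asked for gzip: serve the plain response.
			next.ServeHTTP(w, r)
			return
		}
		w.Header().Set("Content-Encoding", "gzip")
		gz := gzip.NewWriter(w)
		defer gz.Close()
		next.ServeHTTP(&gzipResponseWriter{ResponseWriter: w, writer: gz}, r)
	})
}

func main() {
	// Placeholder handler standing in for whatever aero serves.
	hello := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("<!DOCTYPE html><html lang='en'><head><title>hi</title></head></html>"))
	})
	http.ListenAndServe(":8080", gzipIfAccepted(hello))
}

A real implementation would also need to honour quality values (e.g. gzip;q=0) and drop any previously set Content-Length header, but even this naive version never hands compressed bytes to a client that did not ask for them.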

For example, curl does not request compression by default, and will only decompress gzip content when the --compressed flag is specified (at least on my system).

$ curl -q -v localhost
* Rebuilt URL to: localhost:80/
*   Trying ::1...
* Connected to localhost (::1) port 80 (#0)
> GET / HTTP/1.1
> Host: example.com
> User-Agent: curl/7.47.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Cache-Control: must-revalidate
< Content-Encoding: gzip
< Content-Length: 3885
< Content-Type: text/html; charset=utf-8
< Etag: 665b19f6de0a7de
< Referrer-Policy: no-referrer
< X-Content-Type-Options: nosniff
< X-Xss-Protection: 1; mode=block
< Date: Tue, 30 Jan 2018 13:27:01 GMT
< 
[raw gzip bytes: unreadable binary output, truncated]

Using the --compressed curl flag:

[core] [~]# curl -v --compressed localhost
* Rebuilt URL to: localhost:80/
*   Trying ::1...
* Connected to localhost (::1) port 80 (#0)
> GET / HTTP/1.1
> Host: example.com
> User-Agent: curl/7.47.0
> Accept: */*
> Accept-Encoding: deflate, gzip
> 
< HTTP/1.1 200 OK
< Cache-Control: must-revalidate
< Content-Encoding: gzip
< Content-Length: 3885
< Content-Type: text/html; charset=utf-8
< Etag: 665b19f6de0a7de
< Referrer-Policy: no-referrer
< X-Content-Type-Options: nosniff
< X-Xss-Protection: 1; mode=block
< Date: Tue, 30 Jan 2018 13:30:51 GMT
< 
<!DOCTYPE html><html lang='en'><head><title>
[...]

An example of piping the output through gunzip, when curl was never told to request compression:

[core] [~]# curl -v localhost | gunzip       
* Rebuilt URL to: localhost:80/
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying ::1...
* Connected to localhost (::1) port 80 (#0)
> GET / HTTP/1.1
> Host: example.com
> User-Agent: curl/7.47.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Cache-Control: must-revalidate
< Content-Encoding: gzip
< Content-Length: 3885
< Content-Type: text/html; charset=utf-8
< Etag: 665b19f6de0a7de
< Referrer-Policy: no-referrer
< X-Content-Type-Options: nosniff
< X-Xss-Protection: 1; mode=block
< Date: Tue, 30 Jan 2018 13:31:57 GMT
< 
{ [3885 bytes data]
100  3885  100  3885    0     0  4857k      0 --:--:-- --:--:-- --:--:-- 3793k
* Connection #0 to host localhost left intact
<!DOCTYPE html><html lang='en'><head><title>
[...]

As it stands, servers running aero are unlikely to work properly with legacy, older, or more basic clients.

akyoto commented 6 years ago

Yes, this didn't receive much attention when I developed it, because there aren't many HTTP clients without gzip support that I really care about. It will be fixed if the performance overhead of the check is negligible.

lrstanley commented 6 years ago

There is still a long list of popular HTTP clients that will not decompress gzip (or other encodings) when the server applies compression without being asked to.

The cost of the check should be negligible. At minimum, this behaviour should be documented for those who enable the option without knowing that the library handles gzip compression incorrectly. When even curl cannot properly handle a response, that seems like a huge issue. If those are the kinds of HTTP clients you don't "really care about", then this library may not be suited for anything production-related.
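To put a number on "negligible", here is a rough benchmark sketch (not aero's actual code; the package name and header value are only examples) that measures the cost of a single Accept-Encoding lookup:

package gzipcheck

import (
	"net/http"
	"strings"
	"testing"
)

// BenchmarkAcceptEncodingCheck measures a single header lookup plus a
// substring search, which is all the proposed check needs to do.
func BenchmarkAcceptEncodingCheck(b *testing.B) {
	header := http.Header{}
	header.Set("Accept-Encoding", "deflate, gzip")

	for i := 0; i < b.N; i++ {
		if !strings.Contains(header.Get("Accept-Encoding"), "gzip") {
			b.Fatal("expected gzip to be accepted")
		}
	}
}

Run with go test -bench=., this kind of check sits in the tens of nanoseconds per request on typical hardware, far below the cost of the gzip compression itself.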

akyoto commented 6 years ago

I think it's an overstatement to say curl support is "important". However, I can see where you're coming from.

lrstanley commented 6 years ago

I don't think curl support by itself is "important". Rather, there should be no favoritism between clients. curl is one of the most popular CLI-based HTTP clients, especially for anything API-related, and it's one of the first tools many developers reach for against an HTTP endpoint, even more so when troubleshooting (which is how I ended up finding this issue).

If this library is meant to support only specific clients or a very specific purpose (it reminds me of the "this website was built for IE" days), I would recommend updating the README to warn people about this.

akyoto commented 6 years ago

I personally find it unacceptable for clients not to apply gzip decompression automatically when they actually support it. One of the main software quality factors is robustness, and that includes auto-correction. Even if I send "no gzip" to a server and the server responds with gzip anyway, the client should not be stubborn; it should simply decode it (curl supports this via a flag, so why isn't it applied automatically?).
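For what it's worth, the robust behaviour only takes a few lines on the client side. Here is a minimal Go sketch (the URL and client setup are examples, not anyone's real code) that decodes the body whenever the server marks it as gzip, regardless of what was requested:

package main

import (
	"compress/gzip"
	"io"
	"log"
	"net/http"
	"os"
)

func main() {
	// DisableCompression mirrors a basic client that never sends
	// Accept-Encoding, so the transport will not gunzip for us.
	client := &http.Client{Transport: &http.Transport{DisableCompression: true}}

	// Example URL only.
	resp, err := client.Get("http://localhost/")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	var body io.Reader = resp.Body
	// Be tolerant: if the server declares gzip anyway, decode it.
	if resp.Header.Get("Content-Encoding") == "gzip" {
		gz, err := gzip.NewReader(resp.Body)
		if err != nil {
			log.Fatal(err)
		}
		defer gz.Close()
		body = gz
	}

	if _, err := io.Copy(os.Stdout, body); err != nil {
		log.Fatal(err)
	}
}

With DisableCompression set, Go's transport behaves like a basic client that never advertises gzip support, which is exactly the situation this issue is about.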

Nonetheless I see your point. I'll try to fix it once I find some free time.

Thanks for the detailed bug report.

akyoto commented 6 years ago

This is now fixed in 28ffbafb88a643f991ab915df0e685809fd867f9.

Thank you!