litespeedtech / openlitespeed

Our high-performance, lightweight, open source HTTP server
https://openlitespeed.org
GNU General Public License v3.0
1.16k stars 189 forks

Gzip encoding enforced for proxy #265

Open qmorek opened 3 years ago

qmorek commented 3 years ago

I have an OLS server serving static content, with a Python server behind it for dynamic content.

httpd_config.conf:

extprocessor python-backend {
  type                    proxy
  address                 localhost:5000
  maxConns                2000
  initTimeout             10
  retryTimeout            2
  respBuffer              0
}

virtual_host.conf:

  context /proxied {
    type                  proxy
    handler               python-backend
    addDefaultCharset     off
  }

When I fetch content without an Accept-Encoding header, there is no compression:

curl -iv https://server_host -o curl_out 2>&1 | grep -i encoding

That is fine. Setting the header with -H "Accept-Encoding: br":

curl -iv https://server_host -o curl_out -H "Accept-Encoding: br" 2>&1 | grep encoding
> accept-encoding: br
< content-encoding: br

So Brotli works. For the proxied context there is also no Content-Encoding when the header is absent, but with the header:

curl -iv https://server_host/proxied -o curl_out -H "Accept-Encoding: br" 2>&1 | grep encoding
> accept-encoding: br
< content-encoding: gzip

So the response is encoded with "gzip" instead of "br".

Could you please check what is going on here?

Checked on OpenLiteSpeed 1.7.13 version.

qmorek commented 2 years ago

@litespeedtech, can anyone have a look at this?

qmorek commented 2 years ago

It looks like OLS supports Brotli only for static files, so I have added Brotli compression middleware to my backend: if the client's 'Accept-Encoding' includes 'br', the backend compresses the response and returns it with 'Content-Encoding: br'.
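A minimal sketch of such backend middleware, assuming a WSGI application (the issue does not say which Python framework is used). The names `CompressionMiddleware`, `accepts` and `hello_app` are hypothetical. The compressor is passed in as a function: for Brotli it could be `brotli.compress` from the third-party `brotli` package, while the demo below uses the standard library's `gzip.compress` so that it runs without extra dependencies.

```python
import gzip


def accepts(header, coding):
    """Return True if the Accept-Encoding value lists `coding` with q > 0."""
    for item in header.split(","):
        token, _, params = item.strip().partition(";")
        if token.strip().lower() != coding:
            continue
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0  # q=0 means "explicitly refused"
            except ValueError:
                return False
        return True
    return False


class CompressionMiddleware:
    """Compress the whole response body when the client accepts `coding`.

    Simplification: the body is buffered in memory and the legacy WSGI
    write() callable is not supported.
    """

    def __init__(self, app, coding, compress):
        self.app = app
        self.coding = coding
        self.compress = compress

    def __call__(self, environ, start_response):
        if not accepts(environ.get("HTTP_ACCEPT_ENCODING", ""), self.coding):
            return self.app(environ, start_response)  # pass through untouched

        captured = {}

        def capture(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = headers

        body = b"".join(self.app(environ, capture))
        payload = self.compress(body)
        # Drop headers that no longer describe the compressed payload.
        headers = [
            (name, value)
            for name, value in captured["headers"]
            if name.lower() not in ("content-length", "content-encoding")
        ]
        headers += [
            ("Content-Encoding", self.coding),
            ("Content-Length", str(len(payload))),
            ("Vary", "Accept-Encoding"),
        ]
        start_response(captured["status"], headers)
        return [payload]


def hello_app(environ, start_response):
    """Tiny demo application with a compressible body."""
    body = b"hello " * 100
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


# Demo with gzip; for Brotli one would pass ("br", brotli.compress) instead.
wrapped = CompressionMiddleware(hello_app, "gzip", gzip.compress)
```

Because the middleware only looks at the 'Accept-Encoding' value it receives, any rewrite of that header by the proxy in front of it changes which branch is taken.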

This works fine for my purposes, but one issue remains. For some reason OLS forces gzip if the 'Accept-Encoding' header is longer than 4 characters: https://github.com/litespeedtech/openlitespeed/blob/master/src/extensions/proxy/proxyconn.cpp#L258 So if I send 'br' alone, Brotli works fine, but in browsers, which send 'gzip, deflate, br' by default, Brotli doesn't work because OLS rewrites the header to 'gzip'. The reason behind this is unclear to me. This header is set by the client to state which encodings the CLIENT supports, so the server should not modify it. If the server does not support the encodings offered by the client, it can respond with 406 or 415.

@litespeedtech, can someone please check it?

qmorek commented 2 years ago

Another question is why 'Accept-Encoding' is set to a default value when it is not provided: https://github.com/litespeedtech/openlitespeed/blob/master/src/extensions/proxy/proxyconn.cpp#L265 In that case the proxied backend receives an 'Accept-Encoding: gzip' header even though the client did not ask for it.
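The two rewrites described in this comment and the previous one can be modelled roughly as below. This is a Python model of the behaviour as reported in this thread, not a translation of the C++ in proxyconn.cpp; the function name `forwarded_accept_encoding` is hypothetical.

```python
from typing import Optional


def forwarded_accept_encoding(client_value: Optional[str]) -> str:
    """Model of the Accept-Encoding value the backend sees, as reported here.

    Assumption: based only on the observations in this issue, not on the
    actual OLS source code.
    """
    if client_value is None:
        # Header absent: the backend still receives a default of "gzip".
        return "gzip"
    if len(client_value) > 4:
        # Anything longer than 4 characters is reported to be rewritten.
        return "gzip"
    # Short values such as "br" or "gzip" are passed through unchanged.
    return client_value


for value in (None, "br", "gzip", "gzip, deflate, br"):
    print(repr(value), "->", forwarded_accept_encoding(value))
```

Under this model a browser sending 'gzip, deflate, br' can never reach a Brotli-capable backend with 'br' intact, which matches what the middleware observes.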

I have an endpoint that returns the headers it received, so when executing:

curl -v http://localhost/mirror/ok | jq | grep encoding

I get:

* Connected to localhost (127.0.0.1) port 80 (#0)
< content-type: application/json
< vary: Accept-Encoding
< content-length: 462
< date: Wed, 14 Sep 2022 16:42:56 GMT
< server: LiteSpeed
< x-robots-tag: noindex
< connection: Keep-Alive
<
{ [462 bytes data]
100   462  100   462    0     0  51333      0 --:--:-- --:--:-- --:--:-- 51333
* Connection #0 to host localhost left intact
      "accept-encoding": "gzip", 

So the backend received an 'Accept-Encoding: gzip' header, and since my middleware also compresses with gzip, it returns a compressed result to OLS. But OLS does not return compressed content to curl. Does that mean it decompresses the proxied backend's response in this case?
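For reference, a header-echo endpoint like the one used above could be sketched as follows. The issue does not show the real endpoint's code, so this assumes a plain WSGI application; `mirror_app` is a hypothetical name.

```python
import json


def mirror_app(environ, start_response):
    """Return the received request headers as a JSON object."""
    # WSGI exposes request headers as HTTP_* keys, e.g. HTTP_ACCEPT_ENCODING.
    headers = {
        key[5:].replace("_", "-").lower(): value
        for key, value in environ.items()
        if key.startswith("HTTP_")
    }
    body = json.dumps(headers, indent=2).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(body)))])
    return [body]
```

It could be served on the proxied address with `wsgiref.simple_server.make_server("localhost", 5000, mirror_app).serve_forever()`, so that whatever 'Accept-Encoding' value the proxy forwards shows up in the JSON output.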

litespeedtech commented 2 years ago

It is for saving bandwidth, faster transfer from backend and more. OLS will decompress automatically.

litespeedtech commented 2 years ago

Yeah, there is some extra work needed to support both br and gzip.

qmorek commented 2 years ago

Hi @litespeedtech, thank you for your answer. Is it somewhere on the roadmap?

I get the idea of compressing traffic between the server and the proxied backend when the client did not request compression. In my case the backend is on the same machine, so I am not sure the compression/decompression effort is worth it. Maybe it could be configurable?

I don't get the idea of overwriting the header when it is longer than 4 characters. If I pass just 'br' in 'Accept-Encoding':

curl -v 'https://server.com/mirror/ok' -H 'accept: application/json' -H 'accept-encoding: br' | brotli -d - | jq | grep encoding
* Connected to server.com port 443 (#0)
* ALPN, offering http/1.1
TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
> GET /mirror/ok HTTP/1.1
> Host: server.com
> User-Agent: curl/7.71.1-DEV
> accept: application/json
> accept-encoding: br
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-type: application/json
< content-encoding: br
< vary: Accept-Encoding
< content-length: 312
< date: Fri, 16 Sep 2022 12:13:51 GMT
< server: LiteSpeed
< x-robots-tag: noindex
< connection: Keep-Alive
<
{ [312 bytes data]
100   312  100   312    0     0    525      0 --:--:-- --:--:-- --:--:--   525
* Connection #0 to host server.com left intact
      "accept-encoding": "br",

So OLS just passes the Brotli-compressed binary content through to the client (I suppose the same happens when the backend returns a gzipped response). In my opinion, the same would happen for any other 'Accept-Encoding' header longer than 4 characters. If adding Brotli for dynamic content is not a high priority for you, would it at least be possible to get rid of this header overwrite?

Best Regards