apache / trafficserver

Apache Traffic Server™ is a fast, scalable and extensible HTTP/1.1 and HTTP/2 compliant caching proxy server.
https://trafficserver.apache.org/
Apache License 2.0

Expected behaviour of the Compress Plugin #8720

Open heavygale opened 2 years ago

heavygale commented 2 years ago

According to our understanding of the documentation, the compress plugin should, without exception, deliver responses whose content type is configured as compressible in compressed form to browsers that send the corresponding request header (Accept-Encoding: gzip).

The access.log shows that this is not the case: individual resources are delivered uncompressed. For this evaluation, we added the following fields to the log format and found that about 3% of requests for URLs ending in .js are currently delivered uncompressed to browsers that accept gzip: %<{Accept-Encoding}cqh> %<{Content-Encoding}psh> %<{Content-Type}psh>
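A minimal sketch of how such a share could be computed from the access.log, assuming the three fields are written at the end of each line as `[Accept-Encoding|Content-Encoding|Content-Type]` (the layout used in the log format quoted later in this thread). The function names `classify` and `uncompressed_share` are hypothetical helpers, not part of Traffic Server.

```python
import re

# Trailing "[Accept-Encoding|Content-Encoding|Content-Type]" field.
# Assumption: the three custom fields are the last thing on each log line.
TRAILER = re.compile(r"\[([^|\[\]]*)\|([^|\]]*)\|([^\]]*)\]\s*$")


def classify(line):
    """Return 'compressed' or 'uncompressed' for clients that accept gzip.

    Returns None when the line has no trailer or the client did not
    offer gzip in Accept-Encoding.
    """
    match = TRAILER.search(line)
    if match is None:
        return None
    accept, content_encoding, _content_type = (p.strip() for p in match.groups())
    # Accept-Encoding may look like "gzip, deflate, br" or "gzip;q=0.8".
    offered = [tok.split(";")[0].strip().lower() for tok in accept.split(",")]
    if "gzip" not in offered:
        return None
    return "compressed" if content_encoding.lower() == "gzip" else "uncompressed"


def uncompressed_share(lines):
    """Fraction of gzip-capable requests that were answered uncompressed."""
    counts = {"compressed": 0, "uncompressed": 0}
    for line in lines:
        kind = classify(line)
        if kind is not None:
            counts[kind] += 1
    total = counts["compressed"] + counts["uncompressed"]
    return counts["uncompressed"] / total if total else 0.0
```

To run it over a real log, pass an open file object, for example `uncompressed_share(open("access.log", encoding="utf-8"))`; the file name here is only an example.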

According to our compress.config, all .js resources should be compressed:

compressible-content-type application/javascript*
compressible-content-type application/x-javascript*
compressible-content-type application/json*
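The trailing `*` in these patterns matters because the logged Content-Type carries a charset suffix (`application/javascript; charset=UTF-8`). The sketch below illustrates shell-style glob matching against such a header using Python's `fnmatch` module. It assumes the patterns behave like ordinary globs; it is not the plugin's own matching code, and `is_compressible` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Patterns copied from the compress.config lines above.
PATTERNS = [
    "application/javascript*",
    "application/x-javascript*",
    "application/json*",
]


def is_compressible(content_type, patterns=PATTERNS):
    """True if the Content-Type value matches any glob pattern.

    Assumption: plain case-sensitive glob semantics.
    """
    return any(fnmatchcase(content_type, pattern) for pattern in patterns)


print(is_compressible("application/javascript; charset=UTF-8"))
print(is_compressible("text/html"))
```

Under these assumptions the Content-Type seen in the log matches the first pattern, so a pattern mismatch alone would not explain the uncompressed responses.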

We also noticed that some resources are sent to the requesting browser sometimes compressed and sometimes uncompressed.

=> What is the expected behavior of the compress plugin? Should every compressible resource be delivered only in compressed form whenever the browser accepts compression? Or does the plugin work on a best-effort basis, so that individual resources can be expected to be delivered uncompressed?

We are currently using version 8.1.3 of Apache Traffic Server on RHEL 7.9.

bryancall commented 2 years ago

The plugin should compress if the client is able to accept the type of encoding. Can you post an example log line? Please remove the client IP from the log.

heavygale commented 2 years ago

The following data were collected from about 30 minutes of access.log (~500k requests), filtered for vendor.js, a file that was delivered both compressed and uncompressed during this period:
uncompressed: grep '\[gzip|-'

     37 TCP_REFRESH_MISS/200 http://❎❎❎:80/❎❎❎/vendor.js

Example log lines:

2022-03-18 12:24:00 44 ❎❎❎:32546 TCP_REFRESH_MISS/200 1650604 GET http://❎❎❎:80/❎❎❎/vendor.js - DIRECT application/javascript - Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0 [gzip|-|application/javascript; charset=UTF-8]
2022-03-18 12:24:06 24 ❎❎❎:32558 TCP_REFRESH_MISS/200 1650605 GET http://❎❎❎:80/❎❎❎/vendor.js - DIRECT application/javascript - Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0 [gzip|-|application/javascript; charset=UTF-8]
2022-03-18 12:24:33 117 ❎❎❎:57040 TCP_REFRESH_MISS/200 1650604 GET http://❎❎❎:80/❎❎❎/vendor.js - DIRECT application/javascript - Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0 [gzip|-|application/javascript; charset=UTF-8]


compressed: grep '\[gzip|gzip'

    190 TCP_MISS/200 http://❎❎❎:80/❎❎❎/vendor.js
     29 TCP_REFRESH_MISS/200 http://❎❎❎:80/❎❎❎/vendor.js
      3 ERR_CLIENT_READ_ERROR/200 http://❎❎❎:80/❎❎❎/vendor.js
      2 TCP_MEM_HIT/200 http://❎❎❎:80/❎❎❎/vendor.js

Example log lines:

2022-03-18 12:22:52 169 ❎❎❎:50823 TCP_MISS/200 482453 GET http://❎❎❎:80/❎❎❎/vendor.js - DIRECT application/javascript cookieConsent=❎❎❎ Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:98.0) Gecko/20100101 Firefox/98.0 [gzip|gzip|application/javascript; charset=UTF-8]
2022-03-18 12:22:53 132 ❎❎❎:39228 TCP_MISS/200 482573 GET http://❎❎❎:80/❎❎❎/vendor.js - DIRECT application/javascript cookieConsent=❎❎❎ Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:98.0) Gecko/20100101 Firefox/98.0 [gzip|gzip|application/javascript; charset=UTF-8]
2022-03-18 12:22:54 146 ❎❎❎:65154 TCP_MISS/200 482411 GET http://❎❎❎:80/❎❎❎/vendor.js - DIRECT application/javascript cookieConsent=❎❎❎; cookieinfo=❎❎❎ Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:98.0) Gecko/20100101 Firefox/98.0 [gzip|gzip|application/javascript; charset=UTF-8]

Compressed log lines with TCP_REFRESH_MISS, the same cache result seen in the uncompressed case:

2022-03-18 12:24:01 125 ❎❎❎:32532 TCP_REFRESH_MISS/200 482281 GET http://❎❎❎:80/❎❎❎/vendor.js - DIRECT application/javascript - Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0 [gzip|gzip|application/javascript; charset=UTF-8]
2022-03-18 12:24:29 96 ❎❎❎:55314 TCP_REFRESH_MISS/200 482449 GET http://❎❎❎:80/❎❎❎/vendor.js - DIRECT application/javascript - Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0 [gzip|gzip|application/javascript; charset=UTF-8]
2022-03-18 12:24:46 105 ❎❎❎:12734 TCP_REFRESH_MISS/200 482392 GET http://❎❎❎:80/❎❎❎/vendor.js - DIRECT application/javascript - Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Firefox/91.0 [gzip|gzip|application/javascript; charset=UTF-8]
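The per-result counts above can also be produced in one pass instead of two separate greps. The following is a rough sketch that groups lines by cache result code and response Content-Encoding; it assumes the log layout shown in the example lines (`<crc>/<status> <size>` somewhere in the line, the bracketed trailer at the end). `tally` is a hypothetical helper name.

```python
import re
from collections import Counter

# Cache result, status code and response size, e.g. "TCP_REFRESH_MISS/200 1650604 ".
RESULT = re.compile(r"\b((?:TCP|ERR)_[A-Z_]+)/(\d{3})\s+(\d+)\s")
# Trailing "[Accept-Encoding|Content-Encoding|Content-Type]"; group 1 is Content-Encoding.
TRAILER = re.compile(r"\[[^|\[\]]*\|([^|\]]*)\|[^\]]*\]\s*$")


def tally(lines):
    """Count log lines per (cache result, response Content-Encoding)."""
    counts = Counter()
    for line in lines:
        result = RESULT.search(line)
        trailer = TRAILER.search(line)
        if result is None or trailer is None:
            continue  # skip lines that do not follow the assumed layout
        counts[(result.group(1), trailer.group(1).strip())] += 1
    return counts


def report(lines):
    """Print the tally, most frequent combination first."""
    for (cache_result, encoding), n in tally(lines).most_common():
        print(f"{n:7d} {cache_result} content-encoding={encoding}")
```

In the example lines quoted here, the uncompressed responses are about 1.65 MB and the compressed ones about 482 kB, so the response size field can serve as a cross-check on the Content-Encoding column.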


Other files are also affected; vendor.js only serves as an example.

Common Log Format is set as follows: '%<cqtd> %<cqtt> %<ttms> %<{X-Forwarded-For}cqh>:%<{Remoteport}cqh> %<crc>/%<pssc> %<psql> %<cqhm> %<cquc> %<caun> %<phr> %<psct> %<{Cookie}cqh> %<{User-Agent}cqh> [%<{Accept-Encoding}cqh>|%<{Content-Encoding}psh>|%<{Content-Type}psh>]'

Confidential information has been replaced by ❎❎❎.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. Marking it stale to flag it for further consideration by the community.