crossbario / crossbar

Crossbar.io - WAMP application router
https://crossbar.io/

Support gzip content encoding in static Web #633

Open oberstet opened 8 years ago

oberstet commented 8 years ago

Support Content-Encoding:gzip in Crossbar.io static Web service.

Consider http://autobahn.s3.amazonaws.com/autobahnjs/latest/autobahn.min.jgz.

This is how it looks when Chromium fetches the file:

Request:
Request URL:http://autobahn.s3.amazonaws.com/autobahnjs/latest/autobahn.min.jgz
Request Method:GET
Status Code:200 OK
Remote Address:54.231.133.17:80

Response Headers:
Accept-Ranges:bytes
Content-Encoding:gzip
Content-Length:49697
Content-Type:text/javascript
Date:Thu, 04 Feb 2016 15:11:46 GMT
ETag:"ee4f7254bd75ddbda779b6c4623aee79"
Last-Modified:Sat, 19 Dec 2015 14:24:07 GMT
Server:AmazonS3
x-amz-id-2:BaRCGM02hVj6n6t9Oc+hDdVh4ZYxqFROCSRNjV6F1hjqjPOXuX0esUoAeLOeqTgMP4Ohkn+fWyg=
x-amz-request-id:9EE43E680A8F9B00

Request Headers
Accept:*/*
Accept-Encoding:gzip, deflate, sdch
Accept-Language:de-DE,de;q=0.8,en-US;q=0.6,en;q=0.4
Cache-Control:no-cache
Connection:keep-alive
Host:autobahn.s3.amazonaws.com
Pragma:no-cache
Referer:http://127.0.0.1:8080/
User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/48.0.2564.82 Chrome/48.0.2564.82 Safari/537.36

And this is how it looks when Chromium tries to fetch the same file from the Crossbar.io static Web service (that is, Twisted Web) and fails to process the response. Note that the response carries no Content-Encoding header:

Request:
Request URL:http://127.0.0.1:8080/autobahn.min.jgz
Request Method:GET
Status Code:200 OK
Remote Address:127.0.0.1:8080

Response Headers
Accept-Ranges:bytes
Cache-Control:max-age=43200, public
Content-Length:49697
Content-Type:text/javascript
Date:Thu, 04 Feb 2016 15:12:39 GMT
Expires:Fri, 05 Feb 2016 03:12:39 GMT
Last-Modified:Sat, 19 Dec 2015 14:24:07 GMT
Server:Crossbar/0.13.0

Request Headers
Accept:*/*
Accept-Encoding:gzip, deflate, sdch
Accept-Language:de-DE,de;q=0.8,en-US;q=0.6,en;q=0.4
Cache-Control:no-cache
Connection:keep-alive
Host:127.0.0.1:8080
Pragma:no-cache
Referer:http://127.0.0.1:8080/
User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/48.0.2564.82 Chrome/48.0.2564.82 Safari/537.36
meejah commented 8 years ago

This sounds relevant: https://twistedmatrix.com/documents/current/web/howto/using-twistedweb.html#request-encoders

However, this is also relevant: https://en.wikipedia.org/wiki/BREACH_%28security_exploit%29 Unfortunately, I don't know that attack well enough to know when it's safe to combine TLS and compression...

oberstet commented 8 years ago

Thanks for the pointers!

Maybe this is actually 2 things:

1) Pre-compressed files (e.g. autobahn.min.jgz, which is minified JS that has been gzipped) should be served with the correct MIME type and the correct Content-Encoding header. That is, the latter header should be set automatically. This isn't the case as of today, which is my original itch ;)

2) Supporting transparent, automatic, on-the-fly compression. This seems related to the https://twistedmatrix.com/documents/current/web/howto/using-twistedweb.html#request-encoders stuff. This would also be nice, but then why compress on each and every request instead of just once? Regardless, it seems different from 1).

meejah commented 8 years ago

Okay, yes, merely setting MIME types etc sounds like a good idea (and not a security problem).

oberstet commented 8 years ago

For 1), the question hence is: is it possible to automatically set Content-Encoding headers in Twisted Web depending on the file extension, i.e. "add a gzip Content-Encoding header whenever the file extension is gz"?
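
To illustrate what an extension-driven header lookup for 1) could look like, here is a minimal sketch using only Python's standard `mimetypes` module. It is not Twisted Web's or Crossbar.io's implementation; the helper names `make_guesser` and `content_headers` are hypothetical, and treating `.jgz` as shorthand for `.js.gz` is an assumption made for this example.

```python
import mimetypes


def make_guesser():
    """Build a MimeTypes instance that understands the .jgz suffix."""
    guesser = mimetypes.MimeTypes()
    # Pin the JavaScript type explicitly so the result does not depend on
    # the Python version's built-in default for ".js".
    guesser.add_type("text/javascript", ".js")
    # Assumption for this sketch: ".jgz" means "gzipped JavaScript".
    guesser.suffix_map[".jgz"] = ".js.gz"
    return guesser


def content_headers(filename, guesser=None):
    """Return the Content-Type / Content-Encoding headers for a file name."""
    guesser = guesser or make_guesser()
    content_type, encoding = guesser.guess_type(filename)
    headers = {"Content-Type": content_type or "application/octet-stream"}
    if encoding:
        # Only pre-compressed files get a Content-Encoding header.
        headers["Content-Encoding"] = encoding
    return headers
```

With this mapping, `autobahn.min.jgz` resolves to the same content type as `autobahn.min.js`, plus a `Content-Encoding: gzip` header, while the plain `.js` file gets no encoding header.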

goeddea commented 8 years ago

Ideally, the static Web server would first try to serve from cache (with configurable cache size). In cache, it would have both the original and compressed versions, and serve the smallest version that the requester indicates compatibility for.
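
The selection step described here ("serve the smallest version that the requester indicates compatibility for") could be sketched roughly as follows. This is a simplified, standard-library-only illustration, not existing Crossbar.io or Twisted code: the function names and the `variants` dictionary are hypothetical, and the parser deliberately ignores the `*` wildcard and always treats `identity` as acceptable.

```python
import gzip


def parse_accept_encoding(header):
    """Return the set of content codings the client accepts (q > 0)."""
    accepted = set()
    for item in header.split(","):
        token, _, params = item.strip().partition(";")
        token = token.strip().lower()
        if not token:
            continue
        quality = 1.0
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q-value as "not accepted"
        if quality > 0:
            accepted.add(token)
    return accepted


def pick_variant(variants, accept_encoding):
    """Pick the smallest cached variant the client can decode.

    `variants` maps a coding name to the body bytes; the key "identity"
    holds the uncompressed original and is always considered acceptable.
    """
    accepted = parse_accept_encoding(accept_encoding) | {"identity"}
    usable = {c: body for c, body in variants.items() if c in accepted}
    coding = min(usable, key=lambda c: len(usable[c]))
    return coding, usable[coding]


# Hypothetical cache entry holding both versions of one file.
original = b"console.log('hello');\n" * 200
variants = {"identity": original, "gzip": gzip.compress(original)}
```

For the `Accept-Encoding: gzip, deflate, sdch` header shown in the requests above, `pick_variant(variants, ...)` would select the gzip variant; a client that only advertises `deflate` would fall back to the uncompressed original.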

@hawkowl - Is this possible using Twisted? If not, what are the missing parts?

oberstet commented 4 years ago

The new archive webservice supports configuring additional MIME types, so that files served from within the ZIP have correct HTTP response headers set.

Eg consider this config item:

{
    "type": "web",
    "endpoint": {
        "type": "tcp",
        "port": 8081,
        "backlog": 1024
    },
    "paths": {
        "/": {
            "type": "static",
            "directory": "../../web",
            "options": {
                "enable_directory_listing": true
            }
        },
        "autobahn": {
            "type": "archive",
            "archive": "autobahn.zip",
            "origin": "https://github.com/crossbario/autobahn-js-browser/archive/master.zip",
            "object_prefix": "autobahn-js-browser-master",
            "default_object": "autobahn.min.js",
            "download": true,
            "cache": true,
            "hashes": [
                "5ef1326e6f0f54e4552b5b5288d4dd2c96ad2e4164cd9e49886fe083fa5d8854"
            ],
            "mime_types": {
                ".min.js": "text/javascript",
                ".jgz": "text/javascript"
            }
        },
...

This correctly serves eg

<html>
    <meta charset="UTF-8">
    <body>
        <script>AUTOBAHN_DEBUG = false;</script>
        <script src="/autobahn/autobahn.min.js"></script>
        <script src="client.js"></script>
        <h1>Crossbar.io HA Demo - AutobahnJS Client</h1>
        <p>Open JavaScript console to watch output.</p>
    </body>
</html>

setting the HTTP response content type header to text/javascript.

However, it fails for /autobahn/autobahn.min.jgz.

The reason is: we need to convince Twisted Web not only to set the above header (which is already done), but, in addition, to compress the HTTP response (deflate)!

I'm not even sure Twisted Web supports that ...

https://en.wikipedia.org/wiki/HTTP_compression

The response must add the header Content-Encoding: gzip and, obviously, the response byte stream must actually be compressed.
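
As a rough illustration of that requirement, the sketch below builds a gzip-encoded response in two ways using only Python's standard library. It is not Twisted Web code, and the function name `gzip_response` is hypothetical. One point it tries to make explicit: a file that is already gzipped on disk (such as a .jgz file) can be sent byte-for-byte as it is, so only content that is not yet compressed needs to be compressed on the fly.

```python
import gzip


def gzip_response(payload, precompressed):
    """Return (headers, body) for a response with Content-Encoding: gzip.

    If `precompressed` is true, `payload` is assumed to already be gzip
    data (e.g. the bytes of a .jgz file) and is passed through unchanged.
    Otherwise the payload is compressed for this response.
    """
    body = payload if precompressed else gzip.compress(payload)
    headers = {
        "Content-Encoding": "gzip",
        # Content-Length must describe the bytes actually sent on the wire.
        "Content-Length": str(len(body)),
        # Tell caches that the response depends on the request's Accept-Encoding.
        "Vary": "Accept-Encoding",
    }
    return headers, body
```

In both cases the client sees the same thing: a `Content-Encoding: gzip` header and a body that decompresses to the original script.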