bottlepy / bottle

bottle.py is a fast and simple micro-framework for python web-applications.
http://bottlepy.org/
MIT License

static_file is serving gzipped files as uncompressed #1350

Open jmlrt opened 2 years ago

jmlrt commented 2 years ago

When static_file is used without a mimetype to serve a gzip-compressed file, bottle seems to decompress it on the fly.

How to reproduce it:

  1. create a gzipped text file:

    $ echo 'Hello, world!' > hello
    $ gzip hello
  2. serve the file with and without mimetype:

    >>> from bottle import route, run, static_file
    >>> @route("/static/<filename:path>")
    ... def send_static(filename):
    ...     return static_file(filename, root=".")
    ...
    >>> @route("/mimetype/<filename:path>")
    ... def send_static_with_mimetype(filename):
    ...     return static_file(filename, root=".", mimetype="application/x-gzip")
    ...
    >>> run(host="localhost", port=8080, debug=True)
    Bottle v0.12.19 server starting up (using WSGIRefServer())...
    Listening on http://localhost:8080/
    Hit Ctrl-C to quit.
  3. in a second shell, download the file using both endpoints and check its type:

    >>> import requests
    >>> import os
    >>> def dl_file(endpoint):
    ...     # Download hello.gz from the given endpoint into a directory of the same name.
    ...     url = f"http://localhost:8080/{endpoint}/hello.gz"
    ...     response = requests.get(url)
    ...     os.makedirs(endpoint, exist_ok=True)
    ...     # Opening with "wb" overwrites any previous download.
    ...     with open(f"{endpoint}/hello.gz", "wb") as f:
    ...         f.write(response.content)
    ...     # Report the detected file type.
    ...     os.system(f"file {endpoint}/hello.gz")
    ...
    >>> dl_file("static")
    static/hello.gz: ASCII text
    >>> dl_file("mimetype")
    mimetype/hello.gz: gzip compressed data, was "hello", last modified: Wed Sep 15 08:29:37 2021, from Unix, truncated

defnull commented 2 years ago

The file is not decompressed by bottle, but if the file-name looks stream-compressed (e.g. foo.txt.gz), bottle sets the Content-Encoding header in addition to Content-Type and most HTTP clients will decompress the body automatically. Not sure if that was a good idea in hindsight. HTTP is a mess sometimes. If you want to serve gzip files as is, you must explicitly set the mimetype parameter to bypass mime-guessing.
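The mime-guessing described above can be reproduced outside of Bottle. The sketch below assumes the guessing is built on Python's standard `mimetypes` module (the thread does not say so explicitly), and `guess_headers` is a hypothetical helper written for illustration, not part of Bottle's API:

```python
import mimetypes


def guess_headers(filename):
    """Map a filename to the headers that mime-guessing would suggest."""
    content_type, content_encoding = mimetypes.guess_type(filename)
    headers = {}
    if content_type:
        headers["Content-Type"] = content_type
    if content_encoding:
        # A ".gz" suffix is reported as an encoding, not as a content type.
        headers["Content-Encoding"] = content_encoding
    return headers


for name in ("hello.gz", "foo.txt.gz", "foo.txt"):
    print(name, guess_headers(name))
```

For a name ending in `.gz`, the guess includes a gzip encoding, which is what ends up in the Content-Encoding header when no explicit mimetype is passed.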

jmlrt commented 2 years ago

Hi @defnull, thanks for the quick answer and explanation.

> bottle sets the Content-Encoding header in addition to Content-Type and most HTTP clients will decompress the body automatically.

Indeed, requests decompresses the file while wget doesn't, which confused me a lot 😄.
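The difference between the two clients comes down to whether the Content-Encoding header is honoured. Here is a minimal model of that, with no real HTTP involved; `body_seen_by_client` and its `auto_decode` flag are hypothetical names used only for this sketch:

```python
import gzip


def body_seen_by_client(raw_body, headers, auto_decode):
    """Return the bytes a client hands to the caller.

    auto_decode=True models a client that honours Content-Encoding;
    auto_decode=False models one that saves the bytes as received.
    """
    if auto_decode and headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(raw_body)
    return raw_body


raw = gzip.compress(b"Hello, world!\n")
headers = {"Content-Encoding": "gzip"}

# Same response, two different results on disk.
decoded = body_seen_by_client(raw, headers, auto_decode=True)
untouched = body_seen_by_client(raw, headers, auto_decode=False)
```

With the header present, the decoding client ends up with plain text and the other one with the original gzip bytes, matching the `file` output reported earlier in the thread.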

> If you want to serve gzip files as is, you must explicitly set the mimetype parameter to bypass mime-guessing.

To serve gzip and non-gzip files from the same route without Content-Encoding, detecting each file's mimetype and passing it to static_file seems fine:

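    import mimetypes

    from bottle import route, run, static_file

    @route("/static/<filename:path>")
    def send_static(filename):
        # guess_type returns a (type, encoding) tuple; only the type is passed on,
        # since an explicit mimetype bypasses bottle's own guessing.
        mimetype, _encoding = mimetypes.guess_type(filename)
        # Fall back to a generic type when nothing can be guessed (e.g. "hello.gz").
        return static_file(filename, root=".", mimetype=mimetype or "application/octet-stream")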

Do you think it could make sense to add a new param to static_file to be able to disable setting the Content-Encoding?

defnull commented 2 years ago

Hmm the more I think about it, the more I'd say this is a bug in Bottle, not just a misguided feature. Not sure how to fix this in a backwards compatible way, though. That's something for the next release.

The idea would be to disable the flawed Content-Encoding logic as it currently exists in Bottle and add a gzip parameter. With gzip="static", bottle would automatically look for and send a *.gz version of the requested file if one exists and the browser supports it. Otherwise, it would serve the normal file as usual. This is what nginx does with the gzip_static module.
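The selection step of that proposal could look roughly like the sketch below. It is not Bottle code: `pick_precompressed` is a hypothetical helper, and the Accept-Encoding parsing is deliberately simplified (quality values such as `q=0` are ignored).

```python
import os


def pick_precompressed(path, accept_encoding):
    """Return (file_to_send, content_encoding) for a requested file.

    Prefers a pre-compressed sibling "<path>.gz" when the client
    advertises gzip support; otherwise falls back to the plain file.
    """
    # Simplified parsing: strip parameters like ";q=0.5" and ignore their values.
    accepted = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    gz_path = path + ".gz"
    if "gzip" in accepted and os.path.isfile(gz_path):
        return gz_path, "gzip"
    return path, None
```

When the helper returns `"gzip"`, the server would send the `.gz` file with a Content-Encoding header while keeping the Content-Type of the original file. A request for a `.gz` file itself would be served as is, without the header.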