inveniosoftware / invenio-files-rest

REST API for uploading/downloading files for Invenio.
https://invenio-files-rest.readthedocs.io
MIT License
9 stars 67 forks source link

views: fix X-Accel should be sent from storage class not from the view #274

Open lnielsen opened 2 years ago

lnielsen commented 2 years ago

PR #245 added support for serving files through NGINX instead (via X-Accel) headers.

The support was added in the view layer instead of the storage class. It should however be storage system dependent. Also, this ensure that it's usable from Invenio-records-resources which has it's own views for file downloading.

In addition, the way it's implemented now, it also by-passes a permission check

Tasks:

  1. Remove these lines.
  2. Add support in the storage class's send_file() similar to the lines that was removed.

It's very important to verify that the file being served out has the correct HTTP headers which send_stream is adding. It's a security issues if they don't.

Please ping @lnielsen before you get started and for a review.

lnielsen commented 2 years ago

Pointed out by Nico: Likely we need a better method than the global factory function since it's needed per storage.

Example from S3: https://github.com/inveniosoftware/invenio-s3/blob/c6aee61cce6389ba56846d5b03c4d714b817e41b/invenio_s3/helpers.py#L19

lnielsen commented 2 years ago

From @max-moser:

  # have nginx serve uploaded files via X-Accel-Redirect via the Invenio-Files-REST
  # "FILES_REST_XSENDFILE_*" config items
  location /uploaded_data {
    internal;
    root /mnt/data-volume;
    gzip off;
  }

of course, /mnt/data-volume (containing the uploaded files) would have to be mounted in the docker container and in invenio.cfg, i added the following:

class TUWStorage(PyFSFileStorage):
    def send_file(self, filename, **kwargs):
        if current_app.config["FILES_REST_XSENDFILE_ENABLED"]:
            response = make_response()
            redirect_url_base = "/uploaded_data"
            redirect_url_key = self.fileurl[len("/mnt/data-volume/uploaded_data/") :]
            response.headers[
                "X-Accel-Redirect"
            ] = f"{redirect_url_base}/{redirect_url_key}"
            response.headers["X-Accel-Buffering"] = "yes"
            response.headers["X-Accel-Limit-Rate"] = "off"

            mimetype = mimetypes.guess_type(filename)[0]
            if mimetype is not None:
                mimetype = sanitize_mimetype(mimetype, filename=filename)

            if mimetype is None:
                mimetype = "application/octet-stream"

            response.mimetype = mimetype

            # make the browser use the file download dialogue for the file
            response.headers[
                "Content-Disposition"
            ] = f'attachment; filename="{filename}"'

            # set some security-related headers for the download
            response.headers["Content-Security-Policy"] = "default-src 'none';"
            response.headers["X-Content-Type-Options"] = "nosniff"
            response.headers["X-Download-Options"] = "noopen"
            response.headers["X-Permitted-Cross-Domain-Policies"] = "none"
            response.headers["X-Frame-Options"] = "deny"
            response.headers["X-XSS-Protection"] = "1; mode=block"

            # set the host header, because we require it in our nginx configuration
            response.headers["Host"] = hostname
            return response
        return super().send_file(filename, **kwargs)

def storage_factory(**kwargs):
    return pyfs_storage_factory(filestorage_class=TUWStorage, **kwargs)

FILES_REST_XSENDFILE_ENABLED = True
FILES_REST_STORAGE_FACTORY = storage_factory