jazzband / django-downloadview

Serve files with Django.
https://django-downloadview.readthedocs.io

mod_xsendfile serving files with url-encoded names #85

Open AdrianLC opened 10 years ago

AdrianLC commented 10 years ago

Hello,

This is my first time using both xsendfile and downloadview so I might be doing something wrong. I apologize if that is the case.

I'm trying to serve files whose names contain non-ASCII characters, and the file names come out all messed up.

My httpd.conf:

    Alias /priv/ /var/www/wsgi/site/priv/
    <Directory /var/www/wsgi/site/priv>
        Require all denied
        XSendFile On
        XSendFilePath /var/www/wsgi/site/priv/jobfiles
    </Directory>

settings.py:

    MIDDLEWARE_CLASSES += ('django_downloadview.SmartDownloadMiddleware', )
    DOWNLOADVIEW_BACKEND = 'django_downloadview.apache.XSendfileMiddleware'
    DOWNLOADVIEW_RULES = [
        {
            'source_url': '/priv/',
            'destination_dir': "/var/www/wsgi/site/priv",
        },
    ]

The view:

    class DownloadJobResultsView(ObjectDownloadView):
        model = Job
        file_field = 'zipped_results'

I've been reading the changelogs of both downloadview and mod_xsendfile, and it appears that downloadview might have been upgraded too soon?

Below is the latest release of xsendfile, which is still in beta and not available in the Ubuntu or Fedora repositories.

Version 1.0
    Unescape/url-decode header value to support non-ascii file names
    XSendFileUnescape setting, to support legacy applications
    X-SENDFILE-TEMPORARY header and corresponding AllowFileDelete flag
    Fix: Actually look into the backend-provided headers for Last-Modified

So, if I understand correctly, previous versions don't url-decode the file names...

EDIT: Just tried with xsendfile 1.0 built from source and still have the problem so... I have no idea of what is happening...

Response headers:

(Status-Line)       HTTP/1.1 200 OK
Date                Thu, 22 May 2014 20:16:48 GMT
Server              Apache/2.4.9 (Fedora) PHP/5.5.12 mod_wsgi/3.4 Python/2.7.5
Content-Disposition attachment; filename=%5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip; filename*=UTF-8''%255Bpbs%252Bssh%2540example.com%253A22%255D-%255B2067%255D.zip
Content-Language    en
Vary                Accept-Language,Cookie
X-Frame-Options     SAMEORIGIN
X-Sendfile          /var/www/wsgi/site/priv/jobfiles/results/%5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip
Content-Length      0
Keep-Alive          timeout=5, max=98
Connection          Keep-Alive
Content-Type        application/zip; charset=utf-8

Do you know how I could fix this?

Many thanks, Adrian

benoitbryon commented 10 years ago

This is my first time using both xsendfile and downloadview so I might be doing something wrong. I apologize if that is the case.

And I am not an xsendfile expert... But let's try to find what's going wrong ;)

If I understood correctly, in your client, you request a download of the [pbs+ssh@example.com:22]-[2067].zip file. What is the filename shown in your client? %5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip?

Just to be sure, does your setup work fine with full-ascii filenames?

It looks like the anomaly is in the Content-Disposition response header: attachment; filename=%5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip; filename*=UTF-8''%255Bpbs%252Bssh%2540example.com%253A22%255D-%255B2067%255D.zip. The filename value seems to be URL-encoded, and the filename*=UTF-8'' value seems to be URL-encoded twice. I think filename should contain only ASCII and filename*=UTF-8'' should be URL-encoded once, because that is the behaviour of django_downloadview.response.content_disposition(filename).

That said, at the moment I do not know what is the cause of the anomaly above... I think we could check 2 things:

(note: I cannot promise I will investigate today...)

benoitbryon commented 10 years ago

>>> from django_downloadview import response
>>> response.content_disposition('[pbs+ssh@example.com:22]-[2067].zip')
"attachment; filename=[pbs+ssh@example.com:22]-[2067].zip; filename*=UTF-8''%5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip"

If django_downloadview is given a filename that is not URL-encoded, it should return the Content-Disposition shown above.
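
To illustrate how the header reported earlier could arise, here is a minimal sketch using only the standard library. The function sketch_content_disposition is a hypothetical stand-in that mimics the shape of the output quoted above; it is not django_downloadview's actual implementation. It assumes the filename was already URL-encoded somewhere upstream before reaching the header builder.

```python
from urllib.parse import quote


def sketch_content_disposition(filename):
    """Hypothetical stand-in: mimics only the header shape quoted in this thread."""
    return "attachment; filename={0}; filename*=UTF-8''{1}".format(
        filename, quote(filename)
    )


raw_name = "[pbs+ssh@example.com:22]-[2067].zip"
# Assumption: something upstream has already URL-encoded the name once.
encoded_name = quote(raw_name)

# Header built from the raw name: filename* is encoded once.
expected = sketch_content_disposition(raw_name)
# Header built from the pre-encoded name: every "%" becomes "%25".
observed = sketch_content_disposition(encoded_name)

print(expected)
print(observed)
```

If this model is right, the second header matches the one in the original report, which would point at the filename being encoded before content_disposition is called rather than by xsendfile afterwards.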

Another thing that could happen is: django_downloadview returns the content-disposition above, and xsendfile urlencodes it again before sending it to the client. You can check what django_downloadview returns by deactivating xsendfile in apache configuration, then perform the request again and watch the response. Since xsendfile does not handle the response, you should see django_downloadview's raw "internal redirect" response, using x-sendfile headers.

AdrianLC commented 10 years ago

Thank you for answering so soon.

If I understood correctly, in your client, you request a download of the [pbs+ssh@example.com:22]-[2067].zip file. What is the filename shown in your client? %5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip?

Yes.

Just to be sure, does your setup work fine with full-ascii filenames?

Yes (for the file name, please see below).

what does django_downloadview.response.content_disposition('[pbs+ssh@example.com:22]-[2067].zip') return?

>>> from django_downloadview import response
>>> response.content_disposition('[pbs+ssh@example.com:22]-[2067].zip')
"attachment; filename=[pbs+ssh@example.com:22]-[2067].zip; filename*=UTF-8''%5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip"

Another thing that could happen is: django_downloadview returns the content-disposition above, and xsendfile urlencodes it again before sending it to the client. You can check what django_downloadview returns by deactivating xsendfile in apache configuration, then perform the request again and watch the response. Since xsendfile does not handle the response, you should see django_downloadview's raw "internal redirect" response, using x-sendfile headers.

I was getting the same response even after restarting httpd with XSendfile Off and the LoadModule commented out. Turns out that my xsendfile was never enabled. I realized I was serving 0 byte files. Now, I moved the XSendFile On outside of the <Directory> and this is in my logs:

(2)No such file or directory: [client 127.0.0.1:53519] xsendfile: cannot open file: /var/www/wsgi/site/priv/jobfiles/results/%5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip, referer: http://localhost/execution/list/

so... perhaps my original assumption (that xsendfile pre-1.0 doesn't url-decode) was right after all ??? Because the file is there for sure.

EDIT: I tried again with xsendfile 1.0. It serves the file with the correct size but still the wrong name. Also, with XSendFileUnescape off (which supposedly restores <1.0 behaviour) it'll throw the same "No such file" error.
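
The two behaviours described above can be modelled with a small sketch. This is not mod_xsendfile's code; lookup and EXISTING_FILES are hypothetical names, and the set stands in for the real file system. The unescape flag only imitates what the changelog says XSendFileUnescape controls.

```python
from urllib.parse import unquote

# Hypothetical stand-in for the file system: the only file that really exists.
EXISTING_FILES = {
    "/var/www/wsgi/site/priv/jobfiles/results/[pbs+ssh@example.com:22]-[2067].zip",
}


def lookup(header_value, unescape):
    """Return the path that would be opened, or None if it does not exist."""
    # With unescape on, the header value is URL-decoded before the lookup.
    path = unquote(header_value) if unescape else header_value
    return path if path in EXISTING_FILES else None


header = (
    "/var/www/wsgi/site/priv/jobfiles/results/"
    "%5Bpbs%2Bssh%40example.com%3A22%5D-%5B2067%5D.zip"
)

print(lookup(header, unescape=False))  # literal "%5B..." path is looked up
print(lookup(header, unescape=True))   # decoded path is looked up
```

Under these assumptions, the literal lookup fails even though the file exists, which is consistent with the "No such file or directory" log line, while the decoded lookup finds it.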

AdrianLC commented 10 years ago

Sorry, I closed it by mistake.

AdrianLC commented 10 years ago

I think I've found where the double encoding is happening.

FileSystemStorage's url() method escapes the filename with django.utils.encoding.filepath_to_uri. The method is called in ProxiedDownloadMiddleware through the url property of FieldFile.
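
A rough sketch of that chain, using only the standard library. The helpers storage_url and rewrite_to_path are hypothetical names, the escaping is approximated with urllib.parse.quote rather than Django's own function, and the rule values are taken from the DOWNLOADVIEW_RULES in the first post. The decode option is an illustration of one possible workaround, not the project's actual fix.

```python
import posixpath
from urllib.parse import quote, unquote


def storage_url(name, base_url="/priv/"):
    """Approximates a storage url() that escapes the file name."""
    return base_url + quote(name, safe="/~!*()'")


def rewrite_to_path(url, source_url="/priv/",
                    destination_dir="/var/www/wsgi/site/priv", decode=False):
    """Map a URL onto a file system path, as a source_url/destination_dir rule would."""
    relative = url[len(source_url):]
    if decode:
        # Undo the escaping applied by storage_url before touching the disk path.
        relative = unquote(relative)
    return posixpath.join(destination_dir, relative)


url = storage_url("jobfiles/results/[pbs+ssh@example.com:22]-[2067].zip")

print(rewrite_to_path(url))               # still contains %5B, %2B, ...
print(rewrite_to_path(url, decode=True))  # the real on-disk name
```

Without decoding, the sketch reproduces the X-Sendfile value from the response headers in the first post; with decoding, it yields the path of the file that actually exists.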

pkaczynski commented 7 years ago

Will #135 fix this issue?