zopefoundation / Zope

Zope is an open-source web application server.
https://zope.readthedocs.io
Other
357 stars 101 forks source link

Export / download of files with non-latin-1 compatible names is broken #890

Closed jugmac00 closed 4 years ago

jugmac00 commented 4 years ago

Reproduction

2020-09-24 14:57:24,528 ERROR   [waitress:358][waitress-0] Exception while serving /bliss/db/manage_exportObject
Traceback (most recent call last):
  File "/home/jugmac00/Projects/bliss_deployment/work/_/home/jugmac00/.batou-shared-eggs/waitress-1.4.4-py3.7.egg/waitress/channel.py", line 350, in service
    task.service()
  File "/home/jugmac00/Projects/bliss_deployment/work/_/home/jugmac00/.batou-shared-eggs/waitress-1.4.4-py3.7.egg/waitress/task.py", line 171, in service
    self.execute()
  File "/home/jugmac00/Projects/bliss_deployment/work/_/home/jugmac00/.batou-shared-eggs/waitress-1.4.4-py3.7.egg/waitress/task.py", line 479, in execute
    self.write(chunk)
  File "/home/jugmac00/Projects/bliss_deployment/work/_/home/jugmac00/.batou-shared-eggs/waitress-1.4.4-py3.7.egg/waitress/task.py", line 311, in write
    rh = self.build_response_header()
  File "/home/jugmac00/Projects/bliss_deployment/work/_/home/jugmac00/.batou-shared-eggs/waitress-1.4.4-py3.7.egg/waitress/task.py", line 284, in build_response_header
    return tobytes(res)
  File "/home/jugmac00/Projects/bliss_deployment/work/_/home/jugmac00/.batou-shared-eggs/waitress-1.4.4-py3.7.egg/waitress/compat.py", line 69, in tobytes
    return bytes(s, "latin-1")
UnicodeEncodeError: 'latin-1' codec can't encode character '\u0131' in position 54: ordinal not in range(256)

Actually, this was not my use case. I have a custom download (and a file view) in my Zope app, and noticed that the following code is broken for non latin-1 names:

    def download(self, REQUEST):
        """
        download the attachment
         @REQUEST : request
        """
        with open(self.completeFileName(), "rb") as f:
            ret = f.read()
        REQUEST.RESPONSE.setHeader('content-type', self.getMimeType())
        REQUEST.RESPONSE.setHeader('content-length', str(len(ret)))
        REQUEST.RESPONSE.setHeader('content-disposition', 'attachment; filename="%s"' % self.printFileName())
        return ret

I then grepped the Zope source for a possible solution and found above bug.

possible solutions

zope.file https://github.com/zopefoundation/zope.file/blob/b952af17c23f7702b95a93b28b6458b38560874b/src/zope/file/download.py#L65-L70

cherrypy https://github.com/cherrypy/cherrypy/pull/1851/files#diff-1b04e50f6724343e4321295525e2bd58R32

django-sendfile https://github.com/johnsensible/django-sendfile/blob/7a000e0e7ae1732bcea222c7872bfda6a97c495c/sendfile/__init__.py#L69-L86

In my Zope app I implemented cherrypys solution and it works like a charm.

As there is also a bug in Zope's manage_exportObject I'd like to have a function in Zope itself, given a filename and the type (inline or attachment), builds a correct header.

I'd volunteer to implement a solution during next Monday's Zope sprint.

@d-maurer @icemac - Maybe there is something I overlook? I hardly can't imagine nobody ever wanted to offer a file download for non-latin-1 names before?

P.S.: Maybe also both the comment of the setHeader method and the method itself should be updated, ie either warn or prohibit to set non-latin-1 values.

d-maurer commented 4 years ago

Jürgen Gmach wrote at 2020-9-24 13:21 +0000:

Reproduction

  • upload a file via ZMI with the name "ıqrm.png" (please note, the first letter is a turkish i)

  • try to export it

  • waitress error

... @d-maurer @icemac - Maybe there is something I overlook? I hardly can't imagine nobody ever wanted to offer a file download for non-latin-1 names before?

I think exactly that is the case: only from Zope 4 onward could ids contain non ASCII characters -- and apparently, you have been the first to try to export an object with an id invalid in HTTP headers (i.e. non latin-1).

Obviously, manage_exportObject is not yet adapted to the removal of this restriction.

https://github.com/cherrypy/cherrypy/pull/1851/files#diff-1b04e50f6724343e4321295525e2bd58R32 This solution looks strange to me: it uses url encoding and an UTF-8 prefix -- both things unknown to HTTP (RFC 7230-7237). Instead, HTTP builds on top of MIME (RFC 5322, RFC 2045 and friends) which has a different quoting for header fields.

d-maurer commented 4 years ago

Dieter Maurer wrote at 2020-9-28 07:42 +0200:

...

https://github.com/cherrypy/cherrypy/pull/1851/files#diff-1b04e50f6724343e4321295525e2bd58R32 This solution looks strange to me: it uses url encoding and an UTF-8 prefix -- both things unknown to HTTP (RFC 7230-7237). Instead, HTTP builds on top of MIME (RFC 5322, RFC 2045 and friends)

I see that this is covered by RFC 5987 (and RFC 2231) - thus it is likely correct.

dataflake commented 4 years ago

I have now spent a while unsuccessfully trying to make this patch Zope 4 compatible, which means it has to work under both Python 2 and Python 3. I won't spend additional time, it is becoming a rat hole. So this fix is not in Zope 4 unless someone else spends the time making it work.

I would ask anyone who works in patches that are interesting for both Zope 4 and Zope 5 to PLEASE develop the patch on Zope 4. That makes transferring it a lot easier than trying to fix code that was written for Python 3 only.

jugmac00 commented 4 years ago

Sorry, Jens. I completely forgot about the Python 2 support already.

Personally, I am completely fine when the fix is Zope 5 only.

dataflake commented 4 years ago

I'm fine with leaving it on Zope 5 as well.

d-maurer commented 4 years ago

Jens Vagelpohl wrote at 2020-9-30 03:10 -0700:

I have now spent a while unsuccessfully trying to make this patch Zope 4 compatible, which means it has to work under both Python 2 and Python 3. I won't spend additional time, it is becoming a rat hole. So this fix is not in Zope 4 unless someone else spends the time making it work.

I will take over -- implementing my idea to support this transparently in HTTPResponse. Before I start, I will read the new HTTP 1.1 spec (i.e. RFC 7230 and friends) to learn about new tendancies regarding HTTP.

I will check upload, too.

-- Dieter