Conditional CherryPy latin-1 workaround (httphandler: don't assume filenames are latin-1 when transcoding)

evetsso commented 7 years ago

On Linux with python 3 (Arch Linux on 64-bit x86 specifically) I'd get an error like this when I have transcoding enabled and I try to play a file named "01 - 媛星.flac"

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/cherrypy/_cprequest.py", line 670, in respond
    response.body = self.handler()
  File "/usr/lib/python3.6/site-packages/cherrypy/lib/encoding.py", line 220, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/cherrypy/_cpdispatch.py", line 60, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/home/user/cherrymusic.git/cherrymusicserver/httphandler.py", line 257, in trans
    path = codecs.decode(codecs.encode(path, 'latin1'), 'utf-8')
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 23-24: ordinal not in range(256)

On Linux at least it's a pretty safe bet that filenames are utf-8. Even on other operating systems, latin-1 seems like a dangerous thing to assume. I'm not sure I understand what the intent was, for having latin-1 in here.

With this change, things at least work for me. It seems safest to make no assumptions at all about encodings in paths, and to treat them just as bytes.

coveralls commented 7 years ago

Coverage decreased (-0.004%) to 73.558% when pulling be72fc427b43fa824a8cb767d30318fe2b14dcdd on evetsso:devel into 33067200c2d58d3d85346feedddb3b4618347c91 on devsnd:devel.

tilboerner commented 7 years ago

The code you found is a workaround for a bug in CherryPy (the web framework we're using), that has apparently been fixed in newer versions. CherryPy didn't handle non-ASCII correctly, and we had to match its behavior. (See this other code example.)

Now we need to find all other instances of this workaround and check the CherryPy version to see if the workaround is needed.

Good thing we commented and documented all this properly. 🙄 😭 (We didn't.)

evetsso commented 7 years ago

To what extent do those old cherrypy versions need to be supported?

devsnd commented 7 years ago

Well, probably we can kick out all the support for those old cherrypys.

CherryMusic was always built on absolute backwards compatibility, supporting old python versions and old cherrypy versions, but I guess after a few years we can kick out all those compat-hacks.

@evetsso I'm not sure if I break anyone's cherrymusic setup by merging your PR... @tilboerner Do you have any idea what the impact might be?

tilboerner commented 7 years ago

@devsnd If we merge this, I think Cherrymusic will break for everybody who used the bootstrapping function to install CherryPy and who has non-ASCII pathnames. We would also need to change similar code in a few other places to make it not break for all others.

I think the alternatives are to make all instances of this encoding patch conditional on the CherryPy version, or we update the bootstrapping thing to handle a new minimum version of CherryPy.

tilboerner commented 6 years ago

@evetsso It took a while, but we finally merged this. Sorry for the delay!

I changed your change slightly, so instead of deleting the code, it only gets run if needed. I also messed up by creating my own commit for it (7aaa4fd6c5b88a4ba0fa3ad66775390af3736020); but I made a pro-forma merge of this PR, so you get credit. Thanks for the contribution! 🌷 🍷 🍰

devsnd / cherrymusic

Conditional CherryPy latin-1 workaround (httphandler: don't assume filenames are latin-1 when transcoding) #659