4udak / pyftpdlib

Automatically exported from code.google.com/p/pyftpdlib

Unable to complete directory listing with invalid UTF8 characters #280

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
We had a user manage to get an invalid UTF-8 byte sequence into one of their 
file names (I believe this was done through external software, not by uploading 
a file via FTP). After that, they could no longer retrieve a directory listing; 
the FTP server raised the following exception:

ERROR unhandled exception in instance <pyftpdlib.handlers.TLS_DTPHandler object at 0x8f3ca2c>
Traceback (most recent call last):
File "pyftpdlib.handlers", line 1658, in push_dtp_data
File "pyftpdlib.handlers", line 599, in push_with_producer
File "asynchat", line 190, in push_with_producer
File "pyftpdlib.handlers", line 605, in initiate_send
File "asynchat", line 226, in initiate_send
File "pyftpdlib.handlers", line 992, in more
File "pyftpdlib.filesystems", line 564, in format_mlsx
File "encodings.utf_8", line 16, in decode
UnicodeDecodeError: 'utf8' codec can't decode byte 0xe4 in position 43: invalid continuation byte
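
For reference, the same error can be reproduced outside pyftpdlib, and the offending names can be located by strictly decoding each raw directory entry. This is a minimal Python 3 sketch, not pyftpdlib code: `undecodable_names` and `scan_directory` are hypothetical helper names, and the sample name `caf\xe4.txt` is invented for illustration (0xe4 is "ä" in Latin-1, which is not valid UTF-8 when followed by an ASCII byte).

```python
import os


def undecodable_names(raw_names, encoding="utf-8"):
    """Return the byte strings in raw_names that are not valid in `encoding`."""
    bad = []
    for raw in raw_names:
        try:
            raw.decode(encoding)  # strict decoding, like unicode(name, 'utf8')
        except UnicodeDecodeError:
            bad.append(raw)
    return bad


def scan_directory(path, encoding="utf-8"):
    """List entries of `path` as raw bytes and report the undecodable ones."""
    # A bytes path makes os.listdir() return bytes names, untouched by any codec.
    return undecodable_names(os.listdir(os.fsencode(path)), encoding)


if __name__ == "__main__":
    # Hypothetical sample: one Latin-1 encoded name among valid UTF-8 names.
    sample = [b"report.txt", b"caf\xe4.txt", "café.txt".encode("utf-8")]
    print(undecodable_names(sample))
```

Running `scan_directory()` on the affected user's directory should, under these assumptions, list the raw names that trigger the exception.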

I'm afraid I have no way of determining which file name caused the problem, 
but our workaround was the following patch, which makes the decode skip invalid 
bytes instead of raising:

--- a/pyftpdlib-1.2.0/pyftpdlib/filesystems.py
+++ b/pyftpdlib-1.2.0/pyftpdlib/filesystems.py
@@ -448,7 +448,7 @@ class AbstractedFS(object):
                     # http://bugs.python.org/issue683592
                     file = os.path.join(bytes(basedir), bytes(basename))
                     if not isinstance(basename, unicode):
-                        basename = unicode(basename, 'utf8')
+                        basename = unicode(basename, 'utf8', 'ignore')
             else:
                 file = os.path.join(basedir, basename)
             try:
@@ -561,7 +561,7 @@ class AbstractedFS(object):
                     # http://bugs.python.org/issue683592
                     file = os.path.join(bytes(basedir), bytes(basename))
                     if not isinstance(basename, unicode):
-                        basename = unicode(basename, 'utf8')
+                        basename = unicode(basename, 'utf8', 'ignore')
             else:
                 file = os.path.join(basedir, basename)
             # in order to properly implement 'unique' fact (RFC-3659,
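
The effect of the `'ignore'` error handler used in the patch can be seen on a single name. This is a small sketch in Python 3 syntax (the patch itself targets Python 2's `unicode()`), using an invented sample name and a hypothetical `decode_name` helper. It also shows `'replace'` for comparison, which is not part of the patch.

```python
raw = b"caf\xe4.txt"  # hypothetical name: Latin-1 bytes, invalid as UTF-8


def decode_name(raw_name, errors="strict"):
    """Decode a raw file name as UTF-8 using the given error handler."""
    return raw_name.decode("utf-8", errors)


try:
    decode_name(raw)  # pre-patch behaviour: strict decoding raises
except UnicodeDecodeError as exc:
    print("strict:", exc.reason)

print("ignore:", decode_name(raw, "ignore"))    # bad byte silently dropped
print("replace:", decode_name(raw, "replace"))  # bad byte shown as U+FFFD
```

With `'ignore'` the listing completes, but the name shown to the client no longer matches the bytes on disk, so a client that requests the listed name may not find the file.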

Original issue reported on code.google.com by d...@devicenull.org on 4 Feb 2014 at 10:26

GoogleCodeExporter commented 9 years ago
Committed in r1245. Thanks.

Original comment by g.rodola on 4 Feb 2014 at 10:33

GoogleCodeExporter commented 9 years ago

Original comment by g.rodola on 3 Jun 2014 at 9:46