Open giampaolo opened 10 years ago
From huangkan...@gmail.com on April 30, 2013 23:08:23
Hi there.
I solved this problem by inspecting the code and subclassing FTPHandler myself.
I'll paste the code for anyone who needs it.
#code starts
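from asynchat import async_chat
from pyftpdlib.handlers import FTPHandler

encoding = "gbk"  # example value; set this to your target encoding

class EncodedProducer:
    # wraps a producer and re-encodes each chunk from UTF-8
    def __init__(self, producer):
        self.producer = producer
    def more(self):
        return self.producer.more().decode("utf8").encode(encoding)

class EncodedHandler(FTPHandler):
    def push(self, s):
        # control connection replies
        async_chat.push(self, s.encode(encoding))
    def push_dtp_data(self, data, isproducer=False, file=None, cmd=None):
        # no file object: assume this is not a file transfer
        if file is None:
            if isproducer:
                data = EncodedProducer(data)
            else:
                data = data.decode("utf8").encode(encoding)
        FTPHandler.push_dtp_data(self, data, isproducer, file, cmd)
    def decode(self, bytes):
        # incoming commands (and the file names they carry)
        return bytes.decode(encoding, self.unicode_errors)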
#code ends
Here "encoding" stands for the target encoding you wish to convert to.
Using EncodedHandler instead of FTPHandler should help solve this problem.
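One caveat with re-encoding producer output chunk by chunk: a chunk may end in the middle of a multi-byte UTF-8 character, in which case a plain decode("utf8") fails. Below is a minimal Python 3 sketch of a producer wrapper that uses an incremental decoder instead. TranscodingProducer and ListProducer are hypothetical names used for illustration and are not part of pyftpdlib; the only assumption about the wrapped producer is that more() returns bytes and an empty value once exhausted.

```python
import codecs


class TranscodingProducer:
    """Wrap a producer yielding UTF-8 bytes and re-encode its output."""

    def __init__(self, producer, encoding, errors="replace"):
        self.producer = producer
        self.encoding = encoding
        self.errors = errors
        # The incremental decoder buffers incomplete multi-byte sequences.
        self._decoder = codecs.getincrementaldecoder("utf8")(errors)

    def more(self):
        while True:
            chunk = self.producer.more()
            if not chunk:
                # Source exhausted: flush whatever the decoder still holds.
                text = self._decoder.decode(b"", final=True)
                return text.encode(self.encoding, self.errors)
            text = self._decoder.decode(chunk)
            if text:
                return text.encode(self.encoding, self.errors)
            # Only a partial character so far: fetch the next chunk.


class ListProducer:
    """Hypothetical stand-in producer that hands out prepared chunks."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def more(self):
        return self._chunks.pop(0) if self._chunks else b""


# Split a listing line in the middle of a multi-byte character.
line = "中文.txt\r\n".encode("utf8")
producer = TranscodingProducer(ListProducer([line[:2], line[2:]]), "gbk")

converted = b""
while True:
    piece = producer.more()
    if not piece:
        break
    converted += piece
```

The errors="replace" default is a design choice for this sketch: characters that cannot be represented in the target encoding become "?" rather than aborting the transfer.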
From gc...@loowis.durge.org on May 01, 2013 04:54:03
The problem here is that (AFAIK) the FTP protocol has absolutely *no* means of
specifying or querying which character-encoding is in use - you just have to
'hope' that the client and server are using the same encoding :-( RFC2640 (
https://tools.ietf.org/html/rfc2640 ) specifies that the character encoding
SHOULD be UTF-8, and pyftpdlib is now Unicode / RFC2640 compliant.
https://code.google.com/p/pyftpdlib/issues/list?can=1&q=unicode So I guess your
code serves as an example of how you could _force_ a different encoding if you
can't use UTF-8, but IMHO it shouldn't be built into the library... of course
Giampaolo may disagree ;-)
From g.rodola on May 01, 2013 16:31:41
Yes, Andrew is right. I didn't make the server encoding configurable exactly for
this reason: as per the RFC, client and server have no way to agree on a
specific encoding, therefore I thought it was better to just stick with UTF-8
as dictated by the RFC and be done with it.
If on one hand this is "the right thing to do", on the other hand perhaps there
are cases where changing the default server encoding in order to support
misbehaving clients might be desirable (note: at the cost of 'breaking'
compliant ones). If this is the case I'd like to hear more about the scenario
the OP is facing (in detail the FTP client used and what happens by using UTF-8).
That said, the code shown above changes the encoding of the control connection
(and that might be "right") but also applies an encoding for the data exchanged
through the data connection, and that is something which should be done only
for the listing commands (LIST, MLSD, etc), not when transmitting files.
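To illustrate why file payloads must be left alone, here is a small Python 3 sketch. The transcode() helper is a hypothetical name, and the payload bytes are an arbitrary made-up sample of binary data; the point is only that re-encoding is safe for listing text and destructive for anything else.

```python
def transcode(data, encoding, errors="strict"):
    """Re-encode UTF-8 bytes into another encoding."""
    return data.decode("utf8", errors).encode(encoding, errors)


# A directory listing line is text, so it converts cleanly.
listing_line = "-rw-r--r-- 1 user group 12 May 01 2013 中文.txt\r\n".encode("utf8")
listing_gbk = transcode(listing_line, "gbk")

# Arbitrary binary file content is not valid UTF-8.
payload = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF, 0x00])
try:
    transcode(payload, "gbk")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With a lenient error handler the transfer "succeeds" but the bytes change.
mangled = transcode(payload, "gbk", errors="replace")
```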
What you want to do instead is override AbstractedFS's format_list() and
format_mlsx() methods and leave FTPHandler.push_dtp_data alone:
class CustomFS(AbstractedFS):
    def format_list(self, *args, **kwargs):
        generator = AbstractedFS.format_list(self, *args, **kwargs)
        for item in generator:
            yield item.decode("utf8").encode(YOUR_ENCODING)
    # same for format_mlsx()
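A self-contained Python 3 sketch of the same override pattern, covering both methods through one shared helper. StandInFS is a hypothetical stand-in for AbstractedFS with simplified signatures and made-up listing lines, so the example runs without pyftpdlib installed; it assumes, as in the snippet above, that the base methods yield UTF-8 encoded lines.

```python
class StandInFS:
    """Hypothetical stand-in for AbstractedFS: yields UTF-8 encoded lines."""

    def format_list(self, basedir, listing):
        for name in listing:
            line = "-rw-r--r-- 1 owner group 0 Jan 01 00:00 %s\r\n" % name
            yield line.encode("utf8")

    def format_mlsx(self, basedir, listing):
        for name in listing:
            yield ("type=file; %s\r\n" % name).encode("utf8")


def reencode(lines, encoding, errors="replace"):
    """Re-encode each complete UTF-8 line into the target encoding."""
    for line in lines:
        yield line.decode("utf8", errors).encode(encoding, errors)


class CustomFS(StandInFS):
    encoding = "gbk"  # assumed target encoding for this example

    def format_list(self, *args, **kwargs):
        lines = StandInFS.format_list(self, *args, **kwargs)
        return reencode(lines, self.encoding)

    def format_mlsx(self, *args, **kwargs):
        lines = StandInFS.format_mlsx(self, *args, **kwargs)
        return reencode(lines, self.encoding)


fs = CustomFS()
list_output = list(fs.format_list("/", ["中文.txt"]))
mlsx_output = list(fs.format_mlsx("/", ["中文.txt"]))
```

Because each yielded item is a complete line, a plain decode/encode per item is enough here; no incremental decoding is needed.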
If we decide to make the server encoding configurable we can avoid going through
all this trouble, but I'd like to hear the OP's scenario first in order to figure
out if it's actually worth the effort.
From gc...@loowis.durge.org on May 01, 2013 17:29:16
Just a quick note - Giampaolo's definitely right that you don't want to mess
about with the 'encoding' for the actual file data (would give corrupted
files), but presumably _if_ the encoding of filenames for the LIST and MLSD
commands is being altered, then the encoding of the filenames for the
STOR/RETR/DELE/etc. commands would need to be altered too?
From g.rodola on May 01, 2013 17:38:30
Yes, but apparently he did that already by overriding FTPHandler.decode().
From huangkan...@gmail.com on May 01, 2013 18:05:30
Hi there.
Thanks for all your responses.
Well, I understand that this "function" will not be included in the library
source because it is not really a feature that should be considered...
But let's talk about the code I pasted, hmmmmm.... I know it may sound like a
dirty hack, but shouldn't push_dtp_data always be called with a non-None file
argument if it is transmitting files? So would it be reasonable to distinguish list
commands from file data by checking the file argument in push_dtp_data? Does
this method have any kind of limitations?
Cheers.
From g.rodola on May 01, 2013 18:09:43
push_dtp_data() is called with a "producer" argument also for listing commands,
not only for files-related ones. That aside, a 'cmd' argument is also passed,
so you might want to inspect that.
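A minimal Python 3 sketch of what inspecting 'cmd' could look like. The names LISTING_COMMANDS, is_listing_command and maybe_wrap are hypothetical helpers, not pyftpdlib API, and the exact set of command names is an assumption (the thread only mentions "LIST, MLSD, etc"; NLST is added here as the other standard listing command).

```python
# Commands whose data-connection payload is a textual listing (assumed set).
LISTING_COMMANDS = frozenset(["LIST", "NLST", "MLSD"])


def is_listing_command(cmd):
    """Return True if cmd names a listing command, False otherwise."""
    return cmd is not None and cmd.upper() in LISTING_COMMANDS


def maybe_wrap(data, cmd, wrap):
    """Apply wrap() to data only when cmd is a listing command."""
    return wrap(data) if is_listing_command(cmd) else data


# Inside an overridden push_dtp_data() this would be used roughly as:
#     data = maybe_wrap(data, cmd, EncodedProducer)
# so that RETR producers pass through untouched.
checks = {cmd: is_listing_command(cmd) for cmd in ("LIST", "MLSD", "RETR", None)}
```

Keying on 'cmd' rather than on whether 'file' is None makes the intent explicit and does not depend on how a given command happens to build its producer.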
giampaolo commented on 29 May 2014 From huangkan...@gmail.com on April 30, 2013 23:08:23
from asynchat import async_chat
class EncodedProducer:
    def __init__(self, producer):
        self.producer = producer
    def more(self):
        return self.producer.more().decode("utf8").encode(encoding)
class EncodedHandler(FTPHandler):
    def push(self, s):
        async_chat.push(self, s.encode(encoding))
    def push_dtp_data(self, data, isproducer=False, file=None, cmd=None):
        if file==None:
            if isproducer:
                data=EncodedProducer(data)
            else:
                data=data.decode("utf8").encode(encoding)
        FTPHandler.push_dtp_data(self, data, isproducer, file, cmd)
    def decode(self, bytes):
        return bytes.decode(encoding, self.unicode_errors)
I got this error:
handler = EncodedHandler()
TypeError: __init__() missing 2 required positional arguments: 'conn' and 'server'
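This TypeError is what Python raises when a class whose __init__ requires 'conn' and 'server' is instantiated with no arguments. As far as I know, pyftpdlib expects the handler class itself to be handed to the server, which then creates one handler instance per connection. The Python 3 sketch below reproduces the pattern with hypothetical stand-in classes (StandInHandler, StandInServer); it is not pyftpdlib code and the address is a made-up example.

```python
class StandInHandler:
    """Hypothetical handler: needs a connection and its server, like the error says."""

    def __init__(self, conn, server):
        self.conn = conn
        self.server = server


class StandInServer:
    """Hypothetical server that instantiates the handler class per connection."""

    def __init__(self, address, handler):
        self.address = address
        self.handler = handler  # a class, not an instance

    def accept(self, conn):
        return self.handler(conn, self)


# Instantiating the handler directly reproduces the error from the report.
try:
    StandInHandler()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Passing the class and letting the server build instances works.
server = StandInServer(("127.0.0.1", 2121), StandInHandler)
handler_instance = server.accept(conn=object())
```

In the reported snippet the equivalent change would be to write handler = EncodedHandler, without parentheses, and pass that to the server.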
What does this have to do with the original issue?
The top
From huangkan...@gmail.com on May 01, 2013 05:05:59
Hi there. I've been working with your library for quite a while, and it is simple yet works like a charm. I'm here to suggest an enhancement. A common problem with FTP servers is encoding. When the client's file name encoding differs from the server's file name encoding, we run into a lot of problems. This is quite common here, because the default Chinese file name encoding on Windows is GBK while on Linux it is UTF-8. So I would like to add a transparent encoding conversion to my FTP server. I think this would need some implementation in the library source... Also, if this can simply be implemented by subclassing FTPHandler or something, please let me know. Cheers.
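A short Python 3 illustration of the mismatch described above, using a made-up file name: the server sends UTF-8 bytes, a client that assumes GBK decodes them into different characters, and converting on the server side before sending avoids that.

```python
name = "中文.txt"  # example file name, chosen for illustration

# What a UTF-8 server puts on the wire.
wire_bytes = name.encode("utf8")

# What a client assuming GBK makes of those bytes.
seen_by_gbk_client = wire_bytes.decode("gbk", errors="replace")

# Converting to the client's encoding first gives the client the right name.
converted = wire_bytes.decode("utf8").encode("gbk")
seen_after_conversion = converted.decode("gbk")
```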
Original issue: http://code.google.com/p/pyftpdlib/issues/detail?id=257
This code works for me. When using this library to write a simple FTP server on macOS with Chinese file names, use EncodedHandler in your code and set encoding="GB18030".