Open kazurayam opened 1 week ago
In v3.2.0, org.mockftpserver.fake.command.ListCommandHandler contains the following code fragment:
String result = StringUtil.join(lines, endOfLine());
result += result.length() > 0 ? endOfLine() : "";
sendReply(session, ReplyCodes.TRANSFER_DATA_INITIAL_OK);
session.openDataConnection();
LOG.info("Sending [" + result + "]");
session.sendData(result.getBytes(), result.length());
The result variable, a java.lang.String instance, holds the reply to the LIST command. It looks something like this:
-rwxrwxrwx 1 none none 51 Nov 19 21:59 a日本語で遊ぼう.txt
-rwxrwxrwx 1 none none 16 Nov 19 21:59 foobar.txt
When the result variable contains non-Latin-1 characters like "日本語で遊ぼう", the call result.getBytes() converts the Unicode String into a byte array without naming a charset, so the JVM's default charset is used, and that is not necessarily UTF-8. This byte array is what the FTP server sends to the FTP client.
The FTP client receives the byte array and has to convert it back into a Java String. To do that correctly it would need to know which charset the server used, but this is not what we usually do.
We usually, and naively, assume that the byte array represents a String encoded in UTF-8.
So the FTP client decodes as UTF-8 a byte array that was not encoded as UTF-8, and we see garbled characters.
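To illustrate the mismatch, here is a minimal sketch that encodes a file name with one charset and decodes it as UTF-8, the way a naive client would. The class and method names are made up for this example, and UTF-16BE is only a stand-in for "some encoding other than UTF-8" because it is guaranteed to exist on every JVM. It is not necessarily what FakeFtpServer actually sends.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ListingMojibakeDemo {

    /** Encodes text with serverCharset, then decodes the bytes as UTF-8 like a naive client. */
    static String roundTrip(String text, Charset serverCharset) {
        byte[] wire = text.getBytes(serverCharset);          // what the server would send
        return new String(wire, StandardCharsets.UTF_8);     // what the client would assume
    }

    public static void main(String[] args) {
        // The file name from the listing above, written with Unicode escapes
        // so that the source file encoding does not matter.
        String name = "a\u65E5\u672C\u8A9E\u3067\u904A\u307C\u3046.txt";

        // Same charset on both sides: the name survives.
        boolean utf8Ok = roundTrip(name, StandardCharsets.UTF_8).equals(name);

        // Stand-in for a non-UTF-8 server encoding: the name does not survive.
        boolean otherOk = roundTrip(name, StandardCharsets.UTF_16BE).equals(name);

        System.out.println("UTF-8 on both sides preserved the name: " + utf8Ok);
        System.out.println("Mismatched charsets preserved the name: " + otherOk);
    }
}
```

The name only survives when the server encodes with the same charset the client decodes with, which is why I would like to be able to force UTF-8 on the server side.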
I want FakeFtpServer to optionally encode the listing, including Unicode file names, into a byte array using UTF-8. That is, I want to change
session.sendData(result.getBytes(), result.length());
to
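// requires: import java.nio.charset.StandardCharsets;
if (allowUTF8) {
    // Encode explicitly as UTF-8 and pass the byte count, not the char count
    byte[] ba = result.getBytes(StandardCharsets.UTF_8);
    session.sendData(ba, ba.length);
} else {
    session.sendData(result.getBytes(), result.length());
}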
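A small standalone sketch of why the proposal passes ba.length instead of result.length(): String.length() counts UTF-16 chars, while the array length counts encoded bytes, and the two differ as soon as the listing contains non-ASCII characters. The ListingEncoder class and encodeListing method are hypothetical names for this illustration and are not part of MockFtpServer. allowUTF8 is the flag proposed above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ListingEncoder {

    /** Hypothetical helper: UTF-8 when allowUTF8 is set, otherwise the JVM default charset. */
    static byte[] encodeListing(String result, boolean allowUTF8) {
        Charset charset = allowUTF8 ? StandardCharsets.UTF_8 : Charset.defaultCharset();
        return result.getBytes(charset);
    }

    public static void main(String[] args) {
        // The file name from the listing above, written with Unicode escapes.
        String name = "a\u65E5\u672C\u8A9E\u3067\u904A\u307C\u3046.txt";

        byte[] ba = encodeListing(name, true);

        // 12 chars, but 26 bytes: each of the 7 Japanese characters takes 3 bytes in UTF-8.
        System.out.println("chars=" + name.length() + " bytes=" + ba.length);
    }
}
```

If the char count were used as the number of bytes to send, the byte array would presumably be cut short, so the byte count should always be taken from the encoded array.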
Using MockFtpServer, I want to simulate an FTP server that encodes file names in UTF-8.
MockFtpServer v3.2.0 does not support encoding file names into UTF-8.
So I would like MockFtpServer to be changed so that it can encode file names, held as Java Strings, into UTF-8.
For the relevant RFC, see the following sources:
https://wiki.filezilla-project.org/Character_Encoding
https://filezilla-project.org/specs/rfc2640.txt