bentonstark / starksoft-aspen

.net / mono security and cryptography library that provides client support for ftps, gnupg, smartcard, and socks / http proxies

Force FTP encoding to UTF8 #22

Closed: cassiobock closed this issue 8 years ago

cassiobock commented 8 years ago

Hi, is there any way to force the encoding to UTF8 for an FTP connection?

My current server features are the following:

211-Extensions supported:
 EPRT
 IDLE
 MDTM
 SIZE
 MFMT
 REST STREAM
 MLST type*;size*;sizd*;modify*;UNIX.mode*;UNIX.uid*;UNIX.gid*;unique*;
 MLSD
 AUTH TLS
 PBSZ
 PROT
 TVFS
 ESTA
 PASV
 EPSV
 SPSV
 ESTP
211 End.

Thanks!

bentonstark commented 8 years ago

The FTPS client defaults to Encoding.UTF8 internally, with Encoding.ASCII as the fallback. Once a connection is open, you can read the encoding the client is using from the FtpsClient.CharacterEncoding property, but the value depends on which features the FTP server supports. The client always attempts to enable UTF8 via the method FtpsClient.TrySetUtf8On() if the feature is supported. If the feature is not supported, the client falls back to ASCII encoding.

Based on the features you show, UTF8 is not a feature you can control via FTP server control commands, e.g. "OPTS UTF8 ON". You can also send the command directly to the server with FtpsClient.Quote("OPTS UTF8 ON") to see what response you get, but based on the FEAT list you will get something like "Feature Not Supported."
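To illustrate the decision described above, here is a minimal sketch of choosing an encoding from a FEAT reply. It is not the library's actual code: the class FeatEncodingSketch and its methods are hypothetical names, and the sample reply is an abbreviated version of the feature list posted in this issue.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration only; not part of starksoft-aspen.
public static class FeatEncodingSketch
{
    // Returns true when one line of the FEAT reply is exactly "UTF8".
    public static bool AdvertisesUtf8(string featReply)
    {
        return featReply
            .Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(line => line.Trim())
            .Any(line => line.Equals("UTF8", StringComparison.OrdinalIgnoreCase));
    }

    // Prefer UTF-8 when the server advertises it, otherwise fall back to ASCII.
    public static Encoding ChooseEncoding(string featReply)
    {
        return AdvertisesUtf8(featReply) ? Encoding.UTF8 : Encoding.ASCII;
    }

    public static void Main()
    {
        // Abbreviated copy of the FEAT reply from this issue; note there is no UTF8 line.
        string reply = "211-Extensions supported:\n EPRT\n MDTM\n SIZE\n MLSD\n AUTH TLS\n211 End.";
        Console.WriteLine(ChooseEncoding(reply).WebName);
    }
}
```

With a reply like the one in this issue, the sketch takes the ASCII branch, which matches the fallback behaviour described above.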

What type of FTP server are you connecting to?

cassiobock commented 8 years ago

I'm having issues when I try to create folders or files with special characters, such as á and é. It turns out that I didn't need UTF8 specifically, but the default encoding of my system.

After some time I found out that I could use this reflection code to force the encoding:

typeof(FtpsBase).GetProperty("Encoding", BindingFlags.Instance | BindingFlags.NonPublic).SetValue(ftpClient, Encoding.UTF8, null);

I don't know if it is the best solution, but it worked for me. I previously used Starksoft.Net.Ftp, where the encoding could be changed like this: FtpClient.CharacterEncoding = Encoding.Default

trlthiago commented 8 years ago

Just to add information on @cassiobsilva's issue: he is using PureFTP, and the Linux locale settings are:

[root@SERVER01 /home/site]# locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

The problem occurs with files created by any application other than StarkSoft (bash, putty, filezilla, a website running on apache, etc.).

At a glance, it seems to me that as soon as StarkSoft opens the connection it tries to set UTF-8, but since PureFTP does not support it natively, StarkSoft falls back to ASCII and the TransferText method instantiates its StreamReader with Encoding.ASCII.

The proposed workaround of forcing the encoding to Encoding.UTF8 via reflection works fine, since the StreamReader is then instantiated with Encoding.UTF8, which was the original encoding of the file and folder names.
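To make the effect concrete, here is a small standalone sketch that reads the same UTF-8 bytes through a StreamReader with each encoding. It uses only System.Text and System.IO, not StarkSoft; the class ListingDecodeSketch and the file name "página.txt" are hypothetical, chosen just for illustration.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical demo for illustration only; not part of starksoft-aspen.
public static class ListingDecodeSketch
{
    // Decodes raw bytes through a StreamReader using the given encoding,
    // with byte-order-mark detection turned off so the encoding is always honored.
    public static string Decode(byte[] raw, Encoding encoding)
    {
        using (var reader = new StreamReader(new MemoryStream(raw), encoding, false))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Bytes as a UTF-8 server would send them for a name containing "á".
        byte[] raw = Encoding.UTF8.GetBytes("página.txt");

        Console.WriteLine(Decode(raw, Encoding.ASCII)); // non-ASCII bytes are not preserved
        Console.WriteLine(Decode(raw, Encoding.UTF8));  // round-trips the original name
    }
}
```

Only the UTF-8 reader returns the original name, which is why switching the encoding fixes the names of files created by other applications.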

bentonstark commented 8 years ago

I need to make the FtpsClient.Encoding property public rather than internal. This should give you the control needed to change the character encoding to a specific regional encoding. Thanks for pointing this issue out.

bentonstark commented 8 years ago

Git master has the latest changes, which make the FtpsClient.Encoding property public, so you no longer need reflection to change the value.

trlthiago commented 8 years ago

That's great, thank you @bentonstark :)