cifsd-team / ksmbd

ksmbd kernel server(SMB/CIFS server)

UTF-8 characters encoded in 4 bytes not supported in filenames #598

Open nvxos opened 10 months ago

nvxos commented 10 months ago

Filenames containing characters that UTF-8 encodes in 4 bytes don't seem to be supported. Using one triggers a "file or folder does not exist" error on Windows, for example, and the same type of error on Linux. The affected characters include some emojis such as "🔥" (https://apps.timwhitlock.info/unicode/inspect/hex/1F525), while characters encoded in 3 bytes or fewer, such as the emoji "❤️" (https://apps.timwhitlock.info/unicode/inspect?s=%E2%9D%A4%EF%B8%8F), don't seem to pose a problem.
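To make the distinction concrete, here is a small Python sketch that lists the UTF-8 byte length of each code point in a name. The helper `utf8_lengths` is hypothetical and only for illustration; it is not part of ksmbd or any client.

```python
# Hypothetical helper (not part of ksmbd): report how many bytes UTF-8
# needs for each code point in a filename.
def utf8_lengths(name):
    return [(f"U+{ord(ch):04X}", len(ch.encode("utf-8"))) for ch in name]

fire = "\U0001F525"      # the fire emoji, one code point outside the BMP
heart = "\u2764\uFE0F"   # heavy black heart followed by variation selector-16

print(utf8_lengths(fire))   # [('U+1F525', 4)]
print(utf8_lengths(heart))  # [('U+2764', 3), ('U+FE0F', 3)]
```

The heart emoji is two code points of 3 bytes each, so it never needs a 4-byte sequence, which matches the behaviour described above.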

Some references I found on the subject, and other helpful links I used to pinpoint the issue: https://en.wikipedia.org/wiki/Unicode#Code_planes_and_blocks https://apps.timwhitlock.info/emoji/tables/unicode https://apps.timwhitlock.info/unicode/inspect

For some context, this issue happened on a deployment of KSMBD on FreeboxOS, the OS of the Freebox router from the French ISP Free. After I filed a ticket on their bug tracker (https://dev.freebox.fr/bugs/task/38504), they asked me to open a ticket here.

@mmakassikis

namjaejeon commented 10 months ago

@mmakassikis Do you have time to fix it? I guess we need to compare unicode.c in ksmbd with cifs_unicode.c. cifs.ko seems to use utf8_to_utf32 or utf8s_to_utf16s instead of ->char2uni. char2uni doesn't fully support UTF-8.
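For readers following along, the likely reason 4-byte sequences are special: they encode code points above U+FFFF, and UTF-16 (the string encoding SMB2 uses on the wire) needs two 16-bit units, a surrogate pair, for those. A conversion step that yields only one 16-bit unit per character cannot represent them; that char2uni behaves this way is an assumption here, not something confirmed in this thread. A minimal Python sketch of the surrogate-pair arithmetic, with the hypothetical helper `to_utf16_units`:

```python
# Hypothetical helper (illustration only, not ksmbd code): convert one
# Unicode code point into its UTF-16 code units.
def to_utf16_units(cp):
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a valid Unicode scalar value")
    if cp <= 0xFFFF:
        # Fits in a single 16-bit unit (1 to 3 bytes in UTF-8).
        return [cp]
    # Above U+FFFF (4 bytes in UTF-8): split into a surrogate pair.
    v = cp - 0x10000
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]

def utf16le_bytes(units):
    # Serialize 16-bit units as little-endian bytes.
    return b"".join(u.to_bytes(2, "little") for u in units)

units = to_utf16_units(0x1F525)
print([hex(u) for u in units])  # ['0xd83d', '0xdd25']

# Cross-check against Python's own UTF-16LE codec.
assert utf16le_bytes(units) == "\U0001F525".encode("utf-16-le")
```

A code point such as U+2764 comes back as a single unit, which is consistent with 3-byte characters working while 4-byte ones fail.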

namjaejeon commented 8 months ago

@nvxos Can you check whether the problem is improved with the following patch? (https://github.com/namjaejeon/ksmbd/commit/f389804c2a547cfcbeb7daaef3aa0b78b16c8d71)

mmakassikis commented 8 months ago

@namjaejeon I have tested the patch, and it fixes renaming a file named "🔥", which didn't work previously. I left a couple of comments on the patch.

namjaejeon commented 8 months ago

@mmakassikis Thanks for your review. I updated the patch (https://github.com/namjaejeon/ksmbd/commit/8dffdce119fbd2f5e41cb453c951af1e9d50944e). Let me know if you find any issues.