google / brotli

Brotli compression format

Output file with path containing non-Latin Unicode characters on Windows #1133

Open davschne-unity opened 4 months ago

davschne-unity commented 4 months ago

I've noticed some unusual behavior on Windows with output file paths containing non-Latin Unicode characters:

Non-Latin characters in the output filename

$ brotli.exe -o "Lietuvių Kalbos Žodynu.br" "Hello.txt"

The command succeeds, but the output filename is Lietuviu Kalbos Žodynu.br instead of the expected Lietuvių Kalbos Žodynu.br. (Notice "u" versus "ų".)

Non-Latin characters in the name of the directory containing the output file

$ brotli.exe -o "Lietuvių Kalbos Žodynu\Hello.br" "Hello.txt"

The command fails with this output:

failed to open output file [Lietuviu Kalbos Äodynu\Hello.br]: No such file or directory

By contrast, using these same non-Latin characters in the input path seems to work fine, and the problem doesn't seem to occur on macOS.

eustas commented 4 months ago

I believe the problem is that CLI code works with 8-bit filenames. Going to fix that as soon as I have time. Feel free to ping me ~once a week. Thank you.
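
For illustration only, here is a minimal sketch of one common way to avoid 8-bit filenames on Windows. It is not the brotli CLI's actual code or the planned fix. `open_utf8` is a hypothetical helper, and the sketch assumes the path is already available as UTF-8. On Windows it converts the path to UTF-16 with `MultiByteToWideChar` and opens it with `_wfopen`; elsewhere it falls back to plain `fopen`.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper (not part of brotli): open a file whose path is
   given as a UTF-8 string. Returns NULL on failure, like fopen. */
FILE* open_utf8(const char* utf8_path, const char* mode) {
#ifdef _WIN32
  wchar_t wmode[16];
  wchar_t* wpath;
  FILE* f = NULL;
  /* First call: ask how many UTF-16 code units are needed (including NUL). */
  int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                              utf8_path, -1, NULL, 0);
  if (n <= 0) return NULL;
  wpath = (wchar_t*)malloc((size_t)n * sizeof(wchar_t));
  if (!wpath) return NULL;
  /* Second call: convert the path, then convert the mode string too. */
  if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                          utf8_path, -1, wpath, n) > 0 &&
      MultiByteToWideChar(CP_UTF8, 0, mode, -1, wmode, 16) > 0) {
    f = _wfopen(wpath, wmode); /* wide-character variant of fopen */
  }
  free(wpath);
  return f;
#else
  /* Assumption: non-Windows systems pass the path bytes through as-is. */
  return fopen(utf8_path, mode);
#endif
}
```

A call such as `open_utf8("Lietuvi\xC5\xB3 Kalbos.br", "wb")` would then create a file whose name keeps the "ų". On Windows the command-line arguments would also have to be read as wide characters (for example through `wmain` or `CommandLineToArgvW`) and converted to UTF-8 before reaching such a helper; that part is left out of the sketch.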