sheredom / subprocess.h

🐜 single header process launching solution for C and C++
The Unlicense

Unicode support on Windows? #68

Open windowsair opened 2 years ago

windowsair commented 2 years ago

Working with character encoding on Windows is really annoying.

Passing UTF-8 characters on the command line using CreateProcessA seems to be impossible. While local code pages seem to be able to handle special characters such as Chinese and Japanese, they don't seem to be able to do that for emoji.

In my case, the way I use it is to convert your commandLineCombined to UTF16LE characters and then call CreateProcessW. My original input was UTF8 characters, and this modification seems to handle UTF8 characters properly.
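For readers who want to see the shape of that change, here is a minimal sketch, not the actual patch. The helper names utf8_to_wide and launch_utf8 are hypothetical and are not part of subprocess.h; only MultiByteToWideChar, CreateProcessW and the other Win32 calls are real APIs. It assumes the command line is already a single NUL-terminated UTF-8 string, like commandLineCombined.

```c
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string into a newly
   allocated UTF-16 string. Returns NULL on invalid UTF-8 or allocation
   failure. The caller releases the result with free(). */
static wchar_t *utf8_to_wide(const char *utf8) {
  wchar_t *wide;
  /* First call: query the required length in wchar_t units. Passing -1 as
     the input length makes the count include the terminating NUL. */
  int len = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, utf8, -1,
                                NULL, 0);
  if (len <= 0) {
    return NULL;
  }
  wide = (wchar_t *)malloc((size_t)len * sizeof(wchar_t));
  if (wide == NULL) {
    return NULL;
  }
  /* Second call: perform the actual conversion into the buffer. */
  if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, utf8, -1, wide,
                          len) != len) {
    free(wide);
    return NULL;
  }
  return wide;
}

/* Hypothetical helper: launch a UTF-8 command line through CreateProcessW
   and wait for the child to exit. Returns 0 on success, -1 on failure. */
static int launch_utf8(const char *utf8_command_line) {
  STARTUPINFOW startup_info;
  PROCESS_INFORMATION process_info;
  BOOL ok;
  wchar_t *wide = utf8_to_wide(utf8_command_line);
  if (wide == NULL) {
    return -1;
  }
  ZeroMemory(&startup_info, sizeof(startup_info));
  startup_info.cb = sizeof(startup_info);
  ZeroMemory(&process_info, sizeof(process_info));
  /* CreateProcessW may modify the command-line buffer, so it has to be
     writable. The heap copy returned by utf8_to_wide is. */
  ok = CreateProcessW(NULL, wide, NULL, NULL, FALSE, 0, NULL, NULL,
                      &startup_info, &process_info);
  free(wide);
  if (!ok) {
    return -1;
  }
  WaitForSingleObject(process_info.hProcess, INFINITE);
  CloseHandle(process_info.hThread);
  CloseHandle(process_info.hProcess);
  return 0;
}
#endif
```

With this approach the result does not depend on the active code page of the parent process: a character outside the Basic Multilingual Plane, such as an emoji, simply becomes a UTF-16 surrogate pair in the wide command line.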

What do you think of this? Thanks.

windowsair commented 2 years ago

CreateProcessA seems to internally convert to the OEM code page, which is kind of broken for characters like emoji.

sheredom commented 1 year ago

What version of windows were you running on? The reason I ask is that I thought that CreateProcessA supported UTF-8 on later Windows versions!

windowsair commented 1 year ago

Oh, and I don't seem to be receiving messages from github.

Windows 10 19044, I think that's new enough.

jlaumon commented 9 months ago

There's a manifest thing to do to enable that apparently: https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

Did you happen to try it? I'd be curious to know if it works

jlaumon commented 9 months ago

Ok, reading that page again I'm not really sure what that manifest is for. It seems to say the default code page is UTF8 without manifest for recent enough Windows (and that's the case for me, but of course I first tested with OutputDebugStringA which doesn't support UTF8).

The manifest is indeed needed, not sure how I managed to fumble my previous test. GetACP() returns CP_UTF8 when it works.
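The check described above can be sketched as follows. The function name ansi_code_page_is_utf8 is hypothetical; GetACP and CP_UTF8 are the real Win32 names. Treat it as a quick diagnostic and not as something subprocess.h does itself.

```c
#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: returns 1 when the active ANSI code page of the
   current process is UTF-8 (CP_UTF8, i.e. 65001), and 0 otherwise.
   According to the Microsoft page linked above, this is the case when the
   application manifest sets the activeCodePage element to UTF-8 on a
   recent enough Windows version. */
static int ansi_code_page_is_utf8(void) {
  return GetACP() == CP_UTF8;
}
#endif
```

If this returns 0, the "A" variants of the Win32 APIs, CreateProcessA included, interpret char strings in the legacy ANSI code page and not as UTF-8.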

I'll give subprocess with UTF8 command lines a try in the following days and let you know how it went.

windowsair commented 9 months ago

Hi, @jlaumon

Thank you for following this. This issue has been open for some time now. I have since replaced CreateProcessA with CreateProcessW and it works fine. I always use CP_UTF8 for the parent process, but I'm not sure if the child process inherits this.

sheredom commented 9 months ago

If someone wants to put together a PR that passes CI I'd happily accept it.

jlaumon commented 9 months ago

I just tried subprocess on xcopy.exe to copy file/directory names with non-ascii characters in UTF-8 and, as long as that magic manifest is there, it just works!

I am now the proud owner of 🍌.txt and 🍌_copy.txt.

matyalatte commented 3 months ago

UTF-8 support on Windows is still in beta and does not work by default (at least in localized environments for some Asian languages). As jlaumon said, you need a manifest file to use CreateProcessA with UTF-8.

But for a cross-platform single-header library, I don't think it's desirable for behavior to change depending on external factors such as manifest files.

As far as I know, common cross-platform libraries use the wchar version of the APIs with MultiByteToWideChar, like windowsair did, or they accept UTF-16 strings directly.