microsoft / vscode-cpptools

Official repository for the Microsoft C/C++ extension for VS Code.

Printing unicode to debug console via stdout is only possible in current console input code page #10591

Open Longhanks opened 1 year ago

Longhanks commented 1 year ago

Environment

Bug Summary and Steps to Reproduce

When using "Start Debugging" or "Run Without Debugging" with the launch.json setting "console": "internalConsole", the Debug Console displays stdout. With "Start Debugging", it also displays strings passed to OutputDebugString. OutputDebugStringW is unaffected by this issue: if a debugger is present (which can be checked via IsDebuggerPresent()), Unicode passed as UTF-16 is printed correctly.

For the Debug Console, however, stdout is always expected to carry bytes encoded in the active console input code page, as returned by GetConsoleCP.

When running a program with "Run Without Debugging" and "console": "internalConsole", this is especially cumbersome: it is essentially impossible to print Unicode characters to stdout that are not part of the code page returned by GetConsoleCP() without changing the console input code page. The same holds with the debugger attached ("Start Debugging"): strings printed to stdout (as opposed to WriteConsole or OutputDebugString) are expected to be encoded in the active console input code page, which most likely cannot represent all Unicode characters.

This snippet demonstrates the steps necessary to get the unicode character ° via stdout printed properly to the Debug Console:

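#include <Windows.h>

int main() {
  // Convert UTF-16 to the active console input code page, which is the
  // encoding the Debug Console expects on stdout.
  const UINT console_cp = GetConsoleCP();
  char str[3];
  WideCharToMultiByte(console_cp, 0, L"°\n", 2, str, 3, NULL, NULL);
  str[2] = 0;

  // lpNumberOfBytesWritten may only be NULL when lpOverlapped is non-NULL,
  // so pass a real DWORD here.
  DWORD written = 0;
  const HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
  WriteFile(out, str, 2, &written, NULL);

  // If GetConsoleCP() == 850, identical to:
  WriteFile(out, "\xF8\x0A", 2, &written, NULL);
}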

... which proves my assumption about the expected encoding of stdout.
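
To check up front whether a character is covered by the current console input code page at all, something like the following could be used. This is only a sketch: fits_console_cp is a made-up helper name, it handles a single UTF-16 code unit only, and the #ifdef exists solely so the file also compiles off Windows, where the answer is always "unknown".

```c
#ifdef _WIN32
#include <Windows.h>
#endif

/* Hypothetical helper (not part of any API).
   Returns 1 if the UTF-16 code unit `ch` can be encoded in the active
   console input code page, 0 if it cannot, and -1 if that is unknown
   (no console, conversion failed, or not built for Windows). */
int fits_console_cp(unsigned short ch) {
#ifdef _WIN32
  const UINT cp = GetConsoleCP();
  const WCHAR wide = (WCHAR)ch;
  char buf[8];
  BOOL used_default = FALSE;

  if (cp == 0) {
    return -1; /* GetConsoleCP failed, e.g. no console attached */
  }
  if (cp == CP_UTF8) {
    /* UTF-8 covers all of Unicode; WideCharToMultiByte also rejects
       WC_NO_BEST_FIT_CHARS and lpUsedDefaultChar for this code page. */
    return 1;
  }
  if (WideCharToMultiByte(cp, WC_NO_BEST_FIT_CHARS, &wide, 1,
                          buf, (int)sizeof buf, NULL, &used_default) == 0) {
    return -1; /* conversion failed; treat as unknown */
  }
  /* used_default is TRUE when the character had to be replaced. */
  return used_default ? 0 : 1;
#else
  (void)ch;
  return -1;
#endif
}
```

For example, fits_console_cp(0x00B0) asks whether ° is representable, so a program could decide whether plain stdout output has any chance of being displayed correctly.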

A workaround is to call SetConsoleCP(65001); and print UTF-8 to stdout, but this has undesired side effects. Also, it is impossible to detect that the process output is being redirected to the VS Code Debug Console, so the call would always have to be present, which I would rather avoid if possible.
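
For reference, a minimal sketch of that workaround. The helper names use_utf8_console_cp and print_degree_utf8 are made up for illustration, and the #ifdef is only there so the file also compiles off Windows, where the code page switch is a no-op.

```c
#include <stdio.h>
#ifdef _WIN32
#include <Windows.h>
#endif

/* UTF-8 encoding of "°\n", spelled out as bytes so the encoding of the
   source file does not matter. */
const char kDegreeUtf8[] = "\xC2\xB0\n";

/* Hypothetical helper: switch the console input code page to UTF-8
   (CP_UTF8 == 65001). Returns 1 on success, 0 on failure or when not
   built for Windows. */
int use_utf8_console_cp(void) {
#ifdef _WIN32
  return SetConsoleCP(CP_UTF8) ? 1 : 0;
#else
  return 0;
#endif
}

/* Hypothetical helper: write the raw UTF-8 bytes to stdout and flush.
   Returns the number of bytes handed to fwrite. */
size_t print_degree_utf8(void) {
  const size_t n = fwrite(kDegreeUtf8, 1, sizeof kDegreeUtf8 - 1, stdout);
  fflush(stdout);
  return n;
}
```

A program would call use_utf8_console_cp() once at startup and print_degree_utf8() afterwards. The sketch does not restore the previous code page, so the side effects mentioned above still apply.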

In my opinion, it would be great if it were possible to tell the "piping process" (whichever that is, e.g. the closed-source VS C++ debugger?) which stdout encoding to expect.

Note that switching the client-side calls (WriteFile, printf, or wprintf) has no effect: It depends entirely on what bytes are written to stdout.

Debugger Configurations

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Debug main.exe",
            "type": "cppvsdbg",
            "request": "launch",
            "program": "main.exe",
            "args": [],
            "stopAtEntry": false,
            "cwd": "${fileDirname}",
            "environment": [],
            "console": "internalConsole",
            "internalConsoleOptions": "openOnSessionStart"
        }
    ]
}

Debugger Logs

Not applicable

Other Extensions

No response

Additional Information

Expected output in the Debug Console (reproducible in code page 850 with 0xF8 0x0A):

You may only use the C/C++ Extension for Visual Studio Code
with Visual Studio Code, Visual Studio or Visual Studio for Mac
software to help you develop and test your applications.
-------------------------------------------------------------------
°

With puts("°"); (i.e. the UTF-8 bytes 0xC2 0xB0):

You may only use the C/C++ Extension for Visual Studio Code
with Visual Studio Code, Visual Studio or Visual Studio for Mac
software to help you develop and test your applications.
-------------------------------------------------------------------
┬░
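
The garbled output is what you get when the two UTF-8 bytes are decoded one by one as code page 850. The sketch below illustrates this with a deliberately partial mapping table (ASCII plus the three bytes mentioned in this report). The names cp850_to_unicode and dump_as_cp850 are made up for illustration, and this is not how the debugger itself is implemented.

```c
#include <stdio.h>

/* Partial CP850 -> Unicode mapping: ASCII plus the three bytes discussed
   in this report. Any other byte yields U+FFFD (illustration only). */
unsigned int cp850_to_unicode(unsigned char byte) {
  if (byte < 0x80) {
    return byte;
  }
  switch (byte) {
    case 0xB0: return 0x2591; /* LIGHT SHADE */
    case 0xC2: return 0x252C; /* BOX DRAWINGS LIGHT DOWN AND HORIZONTAL */
    case 0xF8: return 0x00B0; /* DEGREE SIGN */
    default:   return 0xFFFD; /* not covered by this partial table */
  }
}

/* Print each byte next to the code point a CP850 decoder would produce. */
void dump_as_cp850(const char *label, const unsigned char *bytes, size_t len) {
  printf("%s:", label);
  for (size_t i = 0; i < len; ++i) {
    printf(" %02X->U+%04X", (unsigned int)bytes[i], cp850_to_unicode(bytes[i]));
  }
  printf("\n");
}
```

Calling dump_as_cp850("utf8", (const unsigned char *)"\xC2\xB0", 2) prints "utf8: C2->U+252C B0->U+2591", i.e. the two box-drawing characters shown above, whereas the single byte 0xF8 maps to U+00B0, the degree sign.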

The biggest issue is that with "Run Without Debugging", neither WriteConsole nor OutputDebugString works, so getting Unicode characters outside the current console input code page into the Debug Console is impossible without the SetConsoleCP hack.

Longhanks commented 1 year ago

I assume the C# extension might have had a similar issue: https://github.com/OmniSharp/omnisharp-vscode/issues/4398

But I was unable to find any commits concerning their fix, so I assume it was fixed within the closed-source components.

Longhanks commented 7 months ago

It has been a year. Have you had any chance to reproduce the issue?