dotnet / msbuild

The Microsoft Build Engine (MSBuild) is the build platform for .NET and Visual Studio.
https://docs.microsoft.com/visualstudio/msbuild/msbuild
MIT License

[msbuild][c++] Mojibakes in log if link.exe fails with error #5063

Open leha-bot opened 4 years ago

leha-bot commented 4 years ago

If I compile any C++ project that has linker errors, the LINK output looks like this: (screenshot)

It seems that MSBuild uses the UTF-16 charset for output, whereas LINK.EXE from C++ uses the OEM charset. It would be very cool if msbuild could at least call the MultiByteToWideChar() API on LINK.EXE output to avoid mojibake, as it is sometimes impossible to decode (for example, if you called msbuild from Python tools like Conan Package Manager).
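For tools written in Python that wrap a build command, one way to avoid a second layer of mis-decoding is to capture the child's raw bytes and decode them with an explicitly chosen encoding. The following is only a minimal sketch: `run_and_decode` is a hypothetical helper, and `cp866` is an assumed Russian OEM code page used for illustration, not something confirmed for this machine.

```python
import subprocess
import sys


def run_and_decode(cmd, encoding):
    """Run cmd, capture its raw output bytes, and decode them explicitly.

    `encoding` is whatever code page the child is believed to emit;
    undecodable bytes are replaced instead of raising.
    """
    result = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, check=False
    )
    return result.stdout.decode(encoding, errors="replace")


# Stand-in for a real tool: a child process that writes six bytes which
# spell a Russian word when read as cp866 (assumed code page).
FAKE_TOOL = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(b'\\xae\\xe8\\xa8\\xa1\\xaa\\xa0')",
]

if __name__ == "__main__":
    # ascii() keeps the demo printable on any console encoding.
    print(ascii(run_and_decode(FAKE_TOOL, "cp866")))
```

Decoding the bytes in the wrapper, rather than relying on the interpreter's default, keeps the choice of code page visible and easy to change.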

Steps to reproduce

Clone the https://github.com/leha-bot/kind-of-magick example repo and follow the "vcpkg-way" build instructions in README.md as vcpkg is easier to deploy on Windows hosts (and Conan adds yet another bug layer with rendering msbuild utf-16 text as utf-8 😂 🙈 ).

Expected behavior

The build log will contain proper text w/o mojibakes.

Actual behavior

See screenshot above.

Environment data

msbuild /version output:

Microsoft (R) Build Engine версии 16.5.0-preview-19562-03+d72e25031 для .NET Framework
(C) Корпорация Майкрософт (Microsoft Corporation). Все права защищены.

16.5.0.56203

Microsoft Visual Studio 2019 with Russian language pack.

OS info: Windows 8.1 (64-bit)

If applicable, version of the tool that invokes MSBuild (Visual Studio, dotnet CLI, etc):

cmake --version
cmake version 3.16.2

CMake suite maintained and supported by Kitware (kitware.com/cmake).

Thank you for your work and MSBuild itself!

rainersigwald commented 4 years ago

This is related to #4870 and #4904, but I think it requires that the Link task specify its output codepage (that's possible today via ToolTask.StandardOutputEncoding). The default for that is the OEM codepage, though:

https://github.com/microsoft/msbuild/blob/8aa0b87c00c6f26a565cf5e10975769dad9f378b/src/Utilities/ToolTask.cs#L216-L224

So I'm a bit surprised that it looks like Link is emitting OEM text that MSBuild is interpreting as UCS-2.

@mrtrillian, can you take a look at this and let me know what you think?

tristanlabelle commented 4 years ago

@leha-bot is there something that makes you think that msbuild is specifically decoding as UTF16/UCS2? If that was the case, I would expect the resulting string to be shorter than expected and to have characters from all kinds of alphabets.
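The expectation above can be illustrated with a small sketch: single-byte OEM text read as UTF-16 pairs up bytes, so the result has about half as many characters, and they land in unrelated Unicode blocks. `decode_oem_as_utf16` is a hypothetical helper, and `cp866` in the usage is an assumed OEM code page chosen for illustration.

```python
def decode_oem_as_utf16(text, oem_codepage):
    """Encode text in a single-byte OEM code page, then misread it as UTF-16-LE."""
    raw = text.encode(oem_codepage)
    if len(raw) % 2:
        raw += b"\x00"  # pad so the bytes split evenly into 16-bit units
    return raw.decode("utf-16-le", errors="replace")


if __name__ == "__main__":
    sample = "ошибка"  # 6 characters, 6 bytes in cp866
    garbled = decode_oem_as_utf16(sample, "cp866")
    # ascii() keeps the output printable regardless of console encoding.
    print(len(sample), len(garbled), ascii(garbled))
```

A log whose garbled text is as long as the expected message and stays within one alphabet therefore points away from a UTF-16 misread.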

It seems more likely to me that link.exe is producing UTF-8 or (less likely) some other code page, and MSBuild is decoding it as OEM, which is set to Cyrillic code page 855. Indeed, very recently I was in touch with the link.exe team because they were always producing OEM-encoded strings, and they updated it to follow the current console output code page.

@leha-bot Could you check what code page chcp outputs in your console? And could you provide the version of link.exe? Do you see the same issue when calling link.exe for test-magick.obj manually? Finally, if you can provide the copyable mojibake, we should be able to figure out whether this is misinterpreted UTF-8.
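Once copyable mojibake is available, the "misinterpreted UTF-8" hypothesis can be checked by reversing the suspected wrong step: re-encode the garbled text with the assumed OEM code page and try to decode the bytes as UTF-8. This is a minimal sketch; `try_repair_utf8` is a hypothetical helper, and the code page passed in should be whatever `chcp` reports (`cp866` below is only an assumption for the demo).

```python
def try_repair_utf8(mojibake, assumed_codepage):
    """Undo a suspected 'UTF-8 bytes decoded as OEM' mistake.

    Returns the repaired string, or None if the bytes are not valid UTF-8
    (or the text cannot be represented in the assumed code page).
    """
    try:
        return mojibake.encode(assumed_codepage).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return None


if __name__ == "__main__":
    original = "ошибка"
    # Simulate the suspected bug: UTF-8 bytes read as cp866.
    garbled = original.encode("utf-8").decode("cp866")
    print(try_repair_utf8(garbled, "cp866") == original)
```

If the round trip yields readable text, the hypothesis holds for that code page; a `None` result means either the code page guess or the UTF-8 guess is wrong. Pure ASCII input round-trips unchanged, so the check is only informative for the non-ASCII parts of the log.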

yuehuang010 commented 4 years ago

Is Link invoked via the VC Link task? If so, then it is using a pipe channel to communicate between link and the task. This pipe should support full UTF-16.