conan-io / conan

Conan - The open-source C and C++ package manager
https://conan.io
MIT License

[bug] UTF8 error on Windows #16921

Closed Todiq closed 1 week ago

Todiq commented 1 week ago

Describe the bug

version: 2.7.0
conan_path: C:\venv\Scripts\conan
python version: 3.9.13
sys_version: 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)]
sys_executable: C:\venv\Scripts\python.exe
is_frozen: False
architecture: AMD64
system version: 10.0.20348
platform: Windows-10-10.0.20348-SP0
system: Windows
release: 10
cpu: Intel64 Family 6 Model 186 Stepping 3, GenuineIntel

How to reproduce it

Hello,

My project needs to have a .gitattributes file in order for the C++ code to compile (because of the encoding). I naively put the following lines in it:

* text=auto eol=lf
* encoding=cp1252

However, running conan list --graph=graph.json --graph-binaries="*" --format=json > installed.json returns:

ERROR: Traceback (most recent call last):
  File "C:\venv\lib\site-packages\conan\cli\cli.py", line 294, in main
    cli.run(args)
  File "C:\venv\lib\site-packages\conan\cli\cli.py", line 193, in run
    command.run(self._conan_api, args[0][1:])
  File "C:\venv\lib\site-packages\conan\cli\command.py", line 171, in run
    info = self._method(conan_api, parser, *args)
  File "C:\venv\lib\site-packages\conan\cli\commands\list.py", line 251, in list
    pkglist = MultiPackagesList.load_graph(graphfile, args.graph_recipes, args.graph_binaries)
  File "C:\venv\lib\site-packages\conan\api\model.py", line 102, in load_graph
    graph = json.loads(load(graphfile))
  File "C:\venv\lib\site-packages\conans\util\files.py", line 144, in load
    tmp = handle.read()
  File "C:\Python\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
ERROR: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

I tried editing the .gitattributes to have the following instead:

* text=auto eol=lf
*.cpp encoding=cp1252
*.h encoding=cp1252

but I still get the same error. Do you have any ideas? Thanks in advance.

memsharded commented 1 week ago

Hi @Todiq

Thanks for your report.

The issue seems to happen while loading the graph.json file, which apparently contains some non-UTF-8 bytes. This is unexpected: the files that Conan generates and loads are designed to be UTF-8. So the root cause is probably that the file was generated without UTF-8 in the first place.

Can you please try to generate that graph.json file with and without the special encoding configuration, and compare them?
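
One way to compare the two files is to look at their first bytes. The sketch below is not part of Conan; sniff_bom and describe_file are hypothetical helper names used only for illustration. It assumes the difference shows up as a byte-order mark (BOM) at the start of the file. A byte 0xff at position 0, as in the traceback above, would be consistent with a UTF-16 LE BOM (ff fe).

```python
import codecs

# Known byte-order marks. The UTF-32 entries come first because the
# UTF-32 LE BOM (ff fe 00 00) starts with the UTF-16 LE BOM (ff fe).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def describe_file(path):
    """Return the first four bytes of `path` as hex, plus the sniffed encoding."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    return head.hex(" "), sniff_bom(head)
```

Calling describe_file("graph.json") on both variants of the file shows whether one of them starts with a BOM. A result of None only means that no BOM was found; it does not prove that the file is valid UTF-8.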

Todiq commented 1 week ago

It looks like the issue is coming from the version of powershell: https://stackoverflow.com/questions/40098771/changing-powershells-default-output-encoding-to-utf-8

Here is the result of the [System.Text.Encoding]::Default command on pwsh.exe (powershell 7+):

Preamble          :
BodyName          : utf-8
EncodingName      : Unicode (UTF-8)
HeaderName        : utf-8
WebName           : utf-8
WindowsCodePage   : 1200
IsBrowserDisplay  : True
IsBrowserSave     : True
IsMailNewsDisplay : True
IsMailNewsSave    : True
IsSingleByte      : False
EncoderFallback   : System.Text.EncoderReplacementFallback
DecoderFallback   : System.Text.DecoderReplacementFallback
IsReadOnly        : True
CodePage          : 65001

And on powershell.exe (powershell 5):

IsSingleByte      : True
BodyName          : iso-8859-1
EncodingName      : Western Europe (Windows)
HeaderName        : Windows-1252
WebName           : Windows-1252
WindowsCodePage   : 1252
IsBrowserDisplay  : True
IsBrowserSave     : True
IsMailNewsDisplay : True
IsMailNewsSave    : True
EncoderFallback   : System.Text.InternalEncoderBestFitFallback
DecoderFallback   : System.Text.InternalDecoderBestFitFallback
IsReadOnly        : True
CodePage          : 1252

This means that conan's output is re-encoded by the shell running it. My command is the following: conan export-pkg . --profile:all msvc --settings:all pkg*/*:build_type=Release --no-remote --format=json > graph.json

Since I am running the python:3.9-windowsservercore-ltsc2022 docker image, I am using powershell 5.1.

I want to have as many cross-platform commands as possible between linux and windows jobs. So, instead of replacing the > with Out-File -Encoding utf8, I will try to install and run pwsh inside the image. That should fix the issue for now.
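
For anyone who cannot switch shells, another possible workaround is to re-encode the redirected file before Conan reads it. The following is only a sketch under assumptions: reencode_to_utf8 is a hypothetical helper, not a Conan API, and it assumes the file starts with a BOM. That matches the 0xff at position 0 in the traceback, since the > redirection in Windows PowerShell 5.1 writes UTF-16 LE with a BOM by default. UTF-32 files are not handled.

```python
import codecs


def reencode_to_utf8(path):
    """Rewrite `path` as BOM-less UTF-8 if it starts with a UTF-16 or UTF-8 BOM.

    Returns True when the file was rewritten, False when it was left alone.
    """
    with open(path, "rb") as handle:
        raw = handle.read()

    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The generic "utf-16" codec reads the BOM to pick the byte order
        # and does not include it in the decoded text.
        text = raw.decode("utf-16")
    elif raw.startswith(codecs.BOM_UTF8):
        text = raw.decode("utf-8-sig")
    else:
        return False

    # Binary mode keeps the line endings exactly as the shell wrote them.
    with open(path, "wb") as handle:
        handle.write(text.encode("utf-8"))
    return True
```

Running reencode_to_utf8("graph.json") between the export-pkg step and the conan list step would keep the > redirection unchanged on both platforms, at the cost of one extra Python call in the Windows job.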

memsharded commented 1 week ago

> I want to have as many cross-platform commands as possible between linux and windows jobs. So, instead of replacing the > with Out-File -Encoding utf8, I will try to install and run pwsh inside the image. That should fix the issue for now.

Sounds good. It seems that powershell moving from 1252 to utf-8 basically aligns it with most of the ecosystem. So hopefully this solves the issue, please keep us posted.

Todiq commented 1 week ago

Do you think that adding a clean error instead of a traceback would be a good solution for people that may encounter the same issue?

Also, I edited my path by putting pwsh.exe first and renamed it to powershell.exe in order to be sure that commands would use that one but that led to:

[vcvarsall.bat] Environment initialized for: 'x64'
The argument '&'C:\workspace\core\build\windows-msvc-194-x86_64\generators\conanbuild.ps1'' is not recognized as the name of a script file. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.

Usage: pwsh[.exe] [-Login] [[-File] <filePath> [args]]
                  [-Command { - | <script-block> [-args <arg-array>]
                                | <string> [<CommandParameters>] } ]
                  [-CommandWithArgs <string> [<CommandParameters>]
                  [-ConfigurationName <string>] [-ConfigurationFile <filePath>]
                  [-CustomPipeName <string>] [-EncodedCommand <Base64EncodedCommand>]
                  [-ExecutionPolicy <ExecutionPolicy>] [-InputFormat {Text | XML}]
                  [-Interactive] [-MTA] [-NoExit] [-NoLogo] [-NonInteractive] [-NoProfile]
                  [-NoProfileLoadTime] [-OutputFormat {Text | XML}]
                  [-SettingsFile <filePath>] [-SSHServerMode] [-STA]
                  [-Version] [-WindowStyle <style>]
                  [-WorkingDirectory <directoryPath>]

       pwsh[.exe] -h | -Help | -? | /?

PowerShell Online Help https://aka.ms/powershell-docs

All parameters are case-insensitive.

ERROR: conanfile.py (castcore/0.1): Error in build() method, line 130
        cmake.configure()
        ConanException: Error 64 while executing

My entrypoint being "powershell.exe", "-NoLogo", "-ExecutionPolicy", "Bypass", I don't really get why conan would error out on that part. Any ideas please?

memsharded commented 1 week ago

> Do you think that adding a clean error instead of a traceback would be a good solution for people that may encounter the same issue?

Sure, I think something similar was added in the last version for loading package lists, so a similar error check could be added for loading graph.json files.
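
A minimal sketch of what such a check could look like follows. This is not the code from the Conan repository: GraphFileError and load_graph_json are hypothetical names, and the wording of the messages is only an example.

```python
import json


class GraphFileError(Exception):
    """Hypothetical error type used for this sketch only."""


def load_graph_json(path):
    """Load a graph JSON file, turning decoding problems into a short message."""
    with open(path, "rb") as handle:
        raw = handle.read()
    try:
        # Decode explicitly so the error offset refers to the whole file.
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise GraphFileError(
            f"Graph file '{path}' is not valid UTF-8 "
            f"({exc.reason} at byte {exc.start}). "
            "If it was written with shell redirection, check the shell's "
            "output encoding."
        ) from exc
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise GraphFileError(f"Graph file '{path}' is not valid JSON: {exc}") from exc
```

Chaining with "from exc" keeps the original exception available for verbose output while the default message stays on one line.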

> The argument '&'C:\workspace\core\build\windows-msvc-194-x86_64\generators\conanbuild.ps1'' is not recognized as the name of a script file. Check the spelling of the name, or if a path was included, verify that the path is correct and try again.

This looks strange.

That is the launcher for the conanbuild.ps1, can you please double check if that file does exist in that folder?

Todiq commented 1 week ago

The file exists, is in UTF-8 and contains the following:

& "$PSScriptRoot/conanvcvars.ps1"
& "$PSScriptRoot/conanbuildenv-release-x86_64.ps1"

memsharded commented 1 week ago

With the latest Conan version 2.7 and using -vvv, you should be able to get the full command line, including the wrapping with environment scripts. Maybe that full line has some hints. You can even try to copy and paste it in the build folder that is listed in the output, which might make it easier to debug and understand what is happening.

Todiq commented 1 week ago

I think that renaming pwsh.exe to powershell.exe was the culprit.

I explicitly set pwsh back as the entrypoint in the Dockerfile, and I am running commands in CI with pwsh -Command "...". It seems to fix all the issues.

memsharded commented 1 week ago

> I think that renaming pwsh.exe to powershell.exe was the culprit.

That could make sense, sometimes the system or the app itself can do different things based on the executable name.

> I explicitly set pwsh back as the entrypoint in the Dockerfile, and I am running commands in CI with pwsh -Command "...". It seems to fix all the issues.

Then it seems we could close the ticket? If I understood correctly this is then something external to Conan related to the powershell encoding. Thanks for the feedback.

Todiq commented 1 week ago

Unless you want to clean up the traceback to something clearer, sure, you can close it. Many thanks

memsharded commented 1 week ago

You are right, let's improve that error message, targeting that for 2.8.

memsharded commented 1 week ago

Closed by https://github.com/conan-io/conan/pull/16936, which improves the message, for the next Conan 2.8. Thanks again for the feedback.