mamba-org / mamba

The Fast Cross-Platform Package Manager
https://mamba.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Sub-process output parsing assumes UTF-8 but is not always UTF-8 #3591

Open Klaim opened 1 week ago

Klaim commented 1 week ago

Troubleshooting docs

Anaconda default channels

How did you install Mamba?

Micromamba

Search tried in issue tracker

yes

Latest version of Mamba

Tried in Conda?

I have this problem with Conda as well, without using Mamba

Describe your issue

While working on https://github.com/mamba-org/mamba/pull/3584 we realized that when micromamba (or mamba) calls python and then parses its output, the code assumes that the output is UTF-8. However, python is designed to write its output using the current system/console encoding. When that encoding is not UTF-8, one of two things happens: if the data is detected as not being valid UTF-8 we can get errors; otherwise we are silently processing incorrect data. The issue is most visible on Windows, whose default encoding is not UTF-8 (it can be set to UTF-8, which makes the issue disappear), but it can also appear on any other system whose default encoding is not UTF-8.
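To make the two failure modes concrete, here is a minimal Python sketch. It is not mamba's code: `decode_as_utf8` is a hypothetical stand-in for any parser that assumes UTF-8, and cp1252 is used only as an example of a non-UTF-8 system encoding.

```python
def decode_as_utf8(raw: bytes) -> str:
    """Hypothetical parser step that blindly assumes UTF-8 input."""
    return raw.decode("utf-8")


# Case 1: the bytes are not valid UTF-8, so decoding fails loudly.
raw_error = "café".encode("cp1252")  # b'caf\xe9'
try:
    decode_as_utf8(raw_error)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Case 2: the bytes happen to be valid UTF-8, so decoding succeeds
# but yields text that differs from what the sub-process wrote.
intended = "Ã©"
raw_silent = intended.encode("cp1252")  # b'\xc3\xa9'
silently_wrong = decode_as_utf8(raw_silent)
```

The second case is the more dangerous one, because nothing signals that the parsed text no longer matches the original.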

So far the problem has been worked around by setting environment variables in the CI scripts that ask python to output UTF-8 explicitly. This is why our CI didn't detect the issue when new python-calling code was added to mamba/micromamba, while users can still hit it.

https://github.com/mamba-org/mamba/pull/3584 demonstrates that we could always set that variable through the sub-process launching command instead of asking users to set it externally. At that point we know that we are calling python, and we also know which encoding we expect to receive. We need to generalize this solution to the other sub-process launches, both for python and for other programs. When parsed, the output of these sub-processes should always be treated as system-encoded (reproc apparently doesn't change that), and we need to make sure that any output we parse is either understood in that encoding or converted.
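The two approaches described above could be sketched as follows in Python. This is an illustration under assumptions, not the change made in the PR: `run_python_utf8` and `decode_system_output` are hypothetical helper names, and `PYTHONIOENCODING` is assumed here to be the variable in question (it is a standard CPython environment variable, but the issue does not name the one used in CI).

```python
import locale
import os
import subprocess
import sys


def run_python_utf8(code: str) -> str:
    """Launch python with UTF-8 output requested by the launcher itself."""
    env = dict(os.environ)
    # Assumption: PYTHONIOENCODING is the variable to set; it tells the
    # child interpreter which encoding to use for its standard streams.
    env["PYTHONIOENCODING"] = "utf-8"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    # We requested UTF-8 above, so decoding as UTF-8 is now justified.
    return result.stdout.decode("utf-8")


def decode_system_output(raw: bytes, encoding: str = "") -> str:
    """Decode output of an arbitrary sub-process as system-encoded text."""
    # Fall back to the locale's preferred encoding when none is given.
    chosen = encoding or locale.getpreferredencoding(False)
    return raw.decode(chosen)


output = run_python_utf8("print('caf\\u00e9')")
converted = decode_system_output("café".encode("cp1252"), encoding="cp1252")
```

The first helper only works for programs that offer a way to request an output encoding; the second is the fallback for every other sub-process, where the launcher has to decode with the system encoding and convert afterwards.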

Once that is done, we can remove the CI script flags/env variables that hide the problem.

mamba info / micromamba info

micromamba 2.0.3 exposes that faulty behavior

Logs

N/A

environment.yml

N/A

~/.condarc

N/A