pypa / pipx

Install and Run Python Applications in Isolated Environments
https://pipx.pypa.io
MIT License

Reason behind pipx run forcing the subprocess' encoding to utf-8? #1423

Open J3ronimo opened 4 months ago

J3ronimo commented 4 months ago

Hi. Today I stumbled upon a problem with pipx run: the Python tool I was running prints German umlauts like "ü", and they didn't show up correctly in the terminal, although they do when I run the raw Python script without pipx wrapped around it.

Turns out that the reason for this lies in pipx.util, where _fix_subprocess_env sets

env["PYTHONIOENCODING"] = "utf-8"
env["PYTHONLEGACYWINDOWSSTDIO"] = "utf-8"

in the environment that the subprocess inherits, and then exec_app calls

subprocess.run( ..., encoding="utf-8")

to match that.
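The effect of the environment variable can be checked in isolation. The following is a minimal sketch, not pipx's actual code: the helper name child_stdout_bytes is made up for illustration, and it only sets PYTHONIOENCODING before starting a child interpreter and looking at the raw bytes the child writes for "ü".

```python
import os
import subprocess
import sys


def child_stdout_bytes(io_encoding: str) -> bytes:
    """Run a child interpreter that prints "ü" and return its raw stdout bytes.

    Hypothetical helper for illustration only; it is not part of pipx.
    """
    env = dict(os.environ)
    # Same variable that the issue says pipx sets for its subprocesses.
    env["PYTHONIOENCODING"] = io_encoding
    result = subprocess.run(
        # The escape keeps the command line ASCII-only; the child prints "ü".
        [sys.executable, "-c", "print('\\u00fc')"],
        env=env,
        capture_output=True,
        check=True,
    )
    # Strip the trailing newline (\n or \r\n) so only the character remains.
    return result.stdout.strip()


print(child_stdout_bytes("utf-8"))  # b'\xc3\xbc'
print(child_stdout_bytes("cp850"))  # b'\x81'
```

With the variable forced to utf-8, the child emits two bytes for "ü" regardless of the encoding the receiving terminal expects.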

The problem is that my German Windows terminal (cmd.exe) uses CP850 rather than UTF-8, so anything Python emits as UTF-8 looks like gibberish in it.

I'd like to know if there was a specific reason behind forcing the encoding here, or if anything speaks against simply leaving these settings out so that Python can detect and use the encoding of the terminal, which works nicely in my case.

Thanks and cheers.

chrysle commented 4 months ago

Hmm, yes: see https://github.com/pypa/pipx/pull/335#discussion_r366156303 and the surrounding context.

J3ronimo commented 4 months ago

Thanks for the link, @chrysle. Unfortunately to me the commit message doesn't make clear why this was added, and it doesn't seem related to the issue that it fixes. To make the problem more concrete:

pipx run cowsay -t "hello äöü"

prints

  _________
| hello ├ñ├Â├╝ |
  =========
         \
          \
            ^__^
            (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

on my machine (Win10, cmd.exe in Windows Terminal, chcp says 850, pipx 1.5.0, Python 3.11.5), whereas just

cowsay -t "hello äöü"

in the same terminal prints everything correctly. Removing the lines quoted above that force the subprocess encoding fixes this.
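The garbled characters in the cowsay output are consistent with UTF-8 bytes being displayed under code page 850. A short sketch that reproduces the same string, using only Python's built-in codecs (the variable names are illustrative):

```python
text = "hello äöü"

# What a process forced to UTF-8 writes to its stdout.
utf8_bytes = text.encode("utf-8")

# What a terminal using code page 850 shows for those bytes.
as_seen_in_cp850 = utf8_bytes.decode("cp850")
print(as_seen_in_cp850)  # hello ├ñ├Â├╝

# Encoding with the terminal's own code page round-trips cleanly.
round_trip = text.encode("cp850").decode("cp850")
print(round_trip)  # hello äöü
```

Each umlaut becomes two bytes in UTF-8, and CP850 renders each of those bytes as a separate character, which is why three umlauts turn into six symbols.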

chrysle commented 4 months ago

> Unfortunately to me the commit message doesn't make clear why this was added, and it doesn't seem related to the issue that it fixes.

As stated in https://github.com/pypa/pipx/pull/335#discussion_r366164868, this was added to prevent edge cases that might otherwise occur. Normally you're on the safe side with UTF-8, because it is so widespread. But I agree that the behaviour you're seeing is unpleasant. We should probably make pipx's output encoding configurable through an environment variable prefixed with PIPX_, to avoid unintended behaviour caused by a user-specified PYTHONIOENCODING.
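One possible shape for such an override is sketched below. Everything in it is an assumption for illustration: the variable name PIPX_OUTPUT_ENCODING and the function build_subprocess_env are hypothetical and do not exist in pipx, and keeping utf-8 as the default is only one of several options.

```python
import codecs
from typing import Dict, Mapping

# Hypothetical variable name, invented for this sketch; pipx does not define it.
PIPX_ENCODING_VAR = "PIPX_OUTPUT_ENCODING"


def build_subprocess_env(base_env: Mapping[str, str]) -> Dict[str, str]:
    """Return a copy of base_env with PYTHONIOENCODING set for the subprocess.

    Hypothetical helper: uses the PIPX_-prefixed override if present,
    otherwise keeps the current utf-8 default.
    """
    env = dict(base_env)
    encoding = env.get(PIPX_ENCODING_VAR, "utf-8")
    # Fail early with LookupError if the requested codec is unknown.
    codecs.lookup(encoding)
    env["PYTHONIOENCODING"] = encoding
    return env


print(build_subprocess_env({})["PYTHONIOENCODING"])
print(build_subprocess_env({PIPX_ENCODING_VAR: "cp850"})["PYTHONIOENCODING"])
```

Under these assumptions, a user on a CP850 terminal could set the override once, while everyone else keeps the existing behaviour.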