nat-n / poethepoet

A task runner that works well with poetry.
https://poethepoet.natn.io/
MIT License
1.4k stars 58 forks source link

Poe 0.16.5 doesn't work on some non-utf-8 Windows environment #111

Closed privet-kitty closed 1 year ago

privet-kitty commented 1 year ago

Hi. Thank you for maintaining this awesome task-runner. I ran into a problem on Windows with Japanese language environment. The minimal reproducible example would be as follows:

  1. Put the following pyproject.toml under a directory whose path contains multi-byte characters (e.g. C:\Users\username\テスト).

    [tool.poetry]
    name = "test-package"
    version = "0.1.0"
    description = ""
    authors = []
    
    [tool.poetry.dependencies]
    python = "^3.10"
    
    [tool.poetry.group.dev.dependencies]
    poethepoet = "^0.16.5"
    
    [build-system]
    requires = ["poetry-core"]
    build-backend = "poetry.core.masonry.api"
    
    [tool.poe.tasks]
    python = "python --version"
  2. Set up the environment inside the directory bypoetry config virtualenvs.in-project true --local and poetry install.
  3. Run poe python and see it fail.
    • I guess it succeeds on a pure utf-8 environment. To forcibly reproduce the error, setting PYTHONIOENCODING="cp932" in advance would be a way, for example.

The error would be as follows.

Poe => python --version
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\user\テスト\.venv\Scripts\poe.exe\__main__.py", line 7, in <module>
  File "C:\Users\user\テスト\.venv\Lib\site-packages\poethepoet\__init__.py", line 33, in main
    result = app(cli_args=sys.argv[1:])
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\テスト\.venv\Lib\site-packages\poethepoet\app.py", line 67, in __call__
    return self.run_task() or 0
           ^^^^^^^^^^^^^^^
  File "C:\Users\user\テスト\.venv\Lib\site-packages\poethepoet\app.py", line 98, in run_task
    return self.task.run(context=context, extra_args=self.ui["task"][1:])
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\テスト\.venv\Lib\site-packages\poethepoet\task\base.py", line 239, in run
    return self._handle_run(
           ^^^^^^^^^^^^^^^^^
  File "C:\Users\user\テスト\.venv\Lib\site-packages\poethepoet\task\cmd.py", line 44, in _handle_run
    return context.get_executor(self.invocation, env, self.options).execute(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\テスト\.venv\Lib\site-packages\poethepoet\executor\poetry.py", line 27, in execute
    poetry_env = self._get_poetry_virtualenv()
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\user\テスト\.venv\Lib\site-packages\poethepoet\executor\poetry.py", line 76, in _get_poetry_virtualenv
    .decode()
     ^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x83 in position 15: invalid start byte

The cause is, bytes.decode() assumes utf-8 when no encoding is given, but what poetry env info -p outputs is not necessarily in utf-8. In my environment, the correct encoding is cp932 when the python output is piped. It appears to be a traditional problem on sys.stdout.encoding, which python "wisely" decides depending on the execution environment and the output stream.[^1]

[^1]: For your information, the related stackoverflow posts would be Why do the sys.stdout.encoding differ when output is piped (in Python2.x)? or Setting the correct encoding when piping stdout in Python

I confirmed that the example above failed on 0.16.5 but not 0.16.4. I guess it's just because the recent changes (https://github.com/nat-n/poethepoet/commit/15edd33ae1a2900eb1c1d3b94946c3509fe1f0c5) have made it so that _get_poetry_virtualenv is always called. It should be an inapparent problem in older versions too.

As for the solution, it would be a way, setting PYTHONIOENCODING to utf-8 before invoking poetry env info -p (https://github.com/nat-n/poethepoet/commit/67707aad5425b7262de27f1388be35f082bed344). I'm not confident about how decent this workaround is, but anyway that resolved my issue.

nat-n commented 1 year ago

Hi @privet-kitty,

Thanks for reporting this issue and suggesting a solution. I plan to include your fix in an upcoming release.

I guess that environment variable should also be set/respected when running tasks that capture output in the executor like this. Would you foresee any downside with that?

nat-n commented 1 year ago

@privet-kitty This feature is now available as part of v0.17.0.

privet-kitty commented 1 year ago

@nat-n Thank you for merging!

I guess that environment variable should also be set/respected when running tasks that capture output in the executor like this. Would you foresee any downside with that?

Actually I don't quite understand the necessity that you care PYTHONIOENCODING either in save_task_output or exec_via_subproc, because users can easily control the I/O of their tasks, and more importantly they know that they can without being taught (unlike poetry env info -p problem) . However, I find no obvious downside and I'm not against that.

nat-n commented 1 year ago

@privet-kitty yes, users can control it (still), though my intention is that this way they're less likely to have to think about it. Thanks for the feedback. :)