Textualize / rich-cli

Rich-cli is a command line toolbox for fancy output in the terminal
https://www.textualize.io
MIT License

Multiple emoji encoding issues on Windows #67

Open tmr232 opened 1 year ago

tmr232 commented 1 year ago

When trying to use rich to print files with emoji on Windows, there are some encoding issues.

Below are 2 cases I encountered.

Garbled text instead of emoji

When running rich broken-emoji.md (where broken-emoji.md is a text file containing nothing but the 😊 emoji) on Windows (in Windows Terminal), I get the following:

😊

If I run Get-Content broken-emoji.md or run rich inside WSL, I get the emoji printed as expected.
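A minimal sketch of how this kind of garbling can arise, assuming the file is stored as UTF-8 and is decoded with the cp1252 code page (the thread does not say which code page is active, so cp1252 is only an illustrative guess):

```python
# UTF-8 bytes for the emoji, as they would be stored in broken-emoji.md.
raw = "\U0001F60A".encode("utf-8")  # U+1F60A is the smiling-face emoji

# Assumption: the reader decodes those bytes as cp1252 instead of UTF-8.
# Each of the four bytes happens to map to a printable cp1252 character,
# so decoding succeeds but yields four unrelated characters.
garbled = raw.decode("cp1252")

print(raw.hex(" "))  # f0 9f 98 8a
print(garbled)       # ðŸ˜Š
```

Decoding the same bytes with `raw.decode("utf-8")` returns the original emoji, which matches the behaviour reported for WSL.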

Rich fails to print entirely

When running rich cannot-print.md (where cannot-print.md contains only the 🤝 emoji) on Windows, I get:

unable to read .\cannot-print.md: 'charmap' codec can't decode byte 0x9d in position 3: character maps to <undefined>

Running it in WSL or using Get-Content cannot-print.md in the same terminal window gives me the emoji as expected.
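The error can be reproduced outside rich with a short sketch, again assuming the file is UTF-8 and the decoder is cp1252 (an assumption; the thread only shows the 'charmap' message). The fourth byte of this emoji, 0x9d, has no character assigned in cp1252, so decoding fails instead of producing garbled text:

```python
# UTF-8 bytes for the handshake emoji (U+1F91D): f0 9f a4 9d
raw = "\U0001F91D".encode("utf-8")

try:
    # Assumption: cp1252 stands in for the Windows code page in use.
    raw.decode("cp1252")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# The exception records which byte could not be mapped and where.
print(error)
print(error.start, hex(raw[error.start]))  # 3 0x9d
```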

Expected Results

As this works in the same terminal both with Powershell's Get-Content, and when using WSL to run rich-cli, I'd expect it to work in Windows as well.

Environment

OS: Windows 10 (build 19044.1889)
Terminal: Windows Terminal (version 1.14.2281.0) running PowerShell
Rich CLI: 1.8.0
Python: 3.10.1

tmr232 commented 1 year ago

Update: This seems to be caused by the current codepage not being 65001 (UTF-8). Setting $env:PYTHONUTF8=1 solves this.

I'm leaving this issue open for 2 reasons:

  1. Maybe there's a way around it, to make it work by default
  2. I assume more people encounter the same issue, and having it documented / printing a suggestion when the error occurs could be useful.

Pebaz commented 1 year ago
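As a rough illustration of both points, here is a sketch that reports the text defaults Python is running with and reads a file as UTF-8 regardless of them. The helper names `describe_text_defaults` and `read_text_utf8` are hypothetical and are not part of rich or rich-cli; this is one possible approach, not how the tool is implemented.

```python
import locale
import sys
import tempfile
from pathlib import Path


def describe_text_defaults() -> dict:
    """Report the settings that decide how open() decodes text by default."""
    return {
        # True when PYTHONUTF8=1 or -X utf8 is in effect.
        "utf8_mode": bool(sys.flags.utf8_mode),
        # Encoding used when open() is called without encoding=...
        "preferred_encoding": locale.getpreferredencoding(False),
    }


def read_text_utf8(path) -> str:
    """Hypothetical helper: always decode as UTF-8, ignoring the code page."""
    return Path(path).read_text(encoding="utf-8")


# Write the emoji as UTF-8 bytes to a temporary file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "cannot-print.md"
    sample.write_bytes("\U0001F91D".encode("utf-8"))
    text = read_text_utf8(sample)

print(describe_text_defaults())
print(text == "\U0001F91D")  # True
```

Passing `encoding="utf-8"` explicitly sidesteps the code page for files that really are UTF-8, but it would misread files saved in another encoding, so a tool might still want a fallback or a hint in its error message.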

Ran into this as well. The $env:PYTHONUTF8=1 trick worked for me, although it would be awesome if it could "just work" ™️. It might not be useful, but it looks like you can also invoke Python with python -X utf8 ..., which has the same effect as what @tmr232 showed above without modifying the environment.
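A small sketch to check the effect of the flag, using only the standard library. It starts a child interpreter with `-X utf8` and asks it whether UTF-8 mode is on; the helper name `utf8_mode_of` is made up for this example.

```python
import subprocess
import sys

# One-liner run in the child interpreter: print its UTF-8 mode flag.
probe = "import sys; print(sys.flags.utf8_mode)"


def utf8_mode_of(extra_args):
    """Run the probe in a fresh interpreter started with extra_args."""
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


with_flag = utf8_mode_of(["-X", "utf8"])
print(with_flag)  # 1
```

Calling `utf8_mode_of([])` shows the default for the current environment, which depends on the platform, the Python version, and whether PYTHONUTF8 is set.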

tmr232 commented 1 year ago

Just verified now - setting the system locale to support UTF8 fixes it as well (as expected), but I'd still prefer rich to "just work" when using Windows Terminal.