tartley / colorama

Simple cross-platform colored terminal text in Python
BSD 3-Clause "New" or "Revised" License

UnicodeEncodeError for "\ucf62" with codec "gbk" #245

Open HelloGwkki opened 4 years ago

HelloGwkki commented 4 years ago

I'm using neofetch-win, and it works fine by itself, but there seems to be a problem when the colorama module is involved. Moreover, this error only happens in Git Bash (mintty); there is no problem in cmd.

Traceback (most recent call last):
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python37-32\Scripts\neofetch-script.py", line 11, in <module>
    load_entry_point('neofetch-win==1.1.0', 'console_scripts', 'neofetch')()
  File "c:\users\administrator\appdata\local\programs\python\python37-32\lib\site-packages\neofetch_win\main.py", line 59, in main
    shell()
  File "c:\users\administrator\appdata\local\programs\python\python37-32\lib\site-packages\neofetch_win\main.py", line 54, in shell
    print(nf.pretty_print(), file=nf.stream)
  File "c:\users\administrator\appdata\local\programs\python\python37-32\lib\site-packages\colorama\ansitowin32.py", line 41, in write
    self.__convertor.write(text)
  File "c:\users\administrator\appdata\local\programs\python\python37-32\lib\site-packages\colorama\ansitowin32.py", line 162, in write
    self.write_and_convert(text)
  File "c:\users\administrator\appdata\local\programs\python\python37-32\lib\site-packages\colorama\ansitowin32.py", line 187, in write_and_convert
    self.write_plain_text(text, cursor, start)
  File "c:\users\administrator\appdata\local\programs\python\python37-32\lib\site-packages\colorama\ansitowin32.py", line 195, in write_plain_text
    self.wrapped.write(text[start:end])
UnicodeEncodeError: 'gbk' codec can't encode character '\ucf62' in position 12: illegal multibyte sequence

I want to be able to use neofetch-win in Git Bash.

wiggin15 commented 4 years ago

This seems to be related to the long-standing issue #36. We are printing a Unicode character that the terminal might be able to display but that the encoding of sys.stdout cannot represent. In this case the character is "\ucf62", which is a Hangul (Korean) syllable, and 'gbk', a Chinese encoding that is apparently set as the encoding of sys.stdout, cannot encode it. I'm not sure why (or even entirely sure whether) the terminal and sys.stdout use different encodings. It looks like Python itself is setting the wrong encoding for sys.stdout (we're not changing it), so writing to it fails.
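As a rough illustration of the failure outside colorama, here is a minimal sketch that uses only the standard library. It wraps an in-memory byte buffer in a text stream with a chosen encoding (a stand-in for the sys.stdout seen in the traceback, not the real console) and reports whether the write succeeds. The helper name try_write is hypothetical.

```python
import io


def try_write(text, encoding, errors="strict"):
    """Write text through a text stream that uses the given encoding.

    Returns the bytes that reached the underlying buffer, or None if the
    encoding could not represent the text (UnicodeEncodeError).
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors)
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError:
        return None
    finally:
        # Detach so the wrapper does not close the buffer when it is collected.
        stream.detach()
    return raw.getvalue()


if __name__ == "__main__":
    # Strict 'gbk' stream: the write fails, as in the traceback.
    print(try_write("\ucf62", "gbk"))
    # Same stream with errors="replace": the character is replaced, not raised.
    print(try_write("\ucf62", "gbk", errors="replace"))
    # A UTF-8 stream can represent the character.
    print(try_write("\ucf62", "utf-8"))
```

If the encoding mismatch is indeed the cause, calling sys.stdout.reconfigure(encoding="utf-8") (available since Python 3.7) or setting the PYTHONIOENCODING environment variable before running the script might avoid the error. Neither has been verified against mintty here.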

HelloGwkki commented 4 years ago

Oh, thank you! However, when I ran neofetch-win in CMD there was no error, and a strange character was printed, which I suspect is the \ucf62 character you mentioned. Although I am using the Chinese version of Windows, this character is not among the common Chinese characters. In PowerShell the character cannot be displayed, but you can see it by selecting it.

HelloGwkki commented 4 years ago

What is \ucf62?

wiggin15 commented 4 years ago

>>> print('\ucf62')
콢

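To see what kind of character this is and which codecs can represent it, a small sketch along these lines may help. It relies only on the standard unicodedata module and str.encode; the helper names describe and can_encode are hypothetical.

```python
import unicodedata


def describe(ch):
    """Return the code point and Unicode name of a single character."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch, "<unnamed>"))


def can_encode(ch, encoding):
    """Return True if the given codec can represent the character."""
    try:
        ch.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    ch = "\ucf62"
    print(describe(ch))
    for encoding in ("gbk", "gb18030", "utf-8"):
        print(encoding, can_encode(ch, encoding))
```

The Unicode name places the character in the Hangul syllables block, which would explain why it is not found among common Chinese characters and why 'gbk' rejects it.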
HelloGwkki commented 4 years ago

>>> print('\ucf62')
콢

Hmm, I can't type this string.