Closed GoogleCodeExporter closed 9 years ago
Thanks for the bug report.
Original comment by tart...@gmail.com
on 23 Feb 2011 at 5:57
There are vague reports that fixing the colorama stdout wrapping to happen
during init, rather than at import time, may have implications for unicode
handling.
Original comment by tart...@gmail.com
on 14 Oct 2013 at 7:54
Hi. I'm finally looking at this, but my unicode knowledge is fairly hazy.
I'm trying to reproduce at the interactive interpreter on Win7 in a cmd.exe
running Python2.7, but not having much luck thus far.
If I try your example above:
>>> import colorama
>>> colorama.init()
>>> print
and then right after the print (before a newline) I paste your example unicode
string above, u"Some non-ASCII text ТЕСТ Русского", then my console
doesn't seem to recognise the non-ascii characters in the last two words - they
appear as question marks. Then when I press enter, the print statement executes
fine, without any exception, but the output also just ends in question marks.
If I try some other non-ASCII characters, then this works fine: e.g.
>>> print u"Some non-ASCII \u00e9coöperate"
Some non-ASCII écoöperate
Can anyone help me put together a test case to reproduce the problem? Maybe
using u'\uXXXX' characters?
Original comment by tart...@gmail.com
on 20 Apr 2014 at 9:17
Hmm... Since migrating to Linux I have no ready-made example code :(
1. You should use u'\uXXXX' escapes in the interactive console, or write a simple
.py file in utf-8 encoding.
2. The test string should contain unicode symbols with code > 255. For example
(the same word "Test" in English and in Russian):
print u'Test:\u0422\u0435\u0441\u0442'
Original comment by av1024@gmail.com
on 21 Apr 2014 at 5:56
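The suggested test string can be checked outside the console with a short sketch. It is written for Python 3 (the u'' prefix keeps the literals valid on Python 2.7 too), and the helper name `try_encode` and the choice of code pages (cp437, cp866) are assumptions made for illustration, not something taken from the report.

```python
# "Test:" followed by the Russian word for "test", as suggested above.
test_string = u'Test:\u0422\u0435\u0441\u0442'

# Characters whose code point is above 255 (the four Cyrillic letters).
non_ascii = [ch for ch in test_string if ord(ch) > 255]


def try_encode(text, encoding):
    """Return the encoded bytes, or None if `encoding` cannot represent `text`.

    Hypothetical helper, used only for this illustration.
    """
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


ascii_result = try_encode(test_string, 'ascii')  # None: ASCII has no Cyrillic
cp437_result = try_encode(test_string, 'cp437')  # None: US OEM code page has no Cyrillic
cp866_result = try_encode(test_string, 'cp866')  # bytes: Russian OEM code page
utf8_result = try_encode(test_string, 'utf-8')   # bytes

# With errors='replace', each unrepresentable character becomes '?',
# similar to the question marks reported in the console.
replaced = test_string.encode('cp437', 'replace')
print(replaced)
```

A string like this fails with the ascii codec regardless of the console code page, which makes it a more reliable reproduction than characters that happen to exist in the local OEM code page.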
For now I have the code below in my colored logger init module. But it has not
been tested for more than 2 years and I don't remember what the issues were.
import codecs
import sys

def setup_console(sys_enc='utf-8', use_colorama=True):
    """
    Set sys.defaultencoding to `sys_enc` and update stdout/stderr writers to corresponding encoding
    .. note:: For Win32 the OEM console encoding will be used instead of `sys_enc`
    """
    global ansi
    # Python 2 only: reload(sys) restores sys.setdefaultencoding(),
    # which site.py removes at startup.
    reload(sys)
    try:
        if sys.platform.startswith("win"):
            import ctypes
            # Use the OEM code page of the Windows console, e.g. "cp866".
            enc = "cp%d" % ctypes.windll.kernel32.GetOEMCP()
        else:
            enc = (sys.stdout.encoding if sys.stdout.isatty() else
                   sys.stderr.encoding if sys.stderr.isatty() else
                   sys.getfilesystemencoding() or sys_enc)
        if sys.getdefaultencoding().lower() != sys_enc.lower():
            sys.setdefaultencoding(sys_enc)
        # Wrap the console streams so unencodable characters become '?'.
        if sys.stdout.isatty() and sys.stdout.encoding != enc:
            sys.stdout = codecs.getwriter(enc)(sys.stdout, 'replace')
        if sys.stderr.isatty() and sys.stderr.encoding != enc:
            sys.stderr = codecs.getwriter(enc)(sys.stderr, 'replace')
        if use_colorama and sys.platform.startswith("win"):
            try:
                from colorama import init
                init()
                ansi = True
            except:
                pass
    except:
        pass
Original comment by av1024@gmail.com
on 21 Apr 2014 at 6:01
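The stream wrapping used in setup_console() can be tried in isolation. The following is a minimal sketch, assuming an in-memory `io.BytesIO` as a stand-in for the console and cp437 as the console encoding; both are illustrative choices, whereas the original code takes the encoding from GetOEMCP() on Windows.

```python
import codecs
import io

# Stand-in for the console byte stream (assumption for illustration).
raw = io.BytesIO()

# Assumed console encoding; setup_console() derives the real one at runtime.
enc = 'cp437'

# Same construction as in setup_console(): a StreamWriter that encodes text
# and substitutes '?' for characters the encoding cannot represent.
writer = codecs.getwriter(enc)(raw, 'replace')
writer.write(u'Test:\u0422\u0435\u0441\u0442 \u00ed')

# The Cyrillic letters are replaced; the accented i exists in cp437 and survives.
written = raw.getvalue()
print(repr(written))
```

The 'replace' error handler trades fidelity for robustness: output never raises UnicodeEncodeError, but characters missing from the console code page are silently shown as question marks.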
@tartley, regarding comment 3
tartley> and then right after the print (before a newline) I paste your example
unicode string above, u"Some non-ASCII text ТЕСТ Русского", then my
console doesn't seem to recognise the non-ascii characters in the last two
words - they appear as question marks.
This is a failure of pasting into the Python console, as demonstrated by:
- copy to clipboard the string "Some non-ASCII text ТЕСТ Русского" including quotes
- in the python console type s = u
- paste the clipboard content after the u
- press Enter to go to the next line
- print repr(s)
I got
Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> s = u"Some non-ASCII text ???? ????????"
>>> print repr(s)
u'Some non-ASCII text ???? ????????'
>>>
Clearly the paste was unsuccessful, and python sort of sanitized the paste.
(by the way, if you paste in a python script and run in a windows cmd console
the OP case is reproducible, see attached test_decode.py)
A better string is a single i with acute accent like 'í'
>>> import colorama
>>> colorama.init()
>>> s = u'í'
>>> print repr(s)
u'\xed'
>>> print s
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python26\lib\site-packages\colorama-0.3.1-py2.6.egg\colorama\ansitowin32.py", line 35, in write
    self.__convertor.write(text)
  File "C:\Python26\lib\site-packages\colorama-0.3.1-py2.6.egg\colorama\ansitowin32.py", line 116, in write
    self.write_and_convert(text)
  File "C:\Python26\lib\site-packages\colorama-0.3.1-py2.6.egg\colorama\ansitowin32.py", line 143, in write_and_convert
    self.write_plain_text(text, cursor, len(text))
  File "C:\Python26\lib\site-packages\colorama-0.3.1-py2.6.egg\colorama\ansitowin32.py", line 148, in write_plain_text
    self.wrapped.write(text[start:end])
UnicodeEncodeError: 'ascii' codec can't encode character u'\xed' in position 0: ordinal not in range(128)
We found a similar colorama traceback in Nikola, the static blog generator (
https://github.com/getnikola/nikola/issues/1288 )
May I ask what text type colorama expects: unicode, bytes, or should both be
acceptable? If bytes, does it assume some specific encoding?
Original comment by ccanepacc@gmail.com
on 18 May 2014 at 1:45
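The last frame of the traceback shows the text being handed unchanged to the wrapped stream, which then fails to encode it as ascii. That situation can be approximated with a rough Python 3 sketch. `PassThroughWrapper`, `make_stream` and `write_text` are hypothetical names invented for this illustration; they are not colorama's classes, and the in-memory stream only stands in for a console whose encoding is ascii.

```python
import io


class PassThroughWrapper(object):
    """Hypothetical stand-in for a stream wrapper: forwards text unchanged."""

    def __init__(self, wrapped):
        self.wrapped = wrapped

    def write(self, text):
        # Encoding is left entirely to the wrapped stream.
        self.wrapped.write(text)


def make_stream(encoding, errors='strict'):
    """Build a text stream over an in-memory buffer, standing in for a console."""
    return io.TextIOWrapper(io.BytesIO(), encoding=encoding, errors=errors)


def write_text(stream, text):
    """Write through the wrapper; return the error message, or None on success."""
    try:
        PassThroughWrapper(stream).write(text)
        stream.flush()
        return None
    except UnicodeEncodeError as exc:
        return str(exc)


# An ascii stream rejects the accented i, as in the traceback.
ascii_error = write_text(make_stream('ascii'), u'\xed')
# A stream with a capable encoding, or a lenient error handler, accepts it.
utf8_error = write_text(make_stream('utf-8'), u'\xed')
replace_error = write_text(make_stream('ascii', errors='replace'), u'\xed')
print(ascii_error)
```

Under these assumptions the wrapper itself is neutral: whether the write succeeds depends only on the encoding and error handler of the stream it wraps.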
Migrated to https://github.com/tartley/colorama/issues/36
closing as duplicate.
Original comment by tart...@gmail.com
on 18 Feb 2015 at 1:51
Original issue reported on code.google.com by
av1024@gmail.com
on 23 Feb 2011 at 4:30