Drekin / win-unicode-console

A Python package to enable Unicode support when running Python from Windows console.
MIT License

win-unicode-console breaks an executed script in Python 3.5 #16

Open e00E opened 8 years ago

e00E commented 8 years ago

I am using Windows 7 64-bit with the 64-bit version of Python 3.5. I installed win-unicode-console as my usercustomize.

Now another Python program called livestreamer (https://github.com/chrippa/livestreamer/) no longer works. Starting livestreamer shows this trace:

  File "\python\python35\lib\site-packages\livestreamer_cli\compat.py", line 19, in <module>
    stdout = sys.stdout.buffer
AttributeError: 'TextTranscodingWrapper' object has no attribute 'buffer'

You can find the code of livestreamer_cli/compat.py here: https://github.com/chrippa/livestreamer/blob/develop/src/livestreamer_cli/compat.py

According to the readme: "Doing so should not break executed scripts in any way. Otherwise, it is a bug of win_unicode_console that should be fixed." I know I can circumvent this issue, but it would still be nice to have it fixed.

Drekin commented 8 years ago

Thank you for the report. The TextTranscodingWrapper object has no buffer attribute because there is no actual buffer: it is just a helper layer with encoding="utf-8", since the Python tokenizer cannot process utf-16-le. Even without this problem, the actual buffer object is utf-16-le encoded, so I don't think using it is a good idea. It seems that some implementations just assume an ASCII-compatible buffer attribute on sys.stdout, which might not be the case at all.
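
To illustrate why a utf-16-le buffer would trip up code that assumes an ASCII-compatible byte stream, here is a small sketch using only the standard codecs. It does not touch win_unicode_console itself; the sample strings are arbitrary.

```python
# Compare how the same ASCII text looks as raw bytes in the two encodings.
text = "hi\n"

utf8_bytes = text.encode("utf-8")       # one byte per ASCII character
utf16_bytes = text.encode("utf-16-le")  # two bytes per ASCII character

print(utf8_bytes)
print(utf16_bytes)

# ASCII bytes pushed into a utf-16-le stream get read as 16-bit code units,
# so two ASCII characters collapse into one unrelated character.
misread = b"hi".decode("utf-16-le")
print(ascii(misread))
```

Code that writes ASCII or UTF-8 bytes straight into such a buffer would therefore produce garbage on the console rather than the intended text.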

Maybe I can add a fake buffer object, yet another compatibility layer with a suboptimal implementation. What is the circumvention you are using?

Drekin commented 8 years ago

Does it work when you add

from win_unicode_console import streams
streams.stdout_text_transcoded.buffer = streams.stdout_text_str

?

e00E commented 8 years ago

I just added those to my usercustomize and it works now. Thank you.

import win_unicode_console

win_unicode_console.streams.stdout_text_transcoded.buffer = win_unicode_console.streams.stdout_text_str
win_unicode_console.enable()

By "circumvent", I meant making the Unicode fix opt-in or opt-out for livestreamer, not a real fix. If livestreamer is using sys.stdout incorrectly, then maybe you don't need to fix anything and it is better if they fix it, but I see you already made a bug report there.

Drekin commented 8 years ago

I don't know exactly how livestreamer uses stdout, but if it interacts with win-unicode-console, it means that stdout leads to the Windows console, and so it expects text. Using buffer directly would make sense only if you had, for some reason, the text already encoded using sys.stdout.encoding.

Maybe it is harmless to set the buffer attribute as in the fix for you. Just to be more compatible.

zed commented 8 years ago

If sys.stdout.buffer is available, it should be a binary stream, as documented. If you can't provide the buffer, then do not set it; library code (where the context is not known) should expect that sys.stdout sometimes has no buffer attribute.

Though if you can provide a binary stream (sys.stdout.encoding == 'utf-16le', isinstance(sys.stdout.buffer, io.BufferedIOBase)) then consider doing it even if some code expects only ascii-based encodings (e.g., PYTHONIOENCODING=utf16-le already works on my system, to redirect stdout to a utf-16 text file).
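
As a rough sketch of what "expecting that sys.stdout has no buffer attribute" could look like in library code: `write_bytes` and its `data_encoding` parameter below are hypothetical names for illustration, not part of livestreamer or win_unicode_console.

```python
import io
import sys


def write_bytes(data, stream=None, data_encoding="utf-8"):
    """Write bytes to a text stream, with or without a `.buffer` attribute.

    Hypothetical helper. `data_encoding` names the encoding the caller
    used to produce `data`.
    """
    if stream is None:
        stream = sys.stdout
    buffer = getattr(stream, "buffer", None)
    if buffer is not None:
        # Caveat: this assumes `data` is already in the encoding the
        # underlying binary stream expects (often sys.stdout.encoding).
        stream.flush()  # keep ordering with earlier text writes
        buffer.write(data)
        buffer.flush()
    else:
        # No binary layer available: decode and use the text interface.
        stream.write(data.decode(data_encoding, errors="replace"))
        stream.flush()


# Usage with a text-only stream (io.StringIO has no `.buffer`).
text_only = io.StringIO()
write_bytes("caf\u00e9\n".encode("utf-8"), text_only)
```

The fallback branch trades fidelity for robustness: undecodable bytes are replaced rather than raising, which may or may not suit a given program.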

Drekin commented 8 years ago

Yes, library code should expect that sys.stdout may have no buffer attribute. This issue shows that is not always the case. The object used in the quick fix is a binary stream in the sense that its write accepts bytes.

The "ideal" stream objects provided are completely standard and have a standard buffer attribute; the problem is that Python cannot handle utf-16-le on stdin, and there is an undocumented constraint that the input and output encodings should be the same.

I won't add buffer by default, but maybe I'll add a better error message and include a way to simply apply a quick fix if necessary.

scopatz commented 7 years ago

For what it is worth, this bit us in xonsh.

zed commented 7 years ago

If I understood it correctly, the workaround suggested above makes buffer non-binary, which is why I left my previous comment (I assumed that stdout_text_str is a text stream).

If 'utf-16le' can't be used, then perhaps a backport of PEP 528 that uses 'utf-8' might help (utf-8 is decoded to utf-16le before sending it to the Unicode API on output, and in reverse, utf-16le received from the API is encoded to utf-8 on input).

Drekin commented 7 years ago

Yes, one thing is that a big part of the problem win_unicode_console is trying to solve will be solved in Python 3.6 by default, which is great news.

Anyway, sometimes one may have a text-oriented stream without having the underlying bytes-oriented stream. Adding .buffer for compatibility would mean introducing an additional helper bytes-over-text layer. I would think that dealing with sys.std* means dealing with text, so using .buffer seems kind of incorrect, but I may be wrong.
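
For readers wondering what such a bytes-over-text layer might look like, here is a minimal sketch. `BytesOverText` is a hypothetical class written for this illustration; it is not the implementation used by win_unicode_console, and it assumes callers write UTF-8 unless told otherwise.

```python
import codecs
import io


class BytesOverText:
    """Hypothetical adapter: accepts bytes and forwards decoded text."""

    def __init__(self, text_stream, encoding="utf-8"):
        self._text_stream = text_stream
        # An incremental decoder copes with multi-byte characters that
        # are split across separate write() calls.
        decoder_cls = codecs.getincrementaldecoder(encoding)
        self._decoder = decoder_cls(errors="replace")

    def writable(self):
        return True

    def write(self, data):
        data = bytes(data)
        self._text_stream.write(self._decoder.decode(data))
        return len(data)

    def flush(self):
        self._text_stream.flush()


# Usage: a character whose UTF-8 bytes arrive in two separate writes.
out = io.StringIO()
layer = BytesOverText(out)
euro = "\u20ac".encode("utf-8")
layer.write(euro[:1])
layer.write(euro[1:])
layer.flush()
```

An object like this could be attached as a buffer attribute in the spirit of the quick fix earlier in the thread, though it is still not a true binary stream: the bytes are reinterpreted as text before they reach the console.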

@scopatz, could you explain to me why it is necessary, or a good idea, to consider .buffer in xonsh?