Drekin / win-unicode-console

A Python package to enable Unicode support when running Python from Windows console.
MIT License

Any way to disable readline hook warning? #36

Closed thebjorn closed 6 years ago

thebjorn commented 7 years ago

My sitecustomize.py file:

import win_unicode_console
win_unicode_console.enable()

content of foo.py:

import sys
for line in sys.stdin:
    sys.stdout.write(line)

running without PYTHONIOENCODING:

(dev) go|c:\srv\tmp\wuc> set PYTHONIOENCODING
Environment variable PYTHONIOENCODING not defined

(dev) go|c:\srv\tmp\wuc> type foo.py | python foo.py
c:\srv\venv\dev\lib\site-packages\win_unicode_console\__init__.py:31: RuntimeWarning: sys.stdin.encoding == None, whereas sys.stdout.encoding == 'utf-8', readline hook consumer may assume they are the same
  readline_hook.enable(use_pyreadline=use_pyreadline)
import sys
for line in sys.stdin:
    sys.stdout.write(line)

(dev) go|c:\srv\tmp\wuc>

and with PYTHONIOENCODING=utf-8:

(dev) go|c:\srv\tmp\wuc> set PYTHONIOENCODING=utf-8

(dev) go|c:\srv\tmp\wuc> type foo.py | python foo.py
c:\srv\venv\dev\lib\site-packages\win_unicode_console\__init__.py:31: RuntimeWarning: sys.stdin.encoding == None, whereas sys.stdout.encoding == 'utf-8', readline hook consumer may assume they are the same
  readline_hook.enable(use_pyreadline=use_pyreadline)
import sys
for line in sys.stdin:
    sys.stdout.write(line)

The warning is coming from sitecustomize.py, so I'm assuming I have to check if sys.stdin is coming from a pipe? (I first tried to check if sys.stdin.encoding is None -- but both it and stdout.encoding are always None in sitecustomize).

This is my latest attempt:

import sys
if sys.stdin.isatty():
    import win_unicode_console
    win_unicode_console.enable()

This works for the foo.py from above, but fails with mojibake if I change foo.py to read:

# -*- coding: utf-8 -*-
import sys
for line in sys.stdin:
    sys.stdout.write(line)
print u'✓'

i.e. adding an encoding comment, and a unicode literal (checkmark on last line).

If I change the code page to 65001 it does work:

(dev) go|c:\srv\tmp\wuc> type foo.py | python foo.py
# -*- coding: utf-8 -*-
import sys
for line in sys.stdin:
    sys.stdout.write(line)
print u'Γ£ô'
Γ£ô

(dev) go|c:\srv\tmp\wuc> chcp 65001
Active code page: 65001

(dev) go|c:\srv\tmp\wuc> type foo.py | python foo.py
# -*- coding: utf-8 -*-
import sys
for line in sys.stdin:
    sys.stdout.write(line)
print u'✓'
✓

(dev) go|c:\srv\tmp\wuc> type foo.py | python foo.py > bar.py

(dev) go|c:\srv\tmp\wuc> type bar.py
# -*- coding: utf-8 -*-
import sys
for line in sys.stdin:
    sys.stdout.write(line)
print u'✓'
✓

...but win-unicode-console is probably not involved in any of that, correct?

Drekin commented 7 years ago

Hello, this looks like a bug. And the bug is not necessarily the warning. The point is that when you redirect a stream, then 1) that stream shouldn't be fixed by WUC at all, and 2) the readline hook shouldn't be invoked (at least if this hook is an input hook).

I'll look at this, but I'm not sure when. The point is that WUC has to detect various conditions and behave appropriately – whether the streams are the standard ones and should be fixed, whether something has been redirected, and so on. Some of these checks are in the streams module. If you want to experiment, you can see what the methods of streams.STDIN and so on return in your case.
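For a quick first look at what a given stream reports, before digging into the streams module, something like the following may help. It is only a portable sketch based on plain attribute lookups; `describe_stream` is a hypothetical helper and not part of WUC.

```python
def describe_stream(stream):
    """Return what a file-like object reports about itself.

    Hypothetical helper (not WUC API): missing attributes are reported
    as "not a TTY" and "no encoding" instead of raising.
    """
    isatty = getattr(stream, "isatty", None)
    return {
        # A redirected or piped stream normally reports isatty() == False.
        "isatty": bool(isatty()) if callable(isatty) else False,
        # May be None, as seen for the piped stdin in this issue.
        "encoding": getattr(stream, "encoding", None),
    }
```

Calling `describe_stream(sys.stdin)` and `describe_stream(sys.stdout)` once from an interactive session and once with piped input should show which values differ between the two cases.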

thebjorn commented 7 years ago

ok, so I returned foo.py to:

import sys
for line in sys.stdin:
   sys.stdout.write(line)

and changed sitecustomize.py to:

# -*- coding: utf-8 -*-

try:
    import sys
    import win_unicode_console as wuc
    from ctypes import (
        byref, c_ulong, WinDLL, get_last_error, set_last_error, WinError
    )

    errcode = {
        0: 'ERROR_SUCCESS',
        6: 'ERROR_INVALID_HANDLE',
        8: 'ERROR_NOT_ENOUGH_MEMORY',
        995: 'ERROR_OPERATION_ABORTED',
    }
    kernel32 = WinDLL('kernel32', use_last_error=True)
    GetConsoleMode = kernel32.GetConsoleMode

    IN = wuc.streams.STDIN
    print 'name:', IN.name
    print 'fileno:', IN.fileno
    print 'handle:', IN.handle
    print 'stream:', IN.stream
    print 'stream.isatty:', IN.stream.isatty()
    print 'fileno:', IN.stream.fileno()
    print 'GetConsoleMode:', GetConsoleMode(IN.handle, byref(c_ulong()))
    errno = get_last_error()
    print 'last_error:', errno, errcode[errno]
    print '--'
    print 'tty:', IN.is_a_TTY()
    print 'console:', IN.is_a_console()
    print 'fixed:', IN.should_be_fixed()
    print '\n'*3
    wuc.enable()
except ImportError:
    print("""

       You need to run `pip install win-unicode-console` in this venv.

    """)

results:

c:\srv\tmp\wuc> chcp
Active code page: 437

c:\srv\tmp\wuc> set PYTHONIOENCODING
Environment variable PYTHONIOENCODING not defined

c:\srv\tmp\wuc> python
name: stdin
fileno: 0
handle: 8
stream: <open file '<stdin>', mode 'r' at 0x02CED020>
stream.isatty: True
fileno: 0
GetConsoleMode: 1
last_error: 0 ERROR_SUCCESS
--
tty: True
console: True
fixed: True

Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 20:42:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> ^Z

c:\srv\tmp\wuc> type foo.py | python foo.py
name: stdin
fileno: 0
handle: 8
stream: <open file '<stdin>', mode 'r' at 0x0307D020>
stream.isatty: False
fileno: 0
GetConsoleMode: 0
last_error: 6 ERROR_INVALID_HANDLE
--
tty: False
console: False
fixed: False

c:\srv\venv\dev\lib\site-packages\win_unicode_console\__init__.py:31: RuntimeWarning: sys.stdin.encoding == None, whereas sys.stdout.encoding == 'utf-8', readline hook consumer may assume they are the same
  readline_hook.enable(use_pyreadline=use_pyreadline)
import sys
for line in sys.stdin:
   sys.stdout.write(line)

c:\srv\tmp\wuc>

Setting PYTHONIOENCODING makes no difference:

c:\srv\tmp\wuc> set PYTHONIOENCODING=utf-8

c:\srv\tmp\wuc> type foo.py | python foo.py
name: stdin
fileno: 0
handle: 8
stream: <open file '<stdin>', mode 'r' at 0x034BD020>
stream.isatty: False
fileno: 0
GetConsoleMode: 0
last_error: 6 ERROR_INVALID_HANDLE
--
tty: False
console: False
fixed: False

c:\srv\venv\dev\lib\site-packages\win_unicode_console\__init__.py:31: RuntimeWarning: sys.stdin.encoding == None, whereas sys.stdout.encoding == 'utf-8', readline hook consumer may assume they are the same
  readline_hook.enable(use_pyreadline=use_pyreadline)
import sys
for line in sys.stdin:
   sys.stdout.write(line)

c:\srv\tmp\wuc>

thebjorn commented 7 years ago

This instrumented streams.enable():

def enable(stdin=Ellipsis, stdout=Ellipsis, stderr=Ellipsis):
    if not WINDOWS:
        return

    # defaults
    if PY2:
        print 1
        if stdin is Ellipsis:
            print 2
            stdin = stdin_text_fileobj
        if stdout is Ellipsis:
            print 3
            stdout = stdout_text_str
        if stderr is Ellipsis:
            stderr = stderr_text_str
    else: # transcoding because Python tokenizer cannot handle UTF-16
        if stdin is Ellipsis:
            stdin = stdin_text_transcoded
        if stdout is Ellipsis:
            stdout = stdout_text_transcoded
        if stderr is Ellipsis:
            stderr = stderr_text_transcoded
    print 4
    if stdin is not None and STDIN.should_be_fixed():
        print 5
        sys.stdin = stdin
    print 6
    if stdout is not None and STDOUT.should_be_fixed():
        print 7
        sys.stdout.flush()
        sys.stdout = stdout
    if stderr is not None and STDERR.should_be_fixed():
        sys.stderr.flush()
        sys.stderr = stderr

This prints 1, 2, 3, 4, 6, 7, i.e. sys.stdin is left alone (sys.stdin.encoding == None) while sys.stdout is "fixed" (sys.stdout.encoding == 'utf-8'). This should all be copacetic.

In __init__.py's enable(..., use_readline_hook=True, use_pyreadline=True, ...):

    streams.enable(stdin=stdin, stdout=stdout, stderr=stderr)

    if use_readline_hook:
        readline_hook.enable(use_pyreadline=use_pyreadline)

The streams are all kosher after the first line, though stdin is not compatible with readline, since it is not interactive. The first thing readline_hook.enable(...) does is compare the encodings of stdin and stdout and print the warning.
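To make the mechanism concrete, here is a minimal sketch of the kind of comparison that would produce this warning. It assumes the check does nothing more than compare the two `encoding` attributes; `check_encodings_sketch` and `FakeStream` are hypothetical names used for illustration, not WUC's actual code.

```python
import warnings


class FakeStream(object):
    """Hypothetical stand-in for sys.stdin / sys.stdout (encoding only)."""

    def __init__(self, encoding):
        self.encoding = encoding


def check_encodings_sketch(stdin, stdout):
    """Warn and return False when the two streams report different encodings.

    Assumption: this reproduces the wording of the warning seen above,
    not the library's real implementation.
    """
    if stdin.encoding != stdout.encoding:
        warnings.warn(
            "sys.stdin.encoding == %r, whereas sys.stdout.encoding == %r, "
            "readline hook consumer may assume they are the same"
            % (stdin.encoding, stdout.encoding),
            RuntimeWarning,
            stacklevel=2,
        )
        return False
    return True
```

With a piped stdin left alone (encoding None) and a fixed stdout (encoding 'utf-8'), `check_encodings_sketch(FakeStream(None), FakeStream('utf-8'))` takes the warning branch, which matches what happens in the runs above.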

I can't say I understand the code in any depth, but perhaps it's enough to change the if to:

    if use_readline_hook and streams.STDIN.should_be_fixed():
        readline_hook.enable(...)

Drekin commented 7 years ago

Thank you for your effort. You are right, not setting up the custom readline hook when stdin is redirected is a solution. Your case also shows another problem: since the redirected stdin is not a TTY, the readline hook (either the original one or the installed custom one) is not called at all. The warning comes from the installation routine, so it shouldn't be there.

You can manually fix your installed WUC by the proposed modification:

if use_readline_hook and streams.STDIN.should_be_fixed():
    readline_hook.enable(...)
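The effect of that guard can be tried in isolation with stand-in objects. This is only a sketch: `FakeStdinInfo`, `enable_sketch` and `install_hook` are hypothetical names, and the real `should_be_fixed()` does the console detection that is faked here with a flag.

```python
class FakeStdinInfo(object):
    """Hypothetical stand-in for streams.STDIN; the flag replaces real detection."""

    def __init__(self, is_console):
        self._is_console = is_console

    def should_be_fixed(self):
        return self._is_console


def enable_sketch(stdin_info, install_hook, use_readline_hook=True):
    """Install the hook only when requested and stdin is a real console.

    Returns True if the hook was installed, False if it was skipped.
    """
    if use_readline_hook and stdin_info.should_be_fixed():
        install_hook()
        return True
    return False
```

With `FakeStdinInfo(False)`, which corresponds to the piped case, `install_hook` is never called, so any check performed during hook installation is skipped as well.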

Independently, the check should be removed from readline_hook.py:

def enable(use_pyreadline=True):
    # check_encodings()

If it doesn't break anything, I'll add the changes to the next version.
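Until a fixed version is installed, the message itself can also be silenced without editing the package, using only the standard `warnings` module. This is a sketch that assumes the warning text stays as quoted in this thread; `silence_readline_hook_warning` is a hypothetical helper name. Note that it only hides the message and does not change how the streams or the hook are set up.

```python
import warnings


def silence_readline_hook_warning():
    """Ignore the RuntimeWarning about mismatched stdin/stdout encodings.

    Assumption: the message starts with "sys.stdin.encoding == " and
    mentions "readline hook consumer", as in the output quoted above.
    """
    warnings.filterwarnings(
        "ignore",
        # The pattern is matched against the start of the warning message.
        message=r"sys\.stdin\.encoding == .*readline hook consumer",
        category=RuntimeWarning,
    )
```

In sitecustomize.py this would be called before `win_unicode_console.enable()`, so the filter is already in place when the warning is raised.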