Drekin / win-unicode-console

A Python package to enable Unicode support when running Python from Windows console.
MIT License
103 stars 12 forks source link

Error importing package on Cygwin and Linux #8

Closed thorstenkampe closed 8 years ago

thorstenkampe commented 9 years ago

win-unicode-console claims compatibility with colorama. It would be nice if you could import win-unicode-console on other platforms than Windows without an exception; just like colorama: "On other platforms, Colorama does nothing."

thorsten@rednails zsh> python3                                                                                                                     [~]Python 3.2.5 (default, Jul 25 2014, 14:13:17)
[GCC 4.8.3] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import win_unicode_console
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.2/site-packages/win_unicode_console-0.3.1-py3.2.egg/win_unicode_console/__init__.py", line 2, in <module>
    from win_unicode_console import streams, console, readline_hook
  File "/usr/lib/python3.2/site-packages/win_unicode_console-0.3.1-py3.2.egg/win_unicode_console/streams.py", line 2, in <module>
    from ctypes import byref, windll, c_ulong
ImportError: cannot import name windll
Drekin commented 9 years ago

The problem win-unicode-console is trying to solve is specific to Windows console. What is your use case of importing the package on Cygwin or Linux?

zed commented 8 years ago

@Drekin win-unicode-console should do nothing unless the input/output is Windows console (colorama module is a good example) i.e., there is no reason to fail on Unix. The use-case is writing scripts that can print Unicode on multiple platforms without requiring users to run the scripts using py -mrun print_unicode.py on Windows.

Also, win-unicode-console may provide useful behavior even on Unix (or if the output is redirected to a file/pipe on any OS) e.g., if the environment is misconfigured by accident to use ASCII locale (ssh, sudo) then win-unicode-console may allow to force utf-8 anyway (via opt-in parameter). Thus Python script may print Unicode in whatever environment.

Drekin commented 8 years ago

I don't understand. If you want to write a multiplatform script printing unicode, you don't have to import anything, it should be up to consumer to fix his Python environment by installing win-unicode-console and enabling it (on Windows) in his sitecustomize. I.e. I don't think it is a good idea to import win-unicode-console in a consumer script. But you can still do it in a multiplatform way – just make it conditional, testing the platform.

On the other hand, if you want to write a multiplatform Unicode-fixing package, you should import each platform-specific solution depending on which platform you currently are. Wanting five platform specific packages to do nothing on other platforms just for a possibility to import them all and then enable them all seems wrong to me.

win-unicode-console may provide useful behavior even on Unix, but I don't think it currently does. Its core functionality is to replace the standard streams with ones based on winapi functions ReadConsoleW and WriteConsoleW in interactive console. That makes no sense on other platforms. Output redirection may bring a problem completely different to the one this package is trying to solve.

Also, what exactly are you proposing? Should I permit importing win-unicode-console but make its enable function raise an exception on other platforms? Or should enable do nothing on other platforms? Note that enable is just a composition of enabling functions for different components. What about them? Should they raise or do nothing? What about disabling functions? Also note that imporing win_unicode_console currently means importing its streams module which means importing ctypes.windll.kernel32, which is an error on other platforms. I.e. the proposal means a lot of decisions and possibly a lot of platform checks spread thorough the package just to support void functionality on other platforms for a reason not still apparent to me.

I'm not saying you're not right, I just don't see it. So let's continue with the discussion.

zed commented 8 years ago

..it should be up to consumer to fix his Python environment by installing win-unicode-console and enabling it (on Windows) in his sitecustomize. I.e. I don't think it is a good idea to import win-unicode-console in a consumer script.

Yes, in the ideal world win-unicode-console should be completely transparent for the Python code that performs Unicode I/O and win-unicode-console would be shipped with Python. It is too intrusive otherwise to force users to modify their global environment in a way that might break other Python scripts. win-unicode-console should provide a solution even for those people who won't modify sitecustomize, usercustomize, and the like. Usually, I recommend py -mrun approach but it is nice to have an option to enable it from within a script too.

..But you can still do it in a multiplatform way – just make it conditional, testing the platform.

Yes, I can do it as well as all other developers that use your module and need a cross-platform support. Or we can do it in a single place:

try:
    from win_unicode_console import streams, console, readline_hook
except (AttributeError, ImportError):
    enable = lambda **kw: None
    disable = lambda: None
else: # the rest of the win-console-unicode/__init__.py

that is all: win-unicode-console should provide useful behavior on Unix, namely: it should do nothing there -- in particular, it should avoid ImportError.

Any additional behavior would be an enhancement. win-unicode-console mission: "enable Unicode input and display when running Python from Windows console" could be extended: "enable Unicode input and display when running Python."

Currently, Python unreasonably respects misconfigured ASCII locale settings. Usually, PYTHONIOENCODING could be used to force utf-8. As an enhancement, win-unicode-console may provide a similar (but perhaps more aggresive and less general) option from within a script. See related discussion.

Example:

#!/usr/bin/env python
import codecs
import sys
import win_unicode_console # $ pip install win_unicode_console

broken_environment = (sys.getfilesystemencoding() == 'ascii' and
    codecs.lookup(sys.stdout.encoding) == codecs.lookup('ascii'))
win_unicode_console.enable(use_utf8=broken_environment)
print(u"\N{SNOWMAN}")

The only difference compared to the current implementation is that it enables utf-8 streams on top of binary API instead of utf-16le streams on top of Unicode API. Use-case: enable Unicode input and display in Python REPL started via ssh on a remote Linux box.

The preferred solution would be to fix the environment but we live in an imperfect world where it is not always possible.

use_utf8 is an enhancement. Just "do nothing on Unix" option would be fine.

Drekin commented 8 years ago

@zed

..it should be up to consumer to fix his Python environment by installing win-unicode-console and enabling it (on Windows) in his sitecustomize. I.e. I don't think it is a good idea to import win-unicode-console in a consumer script.

Yes, in the ideal world win-unicode-console should be completely transparent for the Python code that performs Unicode I/O and win-unicode-console would be shipped with Python.

That may eventually happen. Note that win_unicode_console is rather a collection of workarounds, and a proper Python fix would make it so no workarounds are needed, so win_unicode_console won't be needed as well.

Meanwhile in an almost ideal world a person having troubles with Unicode in Python in Windows console would find win_unicode_console a fix their Python environment by enabling it in sitecustomize. I see no obstacles to this.

It is too intrusive otherwise to force users to modify their global environment in a way that might break other Python scripts.

Why? win_unicode_console should not break other Python scripts and it intended to work as a global fix. If there are any problems that are win_unicode_console's fault, it is a bug that should be fixed. One can easily change his sitecustomize again if there are problems. There will also be an option to use the runner to run a script with win_unicode_console disabled.

win-unicode-console should provide useful behavior on Unix, namely: it should do nothing there -- in particular, it should avoid ImportError.

Note if you import and enable win_unicode_console unconditionally in your script, you are adding an unnecessary dependency. On one side, there are Windows Python users who want to use your Unicode-printing script, but are reluctant to enabling win_unicode_console in sitecustomize or using the runner, on the other side there are Unix, Mac, and other platform Python users who want to use your Unicode-printing script and have no problems with Unicode in their consoles. I don't thing we should bother the second group by win_unicode_console dependency just for some special-cased convenience to the first group.

At least you should do something like

try:
    import win_unicode_console
except ImportError:
    pass
else:
    win_unicode_console.enable()

and tell your users to install win_unicode_console by themselves. It would be also nice to inform them about the possibility of importing win_unicode_console in sitecustomize to solve Unicode console problems not only in your script, but in all scripts at once.

Regarding the locale setting problem on Unix, even if I was to solve that issue (and I don't think I'm the right person since I know nothing about Unix locales nor have I a Unix machine for testing), I would make it a separate package as it aims to solve a separate issue.

Any additional behavior would be an enhancement. win-unicode-console mission: "enable Unicode input and display when running Python from Windows console" could be extended: "enable Unicode input and display when running Python."

Again, for that use case there should be a separate package fix_unicode that collects information about various Unicode-related fixes for various platforms, has platform-dependent dependencies, and its enable function would enable only the relevant fixes. win_unicode_console is just a package for solving one particular issue that occurs on one particular platform. I hope it's clear that the “win” in the name stands also for “Windows”.

Nevertheless, I have made changes so that win_unicode_console is as multiplatform as possible. I still think that using it as a dependency in consumer scripts is a bad idea. On the other hand, exposing the additional functionality that is multiplatform, so one can experiment also on Unix, shouldn't hurt.

Could you please import and enable win_unicode_console from the development branch in both Python 3 and Python 2 to make sure that I didn't miss any obstacle to importing it on Unix?

zed commented 8 years ago

You may recommend users to change their sitecustomize but you should not force them to do it in order to use your package.

I agree that win-unicode-console is an unnecessary dependence on Unix if it will never do anything there and anybody who uses win-unicode-console in their scripts should optimize it out: create a separate requirements.txt file for Unix and add try/except to the scripts. It is a "nice to have" option that I could take a Python script created on Windows and run it on Unix as is, even if the author hasn't expected that the script will be run on other systems. If it would introduce a maintenance burden or may degrade Windows experience; don't do it.

Could you please import and enable win_unicode_console from the development branch in both Python 3 and Python 2 to make sure that I didn't miss any obstacle to importing it on Unix?

$ python3 -mpip install -U https://github.com/Drekin/win-unicode-console/archive/development.zip
$ python3 -c "import win_unicode_console"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/me/.virtualenvs/py3/lib/python3.4/site-packages/win_unicode_console/__init__.py", line 2, in <module>
    from . import streams, console, readline_hook
  File "/home/me/.virtualenvs/py3/lib/python3.4/site-packages/win_unicode_console/streams.py", line 5, in <module>
    from ctypes import byref, windll, c_ulong
ImportError: cannot import name 'windll'

The same result on both Python 2 and Python 3. Use the snippet that I've suggested above instead:

try:
    from win_unicode_console import streams, console, readline_hook
except (AttributeError, ImportError):
    enable = lambda **kw: None
    disable = lambda: None
else: # the rest of the win-console-unicode/__init__.py without any changes
Drekin commented 8 years ago

Since we want the package to be importable on other platforms, we may do it in a way that also the individual submodules are importable. There should be no exception when importing a submodule unless it has a clear message as "don't do this on Python 2" or "don't do this on Unix". I think I just forgot to remove the unconditional import of windll in streams module. Could you please try again?

zed commented 8 years ago

Don't sprinkle the platform/version checks all over the code; it is brittle and unsustainable over time especially if there are no extensive automatic tests.

For example, the simple and robust approach in PR #13 supports pypy, jython automatically. If you want to support individual submodules; you could apply the same approach as in __init__.py: try/except + trivial stubs for public API.

Drekin commented 8 years ago

The point is that not only enable and disable methods are public api, everything that may be reused could be public api. The version checks are needed to support both Python 2 and Python 3. The platform check in readline_hook is needed to support custom readlines on Unix (of course it should be more robust, and in fact readline hooks should be made a standalone package, but I think this is enough for now). Then there are two trivial platform checks in __init__ and enable and disable functions retain their metadata. Then there are three platform checks in streams that do the key work. All the other modules should be multiplatform.

Note that your approach may also hide actual bugs. Also, the functionality of this package heavily depends on ctypes and Python C API. So no wonder if it doesn't work on PyPy or Jython. It is an external fix of CPython Windows installations, some of whose features might be of interest also for Unix CPython instalations. It is not intented to be a dependency of multiplatform scripts.