noxecane closed this issue 8 years ago
This is not a Click problem but a Python 3 issue. Either use Python 2.7 or reconfigure your environment to use a UTF-8 locale that is supported on your machine. There is nothing I can do in Click to fix this.
The mitigation steps are not working (and I can't use Python 2). Isn't there a way to enforce the locale in Python?
You cannot fix this from within Python. This is a problem with the interpreter. The only way to fix this is to reconfigure the environment.
No issue......
@danceasarxx I'm not sure what you mean.
I've given up on using it....decided to use plac..and that's how we say no problem down here...
In case someone else stumbles upon this in the future: exporting the locale is the correct way to fix it, as otherwise stdout will have the wrong encoding set (and so will the filesystem encoding on Linux). If exporting the values does not fix it, then most likely the locale is missing. In this case the machine might not have an en_US locale installed, in which case Python falls back to C, which again is ASCII.
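For anyone debugging this, a minimal sketch along these lines shows which encoding Python actually picked up from the environment. It uses only the standard library; the helper name `is_ascii_encoding` is made up for illustration, and this is not meant to reproduce Click's own check.

```python
import codecs
import locale


def is_ascii_encoding(name):
    """Return True if the encoding name resolves to plain ASCII."""
    try:
        # codecs.lookup() resolves aliases such as "ANSI_X3.4-1968" to "ascii".
        return codecs.lookup(name).name == "ascii"
    except LookupError:
        # Assumption: an unknown encoding name is treated as unusable.
        return True


if __name__ == "__main__":
    preferred = locale.getpreferredencoding()
    print("preferred encoding:", preferred)
    if is_ascii_encoding(preferred):
        print("Python fell back to ASCII; check LANG/LC_ALL and `locale -a`.")
```

If this reports ASCII even after exporting the variables, the locale named there is probably not installed on the machine.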
Indeed this is true, but your suggestion "Either switch to Python 2" is biased.
I guess this is documented here http://click.pocoo.org/5/python3/#python-3-surrogate-handling?
@danceasarxx the name of the locale is en_US.utf8 (without a hyphen). Click is protecting you from running a faulty setup here.
You can double check which locales are available using locale -a.
@hynek on which system is utf-8 not a valid charset name in locales? I cannot reproduce that on any Linux build. It's more likely the system does not have utf-8 locales configured at all in that case.
fyi, the reported locale names on a Linux machine contain .utf8:
$ locale -a | grep -i en.*utf
en_GB.utf8
en_US.utf8
and on Mac OS X they contain .UTF-8:
$ locale -a | grep -i en.*utf
en_AU.UTF-8
en_CA.UTF-8
en_GB.UTF-8
en_IE.UTF-8
en_NZ.UTF-8
en_US.UTF-8
Though Linux machines accept utf-8 just fine:
$ date
Thu 5 Nov 10:48:09 UTC 2015
$ LANG=nl_NL.utf8 date
do nov 5 10:48:12 UTC 2015
$ LANG=nl_NL.UTF-8 date
do nov 5 10:48:18 UTC 2015
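Since the charset is spelled differently across systems (.utf8 vs .UTF-8), a small sketch like the following can filter the `locale -a` output without caring about the spelling. The function names `utf8_locales` and `installed_locales` are hypothetical helpers, and the script assumes the `locale` command is on the PATH (it returns an empty list otherwise).

```python
import shutil
import subprocess


def utf8_locales(names):
    """Keep locale names whose charset is UTF-8, ignoring spelling differences."""
    matches = []
    for name in names:
        _, dot, charset = name.partition(".")
        # Drop any modifier such as "@euro" and normalize "UTF-8" vs "utf8".
        charset = charset.split("@")[0].lower().replace("-", "")
        if dot and charset == "utf8":
            matches.append(name)
    return matches


def installed_locales():
    """Return the output of `locale -a` as a list, or [] if the tool is missing."""
    if shutil.which("locale") is None:
        return []
    result = subprocess.run(
        ["locale", "-a"], stdout=subprocess.PIPE, stderr=subprocess.DEVNULL
    )
    # Locale names are not guaranteed to be valid UTF-8, so decode leniently.
    return result.stdout.decode("ascii", "replace").split()


if __name__ == "__main__":
    print(utf8_locales(installed_locales()))
```

An empty result suggests the machine has no UTF-8 locale installed at all, which matches the "locale is missing" case described earlier in the thread.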
The error message is dynamic now and should help debugging this issue: https://github.com/mitsuhiko/click/blob/40705b9d69e78e599c26b8e55c828ae19bd5ed0c/click/_unicodefun.py#L42-L108
@hynek Changing it to en_US.utf8 did work. Thanks a lot
@danceasarxx on which server operating system?
CentOS 7. It's actually in a Docker container.
No idea why UTF8 would fix anything but if you are using docker you need to configure locales. By default docker boots up in ASCII mode and does not have any locales unless you run some custom image.
On a base CentOS image:
[root@7d3c22fe87eb inet]# locale -a
C
POSIX
en_US.utf8
Hello @mitsuhiko , I need one more clarification on the topic;
I think I understand quite a lot of things on encoding and Unicode issues (I even have a couple of blog posts on how it works in Python 2.x vs Java on my blog, if you need to check) - this is not to say that I know everything or that I'm the Most Competent Person In The World on the topic, but that I'm not an absolute beginner.
I have read the section on surrogate handling as well as the open bugs linked in that section.
Everywhere you keep saying "you need to configure locales" and you treat ASCII as if it were a disease.
AFAICU the issue is that, since string objects are actually unicode objects in Python 3.x, there's a struggle because if there's some non-ASCII char on the command line (or any string to be printed contains non-ASCII chars) then, effectively, you don't know what to do and how to handle the situation. This seems complicated even further by Python issue 8776 (sys.argv decoding).
By the way I would understand the error 100% if that happened when sending non-ASCII chars to the command line, or when trying to print non-ASCII chars, or when detecting non-ASCII chars anywhere. THAT SITUATION would ABSOLUTELY require raising an exception.
At RUNTIME.
On the other hand... what if I'm 100% sure that everything is ASCII? Because I only used ASCII in my software, and it makes zero sense for users to employ other charsets?
Is that a "preventive war"? I refuse to work because for some arguments I might not work?
PS Of course I took the time to "hotpatch" click with
from click import core
core._verify_python3_env = lambda: None
And I found my app works as I expect, that's why I'm allowing myself to raise some doubts on this implementation.
As you note in the documentation, the issue is especially nasty in crontabs/init scripts/etc, places where you have ZERO user input (so it's quite controlled and you know there's nothing else than ASCII) and your output is most probably to a file where you log (and, there, you can choose the encoding you like). So, I cannot really find a purpose for such check.
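The point about log files can be sketched as follows: when the encoding is passed explicitly, writing does not depend on what the locale says. The function names and the file path are hypothetical and only serve as an illustration.

```python
import io
import os
import tempfile


def write_log(path, message):
    # Passing encoding= explicitly means the locale's preferred encoding is not consulted.
    with io.open(path, "a", encoding="utf-8") as handle:
        handle.write(message + "\n")


def read_log(path):
    with io.open(path, encoding="utf-8") as handle:
        return handle.read()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "app.log")
    write_log(path, "caf\u00e9 started")
    # ascii() keeps the demo printable even when stdout itself is ASCII-only.
    print(ascii(read_log(path)))
```

This only covers files the application opens itself; stdout and sys.argv still follow the environment.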
Checking in here. It looks like some of the conversation here is a bit contentious, so I apologize for bringing this up again.
First, let me say, thank you so much @mitsuhiko for maintaining this library. I, and many others that I know, really appreciate your hard and unpaid work here.
I write and maintain a library that gets used on a lot of old supercomputing systems, many of which don't have locales set, and so my users run into this problem frequently. We've handled this situation so far with documentation and informative error messages, but still they persist in having issues. The problem is that many of our users aren't sufficiently sophisticated to address this issue on their own, and so it becomes a major pain point that I'm not sure how to address.
While I disagree with @alanfranz 's tone, I'm curious about his solution, and if there is a hack that I can do to opt-out of this verification. This is a compromise that I might be willing to make, and that I think would give my users a better experience.
However, I'm somewhat concerned about the reliance on the private function, and whether or not this function might move or disappear in the future. So, question for @mitsuhiko , is it reasonable for click to offer a long-term-stable opt-in mechanism for downstream library authors to disable this check?
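Until there is an official mechanism, a slightly more defensive variant of the hotpatch above could look like this. It is only a sketch: `disable_check` is a made-up helper, and it assumes the private name `_verify_python3_env` from the earlier comment, which is not a public API and may not exist in every Click version.

```python
def disable_check(module, name="_verify_python3_env"):
    """Replace module.<name> with a no-op; return False if it is not there."""
    if not callable(getattr(module, name, None)):
        # The private hook moved or disappeared: leave the module untouched.
        return False
    setattr(module, name, lambda: None)
    return True


if __name__ == "__main__":
    try:
        from click import core
    except ImportError:
        core = None
    if core is None or not disable_check(core):
        print("nothing patched: click is missing or the private hook is gone")
```

Because the helper reports whether it patched anything, a downstream library can log that and fall back to documentation instead of failing at import time.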
@mitsuhiko @mrocklin Re-reading myself after a couple of years, I agree that my tone sounds confrontational, and I apologize about that. I don't remember being angry at the time, I was probably just trying to be a little too assertive.
My points hold, btw. I don't think there's an actual need for the pre-run checks that are being performed by click. It could just crash at runtime if something is wrong. I'd be happy to contribute a patch, if that will be looked at (i.e. if that won't be dismissed as 'it's a python3 problem just install locales').
After several unsuccessful attempts fiddling about with the LC_* variables I dug around Google again. And via an answer on StackOverflow I came across PYTHONIOENCODING. And indeed, setting this to UTF-8 fixed the issue for my case.
In my case, I am running inside a docker container in a GitLab pipeline. And setting LC_ALL and LANG to C.UTF-8 did not help at all even though the locale is available:
$ locale -a
C
C.UTF-8
POSIX
Setting PYTHONIOENCODING finally fixed the issue for me.
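To check what PYTHONIOENCODING changes on a given machine, a sketch like this starts a fresh interpreter with and without the variable and reports the encoding chosen for stdout. The helper name `child_stdout_encoding` is hypothetical. Note that this variable only affects stdin/stdout/stderr, not the locale's preferred encoding.

```python
import codecs
import os
import subprocess
import sys


def child_stdout_encoding(**env_overrides):
    """Start a fresh interpreter and report the encoding it picks for stdout."""
    env = dict(os.environ)
    env.update(env_overrides)
    output = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
    )
    # Normalize spellings such as "UTF-8" and "utf8" to the codec's canonical name.
    return codecs.lookup(output.decode("ascii").strip()).name


if __name__ == "__main__":
    print("default:", child_stdout_encoding())
    print("with PYTHONIOENCODING:", child_stdout_encoding(PYTHONIOENCODING="UTF-8"))
```

Comparing the two lines shows whether the environment alone already gives a UTF-8 stdout or whether the override is what makes the difference.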
"No idea why UTF8 would fix anything but if you are using docker you need to configure locales. By default docker boots up in ASCII mode and does not have any locales unless you run some custom image."
I would guess that most containers don't have locales configured, especially ones that try to keep size minimal. This makes click unusable in containerized environments.
https://github.com/pallets/click/issues/448#issuecomment-246029304
Thanks for the workaround @alanfranz
I was trying to use Supervisor to start an application that uses Click and I got this error.
I tried setting the locale in both Supervisor and Bash, all to no avail. Note that I am using a virtualenv and Python 3.4.