guyzmo / git-repo

Git-Repo: CLI utility to manage git services from your workspace

Problems with non-ascii #164

Closed hmijail closed 7 years ago

hmijail commented 7 years ago

When I run git repo config and log in to my service (a self-hosted GitLab), it fails with:

Fatal error: 'ascii' codec can't encode character '\U0001f37b' in position 30: ordinal not in range(128)

That Unicode character (U+1F37B) seems to be "clinking beer mugs". I don't know where it comes from.

hmijail commented 7 years ago

Running verbose, there is an additional stacktrace:

------------------------------------
Traceback (most recent call last):
  File "/Users/mija/Library/Python/3.5/lib/python/site-packages/git_repo/repo.py", line 583, in main
    return GitRepoRunner(args).run()
  File "/Users/mija/Library/Python/3.5/lib/python/site-packages/git_repo/kwargparse.py", line 68, in run
    return self._action_dict[frozenset(args)](self)
  File "/Users/mija/Library/Python/3.5/lib/python/site-packages/git_repo/repo.py", line 575, in do_config
    setup_service(service)
  File "/Users/mija/Library/Python/3.5/lib/python/site-packages/git_repo/repo.py", line 548, in setup_service
    print('Great! You\'ve been identified \U0001f37b')
UnicodeEncodeError: 'ascii' codec can't encode character '\U0001f37b' in position 30: ordinal not in range(128)

guyzmo commented 7 years ago

heeeerm… there is absolutely no reason why this wouldn't work with Python 3.5. The worst case would be that you'd get an ugly boxy character instead of 🍻.

Because print() is supposed to get unicode strings and render unicode strings, and not ascii.
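For reference, the failure in the traceback can be reproduced without changing any locale. The sketch below is not part of git-repo: it prints the same message to an in-memory stream whose encoding is chosen explicitly, which is roughly what happens when sys.stdout ends up with an ASCII encoding. The helper name `encode_error` is hypothetical.

```python
import io

# The message printed by setup_service() in the traceback above.
MESSAGE = "Great! You've been identified \U0001f37b"


def encode_error(text, encoding):
    """Print text to an in-memory stream using the given encoding.

    Returns the error message if the text cannot be encoded, else None.
    (Hypothetical helper, for illustration only.)
    """
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        print(text, file=stream)
        stream.flush()
    except UnicodeEncodeError as exc:
        return str(exc)
    return None


print(encode_error(MESSAGE, "ascii"))  # an 'ascii' codec error, as in the report
print(encode_error(MESSAGE, "utf-8"))  # None: the emoji encodes fine
```

So print() does accept Unicode strings, but the stream it writes to still has to encode them, and an ASCII stream cannot represent the emoji.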

hmijail commented 7 years ago

OK, I'll tell that to my Python 3.5 installation. I'm sure that will convince it to start working.

hmijail commented 7 years ago

In case it helps,

$ python --version
Python 3.5.3

guyzmo commented 7 years ago

how are you connecting to your shell? is it just a simple ssh or using something special?

Kwpolska commented 7 years ago

How is your locale configured? What does locale say?

hmijail commented 7 years ago

I'm using iTerm2.app in macOS Sierra, everything local.

$ locale
LANG=
LC_COLLATE="C"
LC_CTYPE="C"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL="C"

guyzmo commented 7 years ago

can you run:

% python
>>> import sys
>>> print(sys.stdout.encoding)

and give me the output of that?

Kwpolska commented 7 years ago

The C locale assumes an ASCII-only terminal. You need to change it to something more suitable. Put this in your shell configuration (e.g. ~/.bash_profile, depending on your shell and its config):

export LC_ALL='en_US.UTF-8'

This will switch your locale to one that uses UTF-8, so programs will assume your terminal can display it. (Feel free to change en_US to any other UTF-8 locale listed by locale -a.)
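To see what Python makes of the locale before and after that change, a small diagnostic along these lines may help. It is only a sketch: the function name `describe_output_encoding` is made up for this example, and it reports just the locale variables discussed in this thread.

```python
import locale
import os
import sys


def describe_output_encoding():
    """Collect the locale settings and encodings relevant to print().

    (Hypothetical helper, for illustration only.)
    """
    return {
        # Locale variables as seen by the process; None means unset.
        "LC_ALL": os.environ.get("LC_ALL"),
        "LC_CTYPE": os.environ.get("LC_CTYPE"),
        "LANG": os.environ.get("LANG"),
        # Encoding Python derives from the locale settings.
        "preferred": locale.getpreferredencoding(False),
        # Encoding actually used when printing to standard output.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


for key, value in describe_output_encoding().items():
    print("{}: {}".format(key, value))
```

Running it once with LC_ALL=C and once with LC_ALL=en_US.UTF-8 should show whether the stdout encoding follows the locale on a given system.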

hmijail commented 7 years ago

$ python -c "import sys; print(sys.stdout.encoding)"
US-ASCII

Doing export LC_ALL='en_US.UTF-8' did indeed allow git repo config to finish successfully.

Regarding Python and locales, I should mention that I had set that locale precisely because of previous problems with Python: see https://bugs.python.org/issue18378 .

Anyway, I'll see if I can live with that LC_ALL.

hmijail commented 7 years ago

(and of course, thank you both for your help!)

Kwpolska commented 7 years ago

Setting locale should not break any other apps, and it helps with many. It’s kinda strange that macOS doesn’t set a correct one, but fixing it manually in shell config is simple.

(Close the issue if it’s solved)

hmijail commented 7 years ago

AFAIU, macOS (or rather the terminal application) does set a locale; but it is a BSD one ("UTF-8"), while that Python bug report I linked to explains that Python only reacts well to GNU locales.

So I will have to check that this locale is acceptable both to Python and to the rest of the BSD-like macOS environment.

That's why I had configured a locale that was OK for both environments.

guyzmo commented 7 years ago

hm… the emoji stuff is just for fun, and it should not break the program. I'll try to figure out a way to conditionally output emoji or ASCII art depending on sys.stdout.encoding; maybe there is a lib for that?

Kwpolska commented 7 years ago

You don’t need a library — just check if the encoding is UTF-8, or alternatively consider not using emoji in output.
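A minimal sketch of that check, assuming the message from the traceback; `supports_utf8` and `success_message` are hypothetical names, not functions from git-repo. The encoding name is normalized with codecs.lookup so that spellings such as "utf8" and "UTF-8" are treated alike.

```python
import codecs
import io
import sys


def supports_utf8(stream=None):
    """Return True if the stream's declared encoding is UTF-8."""
    stream = sys.stdout if stream is None else stream
    encoding = getattr(stream, "encoding", None)
    if not encoding:
        return False
    try:
        # codecs.lookup() normalizes aliases: "utf8", "UTF-8" -> "utf-8".
        return codecs.lookup(encoding).name == "utf-8"
    except LookupError:
        return False


def success_message(stream=None):
    """Build the login message, adding the emoji only for UTF-8 output."""
    base = "Great! You've been identified"
    if supports_utf8(stream):
        return base + " \U0001f37b"
    return base


# Demonstration with in-memory streams standing in for sys.stdout.
ascii_stream = io.TextIOWrapper(io.BytesIO(), encoding="US-ASCII")
utf8_stream = io.TextIOWrapper(io.BytesIO(), encoding="UTF-8")
print(success_message(ascii_stream), file=ascii_stream)  # plain text, no error
print(success_message(utf8_stream), file=utf8_stream)    # includes the emoji
```

This only covers UTF-8; other encodings that could represent the emoji (UTF-16, for instance) would get the plain message, which seems an acceptable trade-off for decorative output.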

hmijail commented 7 years ago

For completeness, and again AFAIU: the terminal apps set a locale depending on the region and language. If there is a mismatch between them (like me, setting English as language but Region as something else), the "UTF-8" locale is set as a fallback, and this one is OK for BSDs but not for GNU. And Python only understands GNU.

So only ...

... will find this kind of problem.

export LC_ALL='en_US.UTF-8' does seem to be valid everywhere, so I'm adopting that.