This adds a test case that demonstrates a bug when the following conditions line up:
You have a Git repo with a branch containing a non-ASCII character
Your shell's locale has a non-UTF-8 encoding
You have a distribution of Python that defaults to a non-UTF-8 encoding if the locale-related environment variables (LANG, LC_ALL, others?) don't specify a locale with a UTF-8 encoding
More concretely, I can reliably reproduce this on Ubuntu 16.04 using the distribution-provided Python 3. However, I cannot reproduce it on my MacBook, because Apple's Python 3 defaults to UTF-8.
You can double check if you have a Python installation that is capable of reproducing the problem by looking at the output of the following command:
On Ubuntu 16.04, it returns ANSI_X3.4-1968 for me. On my MacBook, it returns UTF-8.
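A check along these lines can be sketched in Python, assuming the value of interest is what locale.getpreferredencoding(False) reports (the call named in the analysis of the bug). The helper name preferred_encoding_name is hypothetical and only used for illustration; codecs.lookup() is used to resolve aliases such as ANSI_X3.4-1968 to Python's canonical codec name.

```python
import codecs
import locale


def preferred_encoding_name() -> str:
    """Return the normalized name of the locale's preferred encoding."""
    # getpreferredencoding(False) reads the locale without calling setlocale().
    raw_name = locale.getpreferredencoding(False)
    # codecs.lookup() resolves aliases, e.g. "ANSI_X3.4-1968" -> "ascii".
    return codecs.lookup(raw_name).name


if __name__ == "__main__":
    name = preferred_encoding_name()
    print(name)
    if name != "utf-8":
        print("This Python is likely able to reproduce the problem.")
```

If this prints anything other than utf-8 (for example ascii), the installation probably meets the third condition above.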
The more detailed error message is:
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/opt/sifive/wit/lib/wit/__main__.py", line 11, in <module>
    main()
  File "/opt/sifive/wit/lib/wit/main.py", line 65, in main
    create(args)
  File "/opt/sifive/wit/lib/wit/main.py", line 126, in create
    update(ws, args)
  File "/opt/sifive/wit/lib/wit/main.py", line 308, in update
    ws.checkout(packages)
  File "/opt/sifive/wit/lib/wit/workspace.py", line 201, in checkout
    package.checkout(self.root)
  File "/opt/sifive/wit/lib/wit/package.py", line 149, in checkout
    self.repo.checkout(self.revision)
  File "/opt/sifive/wit/lib/wit/gitrepo.py", line 248, in checkout
    proc_ref = self._git_command("show-ref")
  File "/opt/sifive/wit/lib/wit/gitrepo.py", line 293, in _git_command
    cwd=cwd)
  File "/usr/lib/python3.5/subprocess.py", line 695, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
  File "/usr/lib/python3.5/subprocess.py", line 1072, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/usr/lib/python3.5/subprocess.py", line 1754, in _communicate
    self.stdout.encoding)
  File "/usr/lib/python3.5/subprocess.py", line 976, in _translate_newlines
    data = data.decode(encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 113387: ordinal not in range(128)
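The final line of that traceback can be reproduced in isolation, without Git or any particular locale. The sketch below assumes a hypothetical ref name containing U+2713, whose UTF-8 encoding starts with the same 0xe2 byte; the helper try_decode is made up for illustration and is not part of wit.

```python
# Hypothetical ref name; U+2713 encodes to the bytes e2 9c 93 in UTF-8.
raw = "refs/heads/\u2713".encode("utf-8")


def try_decode(data: bytes, encoding: str):
    """Return (text, None) on success or (None, exception) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


# Decoding with the ASCII codec fails at the first non-ASCII byte.
text, error = try_decode(raw, "ascii")
print(error)

# Decoding the same bytes as UTF-8 succeeds.
text_utf8, _ = try_decode(raw, "utf-8")
print(text_utf8)
```

The exception carries the offending offset in its start attribute, which corresponds to the position reported in the error message.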
The issue is that we're setting universal_newlines=True in our subprocess.run() calls, which will check for the encoding of the current locale using locale.getpreferredencoding(False). If Git prints out a character that cannot be encoded in ASCII, then Python in an ASCII locale will blow up trying to decode it into a Unicode string.
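One possible way to avoid the locale dependence is to capture the output as bytes and decode it explicitly. This is only a sketch under assumptions, not necessarily the fix the project adopts: run_and_decode is a hypothetical helper, the choice of UTF-8 assumes that is how the ref names are encoded, and a small Python child process stands in for git show-ref so the example runs without a repository.

```python
import subprocess
import sys


def run_and_decode(cmd, encoding="utf-8"):
    """Run cmd, capture stdout as bytes, and decode it explicitly."""
    # Without universal_newlines=True, stdout is returned as raw bytes,
    # so the locale's preferred encoding is never consulted.
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
    # errors="replace" keeps undecodable bytes from raising.
    return proc.stdout.decode(encoding, errors="replace")


# Stand-in for `git show-ref`: a child that writes UTF-8 bytes to stdout.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(b'refs/heads/\\xe2\\x9c\\x93\\n')",
]
output = run_and_decode(child)
print(repr(output))
```

Decoding by hand also works on Python 3.5, the version in the traceback; the encoding keyword of subprocess.run() was only added in Python 3.6.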