In the continuing strategy of "here's a Python string literal for a file exhibiting this problem, to avoid GitHub being clever", the following string passes ast.parse (in Python 3.4.2) but crashes yapf when a file with precisely these contents is passed to it under the same Python version. I think this is because yapf has a baked-in assumption that all source is valid UTF-8.
INTERNAL ERROR: # ааббббббб
бббббббббб <- Cyrillic characters

Traceback (most recent call last):
  File "/home/david/yapf/yapf/yapflib/verifier.py", line 38, in VerifyCode
    compile(textwrap.dedent(code).encode('UTF-8'), '<string>', 'exec')
  File "<string>", line 2
    бббббббббб <- Cyrillic characters
    ^
SyntaxError: invalid character in identifier

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/david/yapf/yapf/yapflib/verifier.py", line 41, in VerifyCode
    ast.parse(textwrap.dedent(code.lstrip('\n')).lstrip(), '<string>', 'exec')
  File "/home/david/.pyenv/versions/3.4.2/lib/python3.4/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<string>", line 2
    бббббббббб <- Cyrillic characters
    ^
SyntaxError: invalid character in identifier

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/david/.pyenv/versions/3.4.2/lib/python3.4/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/david/.pyenv/versions/3.4.2/lib/python3.4/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/david/yapf/yapf/__main__.py", line 18, in <module>
    sys.exit(yapf.main(sys.argv))
  File "/home/david/yapf/yapf/__init__.py", line 102, in main
    print_diff=args.diff)
  File "/home/david/yapf/yapf/__init__.py", line 124, in FormatFiles
    filename, style_config=style_config, lines=lines, print_diff=print_diff)
  File "/home/david/yapf/yapf/yapflib/yapf_api.py", line 67, in FormatFile
    print_diff=print_diff)
  File "/home/david/yapf/yapf/yapflib/yapf_api.py", line 110, in FormatCode
    reformatted_source = reformatter.Reformat(uwlines)
  File "/home/david/yapf/yapf/yapflib/reformatter.py", line 73, in Reformat
    verifier.VerifyCode(formatted_code[-1])
  File "/home/david/yapf/yapf/yapflib/verifier.py", line 45, in VerifyCode
    compile(normalized_code.encode('UTF-8'), '<string>', 'exec')
  File "<string>", line 1
    бббббббббб <- Cyrillic characters
    ^
SyntaxError: invalid character in identifier
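For anyone trying to reproduce the general failure mode: the traceback shows the verifier calling `compile(...encode('UTF-8'), ...)`. Below is a minimal sketch of how that kind of re-encoding can change what the compiler sees when a file declares a different encoding. The sample source (`RAW`) and the helper `run` are hypothetical and are not the string from this report; I have not confirmed that this is the exact cause of the crash above.

```python
import io
import tokenize

# Hypothetical sample: Latin-1 bytes carrying a PEP 263 coding cookie.
# This is NOT the string from the report, just an illustration.
RAW = "# -*- coding: latin-1 -*-\nname = 'é'\n".encode("latin-1")


def run(source):
    """Compile and execute source (str or bytes); return the namespace."""
    namespace = {}
    exec(compile(source, "<string>", "exec"), namespace)
    return namespace


# Decode the file using the encoding it declares for itself.
encoding, _ = tokenize.detect_encoding(io.BytesIO(RAW).readline)
text = RAW.decode(encoding)

# Compiling the decoded str: the cookie is ignored and the text is used as-is.
from_text = run(text)["name"]

# Re-encoding as UTF-8 while the cookie still says latin-1: the compiler
# decodes the UTF-8 bytes as Latin-1, so the string literal changes.
from_utf8_bytes = run(text.encode("UTF-8"))["name"]

print(repr(from_text), repr(from_utf8_bytes))
```

In this sketch the two values differ even though both compiles succeed. With non-ASCII identifiers instead of a string literal, the same mismatch could plausibly produce characters that are not valid in an identifier, which would be consistent with the `SyntaxError` above.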