On Windows with English as the system language (without UTF-8 enabled), the default system encoding is cp1252 (Western European), and Python uses cp1252 as the default file encoding.
Writing Unicode characters such as 汉字 to the log file results in an error:
Traceback (most recent call last):
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\logging\__init__.py", line 1084, in emit
    stream.write(msg + self.terminator)
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 87-88: character maps to <undefined>
The error can also be easily reproduced with:
with open("test.txt", "w") as f:
    print(f.encoding)
    f.write("汉字")
cp1252
Traceback (most recent call last):
  File "D:/cli/testproj/test1.py", line 2, in <module>
    f.write("汉字")
  File "C:\Users\jiasli\AppData\Local\Programs\Python\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-1: character maps to <undefined>
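For comparison, the same write succeeds when the encoding is passed explicitly instead of being taken from the locale. The following is a minimal sketch; the helper name `write_text_utf8` and the temporary file location are illustrative assumptions, not part of the original reproduction.

```python
import os
import tempfile


def write_text_utf8(path, text):
    """Write text to path as UTF-8 and return the encoding the file object reports."""
    # Passing encoding explicitly avoids falling back to the locale's
    # code page (for example cp1252 on an English Windows install).
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
        return f.encoding


# Hypothetical location; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "test.txt")
encoding_used = write_text_utf8(path, "汉字")

# Read the file back with the same encoding to confirm the round trip.
with open(path, encoding="utf-8") as f:
    content = f.read()
```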
Change
This PR forces log files to use UTF-8 so that file logging works on non-UTF-8 systems as well.
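One way to get this behavior with the standard library is to pass `encoding="utf-8"` when constructing a `logging.FileHandler`. The sketch below shows the idea only and is not necessarily the exact change made in this PR; the logger name, the format string, and the `make_utf8_file_logger` helper are assumptions made for illustration.

```python
import logging
import os
import tempfile


def make_utf8_file_logger(log_path, name="utf8_demo"):
    """Return a (logger, handler) pair that writes to log_path as UTF-8."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    # encoding="utf-8" overrides the locale-dependent default (e.g. cp1252).
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    return logger, handler


# Hypothetical log location used only for this demonstration.
log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
logger, handler = make_utf8_file_logger(log_path)
logger.debug("汉字")

# Close the handler so the file is flushed before reading it back.
handler.close()
logger.removeHandler(handler)

with open(log_path, encoding="utf-8") as f:
    log_text = f.read()
```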
Context
Reported by https://github.com/Azure/azure-cli/issues/17994
Alternative solution
One may also follow https://github.com/microsoft/knack/pull/178 to change the default encoding of the system to UTF-8.
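To check which encoding Python would pick for files when none is given, for example before and after changing the system setting, the locale can be queried directly. This is a small sketch using only the standard library; the variable name is arbitrary, and the value printed depends on the machine.

```python
import locale

# The encoding open() falls back to when no encoding argument is passed.
# On an English Windows install without UTF-8 enabled this is typically cp1252.
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)
```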