aws / aws-cli

Universal Command Line Interface for Amazon Web Services

encoding problem of aws cloudformation package #7142

Open kidotaka opened 2 years ago

kidotaka commented 2 years ago

Describe the bug

Even if we specify AWS_CLI_FILE_ENCODING=utf-8, aws cloudformation package with --output-template-file creates output that is not UTF-8. The output encoding depends on the locale because Python's built-in open function is called without an explicit encoding. The input template file is UTF-8, but the output template file is not. This sometimes leads to an encoding error.

When we use nested stacks, the AWS CLI creates a temporary file without specifying an encoding before uploading it. This has the same encoding problem.

Expected Behavior

The aws cloudformation package command should write its files using the encoding specified by AWS_CLI_FILE_ENCODING.

Current Behavior

An encoding error occurs:

'cp932' codec can't encode character '\U0002000b' in position 186: illegal multibyte sequence

In this case, I used the AWS CLI on Windows.
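The error can be illustrated outside the AWS CLI. The following is only a minimal sketch, not the CLI's own code: it encodes a string containing U+2000B (the character from the message above) with UTF-8 and with cp932. The sample text and the helper name `try_encode` are made up for illustration.

```python
# U+2000B is a CJK Extension B character; cp932 has no mapping for it.
text = "Description: \U0002000b"


def try_encode(s, encoding):
    """Return "ok" if s can be encoded, otherwise a short error summary."""
    try:
        s.encode(encoding)
        return "ok"
    except UnicodeEncodeError as exc:
        return f"{type(exc).__name__}: {exc.reason}"


print(try_encode(text, "utf-8"))   # UTF-8 can represent every code point
print(try_encode(text, "cp932"))   # fails, like the locale-dependent write
```

Writing the same string to a file opened with `open(path, "w")` on a system whose locale encoding is cp932 fails in the same way, because the locale encoding is used when none is given.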

Reproduction Steps

Possible Solution

I suggest using compat_open() or getpreferredencoding() from aws-cli/compat.py to determine the encoding of the output file.
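As a rough sketch of the idea (not the actual aws-cli implementation), the writer could resolve the encoding from AWS_CLI_FILE_ENCODING and fall back to the locale only when the variable is unset. The helper names `resolve_file_encoding` and `write_template` are hypothetical and exist only for this example; the real compat_open() and getpreferredencoding() in aws-cli may behave differently.

```python
import locale
import os
import tempfile


def resolve_file_encoding():
    # Hypothetical helper: prefer AWS_CLI_FILE_ENCODING, else the locale encoding.
    return os.environ.get("AWS_CLI_FILE_ENCODING") or locale.getpreferredencoding(False)


def write_template(path, body):
    # Pass the encoding explicitly so the result does not depend on the locale.
    with open(path, "w", encoding=resolve_file_encoding()) as f:
        f.write(body)


# Demo only: set the variable in-process and write a template with U+2000B.
os.environ["AWS_CLI_FILE_ENCODING"] = "utf-8"
demo_path = os.path.join(tempfile.mkdtemp(), "packaged.yaml")
write_template(demo_path, "Description: \U0002000b")
with open(demo_path, "rb") as f:
    print(f.read())
```

With the variable set to utf-8, the write succeeds regardless of the locale. The same approach would apply to the temporary file created for nested stack templates.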

built-in open function is used in

Work arounds:

Additional Information/Context

Using "AWS_CLI_FILE_ENCODING" together with "LC_CTYPE" is not a good solution. Few locales are available for "LC_CTYPE" on CodeBuild by default, so we would need to install an additional language pack. The locale is also difficult to change on Windows. Python itself can handle encodings without any additional library.

CLI version used

2.7.18

Environment details (OS name and version, etc.)

Windows 10

tim-finnigan commented 2 years ago

Hi @kidotaka thanks for reaching out. Have you tried setting your locale as LC_ALL=en_US.UTF-8 to address this?

kidotaka commented 2 years ago

Hi @tim-finnigan I have not tried LC_ALL=en_US.UTF-8, but it can be used instead of LC_CTYPE on Linux. We can use en_US.UTF-8 without additional installation on CodeBuild's managed Linux images. (locale -a shows en_US.UTF-8)

There are several workarounds, but if AWS_CLI_FILE_ENCODING were used for writing, there would be no need for them.

| actual template file encoding | AWS_CLI_FILE_ENCODING | other conditions | result (--output-template-file, temporary nested stack template) | workaround |
| --- | --- | --- | --- | --- |
| UTF-8 | UTF-8 | on Linux and LC_CTYPE=POSIX | ascii. occasional encoding error | LC_CTYPE or LC_ALL or PYTHONUTF8=1 |
| UTF-8 | UTF-8 | on Windows and locale dependent encoding is not UTF-8 | non UTF-8. occasional encoding error | It's hard to change locale on Windows. PYTHONUTF8=1 can be used. |
| non UTF-8 | non UTF-8 | on Linux and LC_CTYPE=POSIX | ascii. occasional encoding error | LC_CTYPE or LC_ALL and additional language package installation is needed |
| non UTF-8 | non UTF-8 | on Windows and locale dependent encoding is not UTF-8 | non UTF-8. no problem if encodings are same | |
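To check what the PYTHONUTF8=1 workaround changes, the sketch below starts a child Python interpreter with that variable set and prints the encoding that open() would use by default. It demonstrates Python's UTF-8 mode in general. It assumes, without verifying, that the Python runtime used by a given AWS CLI installation honors the variable. The helper name `preferred_encoding_with` is hypothetical.

```python
import os
import subprocess
import sys


def preferred_encoding_with(env_overrides):
    """Run a child interpreter with extra environment variables and
    return the default text encoding it reports."""
    env = dict(os.environ, **env_overrides)
    result = subprocess.run(
        [sys.executable, "-c",
         "import locale; print(locale.getpreferredencoding(False))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# In UTF-8 mode the default encoding is UTF-8 regardless of the locale.
print(preferred_encoding_with({"PYTHONUTF8": "1"}))
```

The exact spelling of the reported name ("UTF-8" or "utf-8") varies between Python versions, so compare it case-insensitively.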