hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io
Other
42.77k stars 9.56k forks source link

Terraform breaks other programs within the same console on Windows #33796

Open TheCloudlessSky opened 1 year ago

TheCloudlessSky commented 1 year ago

Terraform Version

Terraform v1.5.4
on windows_amd64

Terraform Configuration Files

N/A. Simply running the CLI will cause this issue.

Debug Output

N/A. Simply running the CLI will cause this issue.

Expected Behavior

Running the terraform CLI shouldn't break other programs in the same console on Windows.

Actual Behavior

There was work done in https://github.com/hashicorp/terraform/issues/18242#issuecomment-759846280 about 2.5 years ago to re-do how Terraform handles reading/writing to the console on Windows so that things like colors would work with Unicode characters, etc.

We recently upgraded our CLI to 1.5.4 -- a version that includes the previously mentioned fixes and our deployment pipeline is now failing. Our deployment pipeline builds our application and also runs Terraform to initialize, validate, plan, and apply changes.

We have a part of our application that launches a separate process, writes to its standard input, and reads back the standard output. We have tests that confirm this always works. Since upgrading the CLI, our tests for launching this process are now failing. We eventually tracked it down that the process was not processing the standard input we were sending. We couldn't reproduce this when running our tests in Visual Studio, couldn't reproduce when we ran just our tests via console runners, but could reproduce this when doing our full deployment. And once our deployment failed, we couldn't get our tests to run via console runners again -- everything about the console seemed broken!

The crux of this problem is that Terraform is changing the code page of the console to UTF-8 in an attempt to use a single change to have a large impact. This is a bad idea on Windows because:

  1. It's changing the user's console, not just a local change to the Terraform process.
  2. UTF-8 code page is broken on Windows, which causes a problem of affecting other processes after the Terraform CLI has exited. The code page shouldn't need to be set to input/output Unicode in a console. See this StackOverflow post.

For example, opening a new PowerShell console and running these commands:

$ chcp
Active code page: 437

$ terraform --version
Terraform v1.5.4
on windows_amd64

Your version of Terraform is out of date! The latest version
is 1.5.6. You can update by downloading from https://www.terraform.io/downloads.html

$ chcp
Active code page: 65001

The default code page for me on Windows using the English United States locale is CP-437. You can see that Terraform changes it to 65001 (UTF-8).

This can affect other processes launched from the console after Terraform has run. For example:

Example C program:

target.c

#include <windows.h>
#include <stdio.h>
#include <fcntl.h>
#include <io.h>
#include <stdint.h>

int main(int argc, char *argv[])
{
    // Set "stdin" to binary mode.
    int modeResult = _setmode(_fileno(stdin), _O_BINARY);
    if (modeResult == -1)
    {
        return NULL;
    }

    FILE *f = stdin;

    int i;

    if (f == NULL)
    {
        fprintf(stderr, "Failed to set stdin mode.\n");
        return -1;
    }

    printf("Successfully set stdin to binary mode\n");

    char header1 = getc(f);
    char header2 = getc(f);
    char header3 = getc(f);

    if (
        header1 != 'A' ||
        header2 != 'B' ||
        header3 != 'C')
    {
        fprintf(stderr, "Invalid file header: ");
        fprintf(stderr, "(decimal=%d, hex=%x, char=%c)", header1, header1, header1);
        fprintf(stderr, "(decimal=%d, hex=%x, char=%c)", header2, header2, header2);
        fprintf(stderr, "(decimal=%d, hex=%x, char=%c)", header3, header3, header3);
        fprintf(stderr, ".\n");
        return -1;
    }

    printf("The file header is correct! :)");
    return 0;
}

NOTE: We actually use third-party binaries in practice, but this is a small example of how to reproduce the bug.

Compile the above:

$ cl ./target.c

Example C# source program:

Source.cs

internal class Program
{
    public static void Main(string[] args)
    {
        var targetExePath = args[0];

        var sourceData = Encoding.UTF8.GetBytes("ABC");

        var processStartInfo = new ProcessStartInfo(targetExePath);

        processStartInfo.RedirectStandardInput = true;
        processStartInfo.RedirectStandardOutput = true;
        processStartInfo.RedirectStandardError = true;
        processStartInfo.UseShellExecute = false;
        processStartInfo.CreateNoWindow = true;

        using (var process = new Process())
        {
            process.StartInfo = processStartInfo;

            process.Start();

            Console.WriteLine($"Standard input code page = {process.StandardInput.Encoding.CodePage}");

            process.StandardInput.BaseStream.Write(sourceData, 0, sourceData.Length);

            process.StandardInput.Close();

            var stdout = process.StandardOutput.ReadToEnd();
            var stderr = process.StandardError.ReadToEnd();

            var hasExited = process.HasExited;

            if (hasExited)
            {
                var exitCode = process.ExitCode;
                if (exitCode != 0)
                {
                    Console.WriteLine($"Failed to exit successfully (ExitCode={process.ExitCode}).\nStandard Error:\n{stderr}");
                }
            }
            else
            {
                Console.WriteLine("Failed to cleanly exit after writing to stdin (manually killing).");
                process.Kill();
            }

            Console.WriteLine($"Standard Output:\n{stdout}");
        }
    }
}

Compile the above using .NET Framework 4.8 (not .NET Core/.NET 6, I believe this issue is worked around).

Test:

$ chcp 437

$ Source.exe target.exe
Standard input code page = 437
Standard Output:
Successfully set stdin to binary mode
The file header is correct! :)

$ terraform --version
Terraform v1.5.4
on windows_amd64

Your version of Terraform is out of date! The latest version
is 1.5.6. You can update by downloading from https://www.terraform.io/downloads.html

$ chcp
Active code page: 65001

$ Source.exe target.exe
Standard input code page = 65001
Failed to exit successfully (ExitCode=-1).
Standard Error:
Invalid file header: (decimal=-17, hex=ffffffef, char=�)(decimal=-69, hex=ffffffbb, char=�)(decimal=-65, hex=ffffffbf, char=�).

Standard Output:
Successfully set stdin to binary mode

This has the same output on Windows 10 and Windows 11.

I personally believe that it shouldn't be Terraform's responsibility to change configuration about the current console, especially the code page (as I mentioned, it's typically misinformed/bad advice). The gist for a fix is that Terraform should most likely be using Window's Console APIs instead of File APIs for reading/writing text to the console. Here's how Python's CLI handles this: https://github.com/python/cpython/blob/5141b1ebe07ad54279e0770b4704eaf76f24951d/Modules/_io/winconsoleio.c

As a temporary work around/fast fix, Terraform could set the code page and then reset it once the CLI exits?

Steps to Reproduce

  1. chcp 437
  2. terraform --version
  3. chcp -- prints 65001

Additional Context

No response

References

No response

crw commented 1 year ago

Thanks for this feature request! The level of detail and code inspection is very appreciated.

If you are viewing this issue and would like to indicate your interest, please use the 👍 reaction on the issue description to upvote this issue. We also welcome additional use case descriptions. Thanks again!

kmoe commented 1 year ago

@TheCloudlessSky What version of PowerShell are you using?

TheCloudlessSky commented 1 year ago

@kmoe v5 but I've also confirmed just with cmd and PowerShell 7.3.

apparentlymart commented 11 months ago

Unfortunately it seems I didn't retain good enough notes about everything I was referring to when making the change that caused this problem, but we made these changes in response to recommendations from Microsoft and so expected that they were good advice. I'm sorry that turns out not to be true. :confounded:


One reference I do still have a link to from my notes is Classic Console APIs versus Virtual Terminal Sequences. That document is primarily concerned with the other big thing we changed at the same time -- activating virtual terminal processing instead of using the legacy console API -- but it does still briefly touch on UTF-8 support:

Unicode

UTF-8 is the accepted encoding for Unicode data across almost all modern platforms, as it strikes the right balance between portability, storage size and processing time. However, Windows historically chose UTF-16 as its primary encoding for Unicode data. Support for UTF-8 is increasing in Windows and use of these Unicode formats does not preclude the usage of other encodings.

The Windows Console platform has supported and will continue to support all existing code pages and encodings. Use UTF-16 for maximum compatibility across Windows versions and perform algorithmic translation with UTF-8 if necessary. Increased support of UTF-8 is in progress for the console system.

UTF-16 support in the console can be utilized with no additional configuration via the W variant of all console APIs and is a more likely choice for applications already well versed in UTF-16 through communication with the wchar_t and W variant of other Microsoft and Windows platform functions and products.

UTF-8 support in the console can be utilized via the A variant of Console APIs against console handles after setting the codepage to 65001 or CP_UTF8 with the SetConsoleOutputCP and SetConsoleCP methods, as appropriate. Setting the code pages in advance is only necessary if the machine has not chosen "Use Unicode UTF-8 for worldwide language support" in the settings for Non-Unicode applications in the Region section of the Control Panel.

For this section in particular, I will concede that we don't seem to be following this recommendation fully, although I'm also not sure this section is making a single clear recommendation, as opposed to just gesturing vaguely at a handful of different options.

Specifically, we are using SetConsoleOutputCP to change the console's "legacy codepage", but I believe all subsequent writes to the console are effectively using the W variants of the file I/O APIs, although we're doing that only indirectly through the Windows implementation of Go's os package.

Based on what I remember about these parts of the Windows API (my memory is spotty, since it's been a long while since I did real Windows API dev), only the A variants of the Win32 API functions are concerned with legacy codepages, and so it's quite possible that we don't actually need to change the legacy code page of the console now that modern Windows versions have a fully-unicode-aware text buffer. Older versions of Windows -- pre-Windows 10 -- could only retain in the console buffer characters from the currently-selected legacy codepage, but this documentation suggests that was fixed as part of all of the recent console modernization work.

I think the best next step here then would be to try removing the SetConsoleOutputCP call altogether and then encourage Terraform to write some astral plane characters like emoji to its output, and see if my hunch is correct that the modernized text buffer will retain the emoji characters even with the console codepage set to 437.

The following should test it, I think:

  1. Comment out the call to change the codepage in the Terraform code (linked from the initial issue comment above) and recompile.
  2. Make sure you are running in a console with its legacy codepage set to 437, by running chcp.
  3. Write a Terraform configuration that includes an intentionally-failing check, or precondition, or similar whose error_message includes an emoji character or other astral plane character.
  4. Run terraform plan and inspect the rendered diagnostic. If the astral plane character appears correctly in the terminal, this test is successful and therefore we should be able to safely remove the SetConsoleOutputCP call.

I'd note that this will probably still cause Terraform to leave the terminal in Virtual Terminal Processing mode after it exits, which is not ideal, but hopefully still less troublesome than changing the console's legacy codepage.

I don't have a functioning Windows system handy to test this on right now, so I can't try this myself immediately, but hopefully the above is useful to someone who picks up this issue to work on in the future.