golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

os/exec: automatic encoding conversion for stdout/stderr on Windows #69709

Open lime2008 opened 9 hours ago

lime2008 commented 9 hours ago

Proposal: os/exec: Handle Windows Standard Streams Encoding

Currently, the os/exec package does not account for differences in how Windows configurations encode standard output and standard error. This can lead to garbled text when a command emits non-UTF-8 characters, depending on whether the "Use Unicode UTF-8 for worldwide language support" beta feature is enabled.

Problem:

Proposed Solution:

Introduce a mechanism in os/exec to handle Windows console encoding variations, specifically:

  1. Detection: Automatically detect if the Windows UTF-8 beta feature is active.
  2. Transparent Handling: Based on the chosen configuration, seamlessly handle encoding and decoding of output streams within os/exec.
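As a rough illustration of the two steps above, here is a minimal sketch of the detection decision only. It assumes the active code page has already been obtained (on Windows, presumably from an API such as GetACP or GetConsoleOutputCP; which one is appropriate is left open here). The helper name needsConversion is hypothetical and not part of os/exec. The only hard fact used is that code page 65001 is UTF-8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// cpUTF8 is the Windows code page identifier for UTF-8 (CP_UTF8).
const cpUTF8 = 65001

// needsConversion is a hypothetical helper, not an os/exec API.
// It reports whether output captured under the given code page
// would have to be transcoded before being treated as UTF-8 text.
func needsConversion(codePage uint32, out []byte) bool {
	if codePage == cpUTF8 {
		// The system already produces UTF-8; leave the bytes alone.
		return false
	}
	// Pure ASCII is valid UTF-8 and identical in the common code pages,
	// so only byte sequences that are not valid UTF-8 are flagged.
	return !utf8.Valid(out)
}

func main() {
	gbk := []byte{0xD6, 0xD0, 0xCE, 0xC4} // "中文" encoded in GBK (code page 936)

	fmt.Println(needsConversion(936, gbk))                    // true
	fmt.Println(needsConversion(cpUTF8, []byte("中文")))        // false
	fmt.Println(needsConversion(936, []byte("plain ascii"))) // false
}
```

The validity check is only a heuristic: some legacy-encoded byte sequences happen to be valid UTF-8, so a real implementation could not rely on it alone.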

Benefits:

This feature request aims to improve the reliability and user-friendliness of os/exec when interacting with console applications in diverse Windows environments. Thank you to everyone who reviews this request.

qmuntal commented 9 hours ago

Thanks for reporting this issue @lime2008. I'm moving this out of the proposal process given that you are suggesting that it can be fixed without adding new APIs. Let's treat this as a bug fix rather than a proposal for now.

Could you provide some more detailed steps for reproducing this issue on my PC?

lime2008 commented 8 hours ago

Below is a script that demonstrates the problem:

package main

import (
    "fmt"
    "log"
    "os/exec"
)

func main() {
    // Run a cmd.exe built-in that echoes mixed Chinese and ASCII text.
    cmdRunner := exec.Command("cmd", "/C", "echo 中文字符测试 any chinese character test")
    output, err := cmdRunner.Output()
    if err != nil {
        log.Printf("Error executing command: %s", err)
        output = []byte(fmt.Sprintf("Error: %s", err))
    }
    // Print the captured bytes unchanged; no encoding conversion is applied.
    // utf8Result, err := EnsureUTF8(output)
    utf8Result := output
    fmt.Print(string(utf8Result))
}

Its output looks like this: image

lime2008 commented 8 hours ago

After enabling the setting (sorry, my system interface is in Chinese), which is under Settings -> Time & language -> Language & region:

image

the program outputs the characters correctly:

image

Thanks for the review.

seankhliao commented 7 hours ago

I'm concerned that os/exec currently treats output as a stream of bytes, not necessarily text that has an encoding that needs to be translated. What happens when transparent handling tries to convert binary data?
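To make the concern concrete, here is a small sketch. It uses strings.ToValidUTF8 as a stand-in for any automatic text conversion (it is not what the proposal specifies, and naiveToText is a hypothetical name), and applies it to the 8-byte PNG file signature as an example of binary output.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// naiveToText stands in for any automatic "make it text" step.
// It is a hypothetical helper, not something os/exec does today.
func naiveToText(out []byte) []byte {
	// Replace each run of invalid UTF-8 bytes with U+FFFD.
	return []byte(strings.ToValidUTF8(string(out), "\uFFFD"))
}

func main() {
	// The 8-byte PNG file signature: binary data, not text.
	raw := []byte{0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A}
	converted := naiveToText(raw)

	fmt.Println(bytes.Equal(raw, converted)) // false: the payload changed
	fmt.Println(len(raw), len(converted))    // 8 10
}
```

The leading 0x89 byte is not valid UTF-8, so it is replaced by a three-byte sequence and the data no longer matches what the child process wrote.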

lime2008 commented 7 hours ago

How about adding an interface to explicitly convert []byte to a string based on the system character set? Or a parameter that decides whether to convert at all? Consider that Python has a similar split: mode r decodes automatically using the system configuration, and mode rb returns raw bytes.
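A minimal sketch of what such an opt-in could look like, under loud assumptions: Decoder and convertOutput are hypothetical names, nothing here exists in os/exec, and ISO-8859-1 is used as a stand-in encoding only because it can be decoded with the standard library alone. A real version would need a decoder for the actual system code page.

```go
package main

import (
	"fmt"
	"log"
)

// Decoder converts bytes in some legacy encoding to a UTF-8 string.
// Hypothetical type for illustration; not part of os/exec.
type Decoder func([]byte) (string, error)

// convertOutput returns raw unchanged when dec is nil (like Python's "rb")
// and decodes it when a Decoder is supplied (like Python's "r").
func convertOutput(raw []byte, dec Decoder) ([]byte, error) {
	if dec == nil {
		return raw, nil // raw mode: bytes are passed through untouched
	}
	s, err := dec(raw)
	if err != nil {
		return nil, err
	}
	return []byte(s), nil
}

// latin1 is a stand-in decoder: ISO-8859-1 maps each byte to the
// Unicode code point with the same value.
func latin1(b []byte) (string, error) {
	runes := make([]rune, len(b))
	for i, c := range b {
		runes[i] = rune(c)
	}
	return string(runes), nil
}

func main() {
	raw := []byte{'c', 'a', 'f', 0xE9} // "café" in ISO-8859-1

	unchanged, _ := convertOutput(raw, nil)
	fmt.Println(len(unchanged)) // 4: raw bytes, no conversion

	text, err := convertOutput(raw, latin1)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(text)) // café
}
```

Because the caller chooses the mode, commands that emit binary data keep today's byte-for-byte behavior by default.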