masterzen / winrm

Command-line tool and library for Windows remote command execution in Go
Apache License 2.0

Use proper UTF-8 to UTF-16 conversion? #96

Open rgl opened 5 years ago

rgl commented 5 years ago

Can https://github.com/masterzen/winrm/blob/1d17eaf15943ca3554cdebb3b1b10aaa543a0b7e/powershell.go#L10-L23 be changed to use a proper UTF-8 to UTF-16 (the native Windows encoding) conversion?

masterzen commented 5 years ago

@rgl,

I'm not well versed in Windows encodings (or UTF-16). From what I understand, this code builds a pure UCS-2 (wide char) string in which the high byte of every character is always 0. That will indeed fail for any character > 127, which is unfortunate.

I think this can be fixed with this:

 wideCmd := utf16.Encode([]rune(psCmd))

Hopefully the result will be in the proper byte order for the receiving machine.

Would you mind testing this, as I know very little about PowerShell?

rgl commented 5 years ago

Windows uses UTF-16LE and utf16.Encode is UTF-16BE. I will submit a PR soon.

rgl commented 5 years ago

Oh, and I was mistaken: utf16.Encode really is UTF-16LE! We just need to convert the result into a []byte.

masterzen commented 5 years ago

@rgl I'm lost, your PR implements a BE->LE conversion, but your last comment here seems to imply it wasn't needed. Can you elaborate?

rgl commented 5 years ago

Sorry for the confusion... the PR does not really convert from BE to LE.

Let me clarify:

  1. utf16.Encode converts the string (passed as a []rune) to a []uint16 of UTF-16 code units.
  2. encodeUtf16Le serializes that []uint16 to a []byte in little-endian order, so the entire conversion goes from string to []byte encoded as UTF-16LE.