FiloSottile / age

A simple, modern and secure encryption tool (and Go library) with small explicit keys, no config options, and UNIX-style composability.
https://age-encryption.org
BSD 3-Clause "New" or "Revised" License
15.95k stars 482 forks source link

Windows PowerShell 5.1 output file redirection corrupts files #290

Open namitgpta opened 3 years ago

namitgpta commented 3 years ago

Environment

What were you trying to do

I was trying to encrypt a g1.txt file containing text passwords with help of a passphrase. It got successfully encrypted with filename g1.txt.age. But when I tried to decrypt it, it is showing this error (see screenshot).

age_error

What happened

Actually, the g1.txt file, which I am trying to encrypt, is already an encrypted file of passwords. What I did is: First, I encrypted my original txt file of passwords with help of AES pbkdf2 python program, which generated an encrypted file g1.txt. Then for further security, I thought that I should implement Age Encryption on the encrypted file itself (for double security). But after encryption, during decryption, it shows this error in windows terminal:

age: error: failed to read header: parsing age header: unexpected intro: "\xff\xfea\xoog\x00e\x00-\x00e\x00n\x00c\x00r\x
oOy\x00p\x00t\x00i\x000\x00n\x00.\x000\x00r\xO0g\x00/\x00v\x001\x00\r\x00\n"
age: report unexpected or unhelpful errors at https://filippo.io/age/report

So, I think the Age algorithm is unable to encrypt data which is already encrypted.

str4d commented 3 years ago

This is actually a Windows problem. Windows uses UTF-16 instead of UTF-8 as its base format, so when encoding ASCII strings as bytes you end up with extra bytes. If you look carefully at the store message, you can see it is age-encryption.org/v1 with every character separated by a null byte.

The interim solution here is to use armored output (the -a flag) when encrypting, which serializes the encrypted output as text instead of bytes.

However, there is a real bug here: age should be encoding that format version string explicitly as UTF-8, regardless of platform.

namitgpta commented 3 years ago

I tried encryption with -a flag, but no success, still same error:

age_error2

This is the encrypted file screenshot:

3
str4d commented 3 years ago

That narrows down the location of the bug, thanks!

str4d commented 3 years ago

The header line in question is written here:

https://github.com/FiloSottile/age/blob/4ea591b25fba86340e06c9b9b10095bfb1a496b8/internal/format/format.go#L140

intro is a const string literal, and AFAICT Go uses UTF-8 for all strings. So the bug must be somewhere inside io.WriteString.

Are you using a binary build of age? If so, where did you install it from? If you built from source, what Go compiler (and other settings) did you use?

namitgpta commented 3 years ago

I simply downloaded the "age-v1.0.0-rc.3-windows-amd64.zip" file from pre-built binaries page on your Github repo. Nothing else I did.

JeffreyVdb commented 3 years ago

Facing the same issue with v1.0.0-rc.3 My binary is installed using scoop

FiloSottile commented 3 years ago

This came up before, and as far as I know, this is a fundamental issue of PowerShell pipes and file redirection, and the way to fix it is to use | Out-File -Encoding UTF8NoBOM (which is a mouthful). Is rage handling this better? Is there a way we can tell PowerShell we are putting binary on stdout and ask pretty please not to UTF-16 it? Failing that, can we detect it to print a warning?

str4d commented 3 years ago

According to the about_Redirection notes (link is for 7.1, but the comment is also present in the docs for 5.1, 7.0, and 7.2):

When you are writing to files, the redirection operators use UTF8NoBOM encoding. If the file has a different encoding, the output might not be formatted correctly. To write to files with a different encoding, use the Out-File cmdlet with its Encoding parameter.

That seems like the opposite of what @FiloSottile was seeing? If so, it suggests this is a bug in PowerShell that might have been fixed, but for which we need to handle earlier versions (ugh).

FiloSottile commented 3 years ago

IIUC, the behavior changed with PowerShell Core 6.0. Apparently, PowerShell Core is NOT the Windows PowerShell that comes installed with Windows 10, which is still version 5.1. It sounds like PowerShell 7 is meant to eventually replace WIndows PowerShell but it hasn't yet.

https://docs.microsoft.com/en-us/powershell/scripting/whats-new/what-s-new-in-powershell-core-60?view=powershell-7.1&viewFallbackFrom=powershell-6#default-encoding-is-utf-8-without-a-bom-except-for-new-modulemanifest

https://devblogs.microsoft.com/powershell/powershell-7-road-map/

FiloSottile commented 3 years ago

Previously: #2, #127

namitgpta commented 3 years ago

Still not working on Powershell 7.2 image

Its not even working in CMD, same error as in above screenshot.

FiloSottile commented 3 years ago

Ah, looks like in PowerShell 6+ the issue is not UTF-16 anymore, but the CRLF line endings. Again, we need to look for a way to say "this is binary, leave it alone".

With PowerShell 6+, it should work with the -a armor flag, because that tolerates CRLF.

namitgpta commented 3 years ago

In Powershell 6+, if we use -a armor flag during encryption, then after decryption the data becomes corrupted. Unable to retrieve the original data.

FiloSottile commented 3 years ago

@namitgpta that's unexpected, can you paste the command sequence and the corruption?

JeffreyVdb commented 3 years ago

This sums it up well: https://github.com/binref/refinery/issues/5

I also wrote a snippet that writes the uint64 binary representation of the numbers 0 to 9 to a file and to stdout to compare the difference: https://play.golang.org/p/hCcSRcxGQ67

image

When writing to stdout, the newlines are converted to crlf (0d 0a). Writing to a file does exactly what you'd expect. It leaves the header intact, and it doesn't end with a trailing newline.

That does mean that this is not easy to fix.

cameronkerrnz commented 3 years ago

The short version (having recently battled against PowerShell and Windows terminal subsystem) is that its best if the program (age) can write the file directly.

It (really does) help to understand that whereas Unix systems treat content going through a pipe as bytestreams (or at least as character streams of usually 8-bit bytes with no regard to encoding), Windows (in its own also very long established lineage) treats them as 'text' streams, which for PowerShell and its modern terminal-subsystem means UTF-16 LittleEndian text streams with all the painful cross-platforming phun that can imply. (Arguably though, that's not correct since what flows across a '|' in PowerShell is really an object, but ...)

So any documentation (for Windows at least) would be provide by using the -o flag. Probably deserves a section on the main page.

Windows 10 (I'm on the latest; 20H1) with PowerShell will not provide a happy path when using redirections; the easiest way of using redirections is to redirect a textual output; then open it in an editor to change the encoding. That's about as close to a happy-path as you're likely to get at least until you can do ... | Out-File -Encoding UTF8NoBOM -Filename blah.txt

Presumably should shouldn't be a problem if using WSL-2 or similar, but is sure is frustrating when trying to do DevOps kind of things in Windows (where I feel this kind of pain the most).

cadayton commented 3 years ago

https://github.com/tmclnk/RepoCrypto/blob/master/FileCryptography.psm1 This only works in non-core version of PowerShell and it appears to just be rewriting the file.

As a work around would it be possible to support the encryption of a string of any size and return an encrypted string? The caller then would be responsible for dumping the encrypted string where ever they choose.

FiloSottile commented 2 years ago

This is a long-running issue affecting all versions of PowerShell (PowerShell/PowerShell#1908).

Here are the results of running a small test program (https://go.dev/play/p/jlY_sKptwpP) on Windows 10.

cmd.exe ``` filippo@MEGABRANTIS C:\Users\Filippo>.\winshellcheck.exe w f filippo@MEGABRANTIS C:\Users\Filippo>.\winshellcheck.exe r f Valid filippo@MEGABRANTIS C:\Users\Filippo>.\winshellcheck.exe w | .\winshellcheck.exe r Valid filippo@MEGABRANTIS C:\Users\Filippo>.\winshellcheck.exe w > f filippo@MEGABRANTIS C:\Users\Filippo>.\winshellcheck.exe r f Valid ``` Windows Powershell 5.1.19041.1320 ``` PS C:\Users\Filippo> .\winshellcheck.exe w f; .\winshellcheck.exe r f Valid PS C:\Users\Filippo> .\winshellcheck.exe w | .\winshellcheck.exe r LF -> CRLF corruption Other corruption PS C:\Users\Filippo> .\winshellcheck.exe w > f; .\winshellcheck.exe r f Other corruption ``` Ubuntu 20.04 LTS bash ``` filippo@Megabrantis:/mnt/c/Users/Filippo$ ./winshellcheck.exe w f; ./winshellcheck.exe r f Valid filippo@Megabrantis:/mnt/c/Users/Filippo$ ./winshellcheck.exe w | ./winshellcheck.exe r Valid filippo@Megabrantis:/mnt/c/Users/Filippo$ ./winshellcheck.exe w > f; ./winshellcheck.exe r f Valid ``` PowerShell 7.1.3 ``` PS C:\Users\Filippo> .\winshellcheck.exe w f; .\winshellcheck.exe r f Valid PS C:\Users\Filippo> .\winshellcheck.exe w | .\winshellcheck.exe r LF -> CRLF corruption Other corruption PS C:\Users\Filippo> .\winshellcheck.exe w > f; .\winshellcheck.exe r f LF -> CRLF corruption Other corruption ```

It doesn't look like there is any solution, besides using -o or -a.

I would very much like to find a way to detect when age is running under PS (but not cmd.exe or WSL), and print a warning if neither -o or -a are in use. Anyone?

str4d commented 2 years ago

I don't know about detecting PS or cmd.exe, but you can distinguish WSL from regular Linux by checking if /proc/sys/kernel/osrelease contains the strings Microsoft or WSL. See https://crates.io/crates/wsl for the Rust crate I use for this in rage.

cameronkerrnz commented 2 years ago

Comparing the environment I get in cmd versus powershell, there doesn't appear to be any reasonable difference (eg. PSModulePath is set for me in both), so looking at the environment doesn't appear useful.

thepudds commented 2 years ago

I took a brief look at this. If you want to print a warning when stdout is a pipe on windows, you could try something like:

    h, _ := windows.GetStdHandle(windows.STD_OUTPUT_HANDLE)

    fileType, _ := windows.GetFileType(h)

    if fileType == windows.FILE_TYPE_PIPE {
        fmt.Fprintf(os.Stderr, "age: some warning about pipes on windows. consider -o or -a\n")
    }

In some basic testing with PowerShell 5.1.19041.1320:

Is that too broad of a net to cast, or is it reasonable to have some properly phrased hint emitted on Windows if stdout is a pipe?

Small test program: https://go.dev/play/p/YSUOeIe8VnP

FiloSottile commented 2 years ago

@thepudds, hi! That looks very promising. Is it FILE_TYPE_PIPE for | redirection under cmd.exe and/or WSL, though? I bet there's some other attributes we can use to tell them apart.

thepudds commented 6 months ago

@thepudds, hi! That looks very promising. Is it FILE_TYPE_PIPE for | redirection under cmd.exe and/or WSL, though? I bet there's some other attributes we can use to tell them apart.

Hi @FiloSottile, late reply, but FWIW, I did poke at it some more shortly after posting here, but neglected to circle back to report results. In short, by sniffing various aspects, I was able to distinguish between PowerShell vs. a normal command prompt, but was not able to simultaneously distinguish vs. a Cygwin terminal, which seemed to be a fatal-ish flaw (I think -- it's now been a while since I looked at it, so I might be misremembering the exact details).

The (hopefully?) better news is there is new feature of PowerShell that greatly reduces the mangling that PowerShell does of pipes and redirects for native commands (such as Go binaries):

https://github.com/PowerShell/PowerShell/pull/17857

https://learn.microsoft.com/en-us/powershell/scripting/learn/experimental-features?view=powershell-7.4#psnativecommandpreservebytepipe

I have some hope that this helps here, but I have not confirmed with age. I did just seem to briefly confirm with go test though, so posting a quick comment here in case it is useful (and before too much time goes by 😅 ).