Author: | sebwills |
---|---|
Assignees: | - |
Labels: | `area-System.Security` |
Milestone: | - |
Closed sebwills closed 1 year ago
Tagging subscribers to this area: @dotnet/area-system-security, @vcsjones See info in area-owners.md if you want to be subscribed.
Author: | sebwills |
---|---|
Assignees: | - |
Labels: | `area-System.Security` |
Milestone: | - |
Regression? No response
With a slight tweak to the repro, it's the same behavior on .NET Framework 4.8, i.e. if you change readBlockSize to also be 10 to match the expected output length, then this also throws an exception on .NET Framework. That's because CryptoStream never tried to read beyond the amount of data requested. The cited repro only repros on .NET 6 and not .NET Framework 4.8 because of the mentioned partial read change, where on .NET Framework because readBlockSize is larger than the resulting data, CryptoStream would aggressively try to fill the buffer. But if the buffer you give it is such that it doesn't discover it's at EOF, then it won't do the final transform.
I think this is a known issue that was previously reported here https://github.com/dotnet/runtime/issues/2042 and documented as a breaking change at https://learn.microsoft.com/en-us/dotnet/core/compatibility/3.0#cryptostreamdispose-transforms-final-block-only-when-writing.
The linked issue suggests that you can add something like this before accessing the Hash
property:
hasher.TransformFinalBlock(Array.Empty<byte>(), 0, 0);
Console.WriteLine($"Retrieved hash {hasher.Hash}");
Or the work around noted in the original issue.
Note that if all you are trying to do is hash a stream, you don't need to use CryptoStream
to do that. It could be as simple as:
using var hasher = SHA256.Create();
byte[] hash = hasher.ComputeHash(source);
Console.WriteLine($"Retrieved hash {Convert.ToBase64String(hasher.Hash)}");
Or in .NET 7, with a static one-shot:
var hash = SHA256.HashData(source);
Console.WriteLine($"Retrieved hash {Convert.ToBase64String(hash)}");
Thanks for the responses.
It's helpful to see this behaviour partly documented
In .NET Core 3.0 and later versions, Dispose no longer tries to transform the final block when reading, which allows for incomplete reads.
I say partly because it's arguable that the repro I cited is not exactly an "incomplete read" since the whole stream has been read. I would still ideally like to see the CryptoStream docs themselves acknowledge the fact that a reader must attempt to read beyond EOF (getting a zero-byte result) if they want TransformFinalBlock to be called.
Thanks @vcsjones for the hashing hints. I used HashAlgorithm just for the repro. The original issue I discovered was with a chain of CryptoStreams doing decryption, hashing and recording complete/incomplete decrypts, which crashed one in 16 times on .net6 (i.e. when processing a stream that happened to have a length that was a multiple of the 16 block size in use).
How about changing CryptoStream
so that, on Close, if it hasn't already hit EOF, it attempts another Read
, which will result in a call to TransformFinalBlock
if it gets a zero length back. Nobody can really complain about the side effect of potentially reading extra bytes from the source stream which are then discarded, since CryptoStream routinely reads ahead. This would make the behaviour more intuitive, and not have the quirky dependence on input stream length which can result in bugs not detected in testing.
How about changing CryptoStream so that, on Close, if it hasn't already hit EOF, it attempts another Read, which will result in a call to TransformFinalBlock if it gets a zero length back.
This suggestion only really makes sense for degenerate transformations like a hash algorithm. For block-padded encryption/decryption (the 99% use case) the call to Read which signals EOF from the underlying stream will either a) (decrypt) check that the still-buffered data is a multiple of the block size (throwing if not), decrypt it, and remove the padding that was added during decryption (throwing if it's invalid, and leaving it if it's an un-removable algorithm); or b) (encrypt) apply the selected padding algorithm and encrypt the final block. That is, the expected case of TransformFinalBlock is that it does work and produces more bytes. The ICryptoTransform instances from HashAlgorithm instances are degenerate in that they don't. They're further degenerate in that you're expected to look at side-effect state on the instance afterwards (the Hash property), which is not anything done by any other platform-provided ICryptoTransform.
We... could... add something like if (stream.CanSeek && stream.Position == stream.Length)
as a followon check in Read(), but that doesn't really feel valuable to me because it's special-casing just for HashAlgorithm and, of course, doesn't help non-seekable streams that happen to know their length.
My recommendation here would be to stop using CryptoStream altogether, and just use the original stream in conjunction with IncrementalHash
.
Description
At the end of the source stream, a
CryptoStream
in Read mode is expected to call theTransformFinalBlock
method of the suppliedICryptoTransform
to handle any partial final block and allow theICryptoTransform
do any final action required (e.g. produce a hash value).If the code consuming the
CryptoStream
knows how many bytes to expect in the 'decrypted' stream (e.g. if the format of the data indicates its length), it may stop reading and dispose theCryptoStream
once it has read all the bytes in the stream, without ever doing aRead
operation that returns a zero count. For example, some GZip stream readers do this.Please note, I am not just referring to
Read
returning less than itscount
parameter (which I know is normal in dotnet 6), I am talking about the total number of bytes read reaching the known total length of the stream.In this scenario,
CryptoStream
's behaviour depends on whether the source stream length is a multiple of theICryptoTransform
'sInputBlockSize
:ICryptoTransform.InputBlockSize
,CryptoStream
will have internally tried to read beyond the end of the encryptedStream, i.e. will have peformed a Read that returned zero, and will have calledTransformFinalBlock
on the partial buffer. This is as expected.if the stream length is a multiple of
ICryptoTransform
.InputBlockSize
,CryptoStream
will have an empty internal buffer and will not know that it has reached the end of the encrypted stream, and will not callTransformFinalBlock
, even when it is closed. As a result theICryptoTransform
does not know it has seen all the data and cannot do any final operations such as verifying the stream's integrity or making a hash value available.This feels like unexpected behaviour - particularly the way the behaviour depends on whether or not the input stream length happens to be a multiple of the transform's buffer size.
Reproduction Steps
Here is a repro, using HashAlgorithm as the
ICryptoTransform
, which will throw if you try to read the hash value without callingTransformFinalBlock
. Because HashAlgorithm's input block size is 1, this repro works with any length stream.Expected behavior
On Closing a
CryptoStream
inRead
mode, when all the output stream's bytes have been read, but withoutRead
having returned zero, it should invokeTransformFinalBlock
on theICryptoTransform
it was constructed with.In terms of the above repro code, the code should not throw (but the fault is with
CryptoStream
, notHashAlgorithm
).If the behaviour demonstrated here is felt to be acceptable, perhaps the
CryptoStream
documentation should highlight that it is essential to read the entire input stream, including a final read that returns zero count, in order to correctly interact with the suppliedICryptoTransform
.Actual behavior
Running the above repro produces the output
i.e.
HashAlgorithm
throws when we access Hash because itsTransformFinalBlock
method was never called by theCryptoStream
.Regression?
No response
Known Workarounds
The code consuming the CryptoStream's output can make a further call to the
CryptoStream
'sRead
method (even with thecount
parameter zero), even though it knows the result will be zero. This will triggerCryptoStream
to call TransformFinalBlock (with zero length buffer), as it should.Configuration
dotnet 6.0.401 Seen on Mac and Windows, but not known to be architecture specific.
Other information
No response