dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Inconsistent DeflateStream output between .net8 and .net9 #110041

Closed droosma closed 21 hours ago

droosma commented 21 hours ago

Description

We have a unit test that validates the output of a Deflate method. When I switched from .net8 to .net9, the test broke.

Reproduction Steps

using System.IO.Compression;
using System.Text;

const string input = "test input";
const string expected = "K2FIZShmKGFQYMhkyGMoYCgFsgE=";

var bytes = Encoding.Unicode.GetBytes(input);
using var ms = new MemoryStream();
using (var ds = new DeflateStream(ms, CompressionMode.Compress))
{
    ds.Write(bytes, 0, bytes.Length);
}

var msArray = ms.ToArray();
var output = Convert.ToBase64String(msArray);

if(expected != output)
{
    throw new Exception($"Expected: {expected}, Output: {output}");
}

Console.WriteLine("Success");
Console.ReadLine();

Expected behavior

I expected the output to remain consistent

Actual behavior

The output changed from K2FIZShmKGFQYMhkyGMoYCgFsgE= to K2FIZShmKGFQYMhkyGMoYChlKGEAAA==

Regression?

Yes, this worked throughout a couple of .net versions

Known Workarounds

No response

Configuration

.net version: 9.0.100
OS: Windows / Linux
Architecture: x64

Other information

No response

huoyaoyuan commented 21 hours ago

The compressed output isn't guaranteed to be stable across versions. It can change when the version of the underlying compression library is updated. Instead, you should validate that the result round-trips, ideally using different implementations for compression and decompression.
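A round-trip check along those lines might look like the sketch below. The helper names `Deflate` and `Inflate` are hypothetical and not taken from the reporter's code base; the sketch only asserts that the decompressed data equals the original input, never that the compressed bytes match a fixed value.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper (illustration only): compress a byte array with DeflateStream.
static byte[] Deflate(byte[] data)
{
    using var output = new MemoryStream();
    // Dispose the DeflateStream before reading the buffer so the final block is flushed.
    using (var deflate = new DeflateStream(output, CompressionMode.Compress, leaveOpen: true))
    {
        deflate.Write(data, 0, data.Length);
    }
    return output.ToArray();
}

// Hypothetical helper (illustration only): decompress a raw deflate payload.
static byte[] Inflate(byte[] compressed)
{
    using var input = new MemoryStream(compressed);
    using var inflate = new DeflateStream(input, CompressionMode.Decompress);
    using var output = new MemoryStream();
    inflate.CopyTo(output);
    return output.ToArray();
}

const string text = "test input";
byte[] original = Encoding.Unicode.GetBytes(text);

// Assert on the decompressed result, not on the compressed bytes.
string decoded = Encoding.Unicode.GetString(Inflate(Deflate(original)));

if (decoded != text)
{
    throw new Exception($"Round trip failed: expected '{text}', got '{decoded}'");
}

Console.WriteLine("Round trip OK");
```

This sketch compresses and decompresses with the same runtime, so it does not cover the "different implementations" part of the advice; that would require a second deflate implementation, which is outside what the thread shows.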

droosma commented 21 hours ago

The problem is that we are using this as a space-saving mechanism throughout messages and databases, so things that are historically deflated will not be able to inflate when we migrate to .net9. Or is my assumption wrong in this?

Also, this test has been passing across a number of .net versions and package updates. I just checked the history: the last time this code was touched was 2021-12-07, when updating to .net6, and we are now at .net8.

huoyaoyuan commented 21 hours ago

so things that are historically deflated will not be able to inflate when we migrate to .net9 or is my assumption wrong in this?

Compression and decompression are not a 1:1 mapping. Different compressed byte sequences can decompress to the same bytes:

using System.IO.Compression;

byte[] data1 = Convert.FromBase64String("K2FIZShmKGFQYMhkyGMoYCgFsgE=");
byte[] data2 = Convert.FromBase64String("K2FIZShmKGFQYMhkyGMoYChlKGEAAA==");

var deflateStream1 = new DeflateStream(new MemoryStream(data1), CompressionMode.Decompress);
var deflateStream2 = new DeflateStream(new MemoryStream(data2), CompressionMode.Decompress);

Memory<byte> ReadToEnd(Stream stream)
{
    int bytesRead;
    byte[] buffer = new byte[4096];
    int pos = 0;
    while ((bytesRead = stream.Read(buffer.AsSpan(pos))) != 0)
    {
        pos += bytesRead;
        if (pos == buffer.Length)
        {
            byte[] newBuffer = new byte[buffer.Length * 2];
            buffer.CopyTo(newBuffer, 0);
            buffer = newBuffer;
        }
    }
    return buffer.AsMemory(0, pos);
}

Console.WriteLine(ReadToEnd(deflateStream1).Span.SequenceEqual(ReadToEnd(deflateStream2).Span));

The snippet prints True.
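For the stored-data concern, a regression test can pin a previously stored compressed payload as input and assert on what it decompresses to. The sketch below assumes the two Base64 strings from this thread are the stored payloads and that the text was encoded with `Encoding.Unicode`, as in the reproduction; `InflateToString` is a hypothetical helper name.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper (illustration only): Base64 -> raw deflate -> UTF-16 string.
static string InflateToString(string base64)
{
    byte[] compressed = Convert.FromBase64String(base64);
    using var input = new MemoryStream(compressed);
    using var inflate = new DeflateStream(input, CompressionMode.Decompress);
    using var output = new MemoryStream();
    inflate.CopyTo(output);
    return Encoding.Unicode.GetString(output.ToArray());
}

// Payloads quoted in the thread: the first was produced on .net8, the second on .net9.
string fromNet8 = InflateToString("K2FIZShmKGFQYMhkyGMoYCgFsgE=");
string fromNet9 = InflateToString("K2FIZShmKGFQYMhkyGMoYChlKGEAAA==");

// The decompressed text is what should stay stable across runtime versions.
if (fromNet8 != fromNet9)
{
    throw new Exception($"Payloads differ: '{fromNet8}' vs '{fromNet9}'");
}

Console.WriteLine(fromNet8);
```

A test of this shape keeps passing when the compressor changes, because it depends only on the decompressor accepting valid deflate data.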

droosma commented 21 hours ago

I just verified that I was indeed on the wrong track: inflating old records still works fine. I have adjusted the test. Thank you very much for your time and feedback.