dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.13k stars 4.7k forks source link

GZipStream decompression not working properly. #55905

Open NavjotSinghMinhas opened 3 years ago

NavjotSinghMinhas commented 3 years ago

Describe the bug

GZipStream decompression giving System.IO.InvalidDataException even when data is correct. Note: The below code works fine on .NET Framework 4.7.2

To Reproduce

public static void Main(string[] args)
{
    string byteString = "120,218,133,83,203,106,27,65,16,252,21,49,103,105,232,233,121,235,158,75,192,33,144,163,9,102,165,93,9,131,223,146,15,177,208,191,123,170,118,101,180,129,144,147,186,182,122,166,171,186,70,39,243,54,28,94,158,159,14,131,89,159,76,223,29,59,252,118,98,214,198,73,86,27,197,44,77,231,230,80,9,93,178,62,2,250,11,155,201,134,57,140,13,214,36,182,178,55,53,148,188,183,66,46,163,85,75,182,74,88,26,44,177,76,168,242,30,13,227,208,77,147,116,123,50,31,114,249,202,27,62,160,44,85,84,16,21,205,249,119,107,117,127,181,134,120,105,245,95,173,58,182,242,84,169,42,94,49,132,86,156,47,227,237,27,88,241,94,173,39,138,188,207,103,91,10,32,172,172,194,164,118,147,137,108,192,128,77,153,81,245,138,218,10,23,96,35,36,109,169,222,38,135,90,41,207,141,110,183,16,146,156,45,108,131,140,213,10,85,252,170,184,72,231,162,90,133,242,45,87,137,2,179,181,74,148,12,132,217,41,228,156,83,67,189,76,77,61,3,213,36,66,159,61,102,203,232,185,247,87,117,184,170,233,190,46,190,191,63,44,84,212,45,23,78,215,33,174,85,23,63,111,192,167,127,240,50,241,249,63,231,203,108,245,125,157,189,177,65,102,236,0,253,21,222,73,234,156,132,131,172,58,157,228,115,44,181,237,159,36,108,104,156,76,13,204,176,164,113,207,3,51,108,241,98,69,3,51,116,97,204,112,96,134,206,122,44,117,71,49,105,122,22,59,238,178,216,130,24,118,76,177,61,17,71,10,74,66,181,18,0,32,36,181,201,12,120,199,117,102,111,171,111,104,31,167,96,246,16,196,135,237,25,99,77,122,247,227,215,55,115,94,154,195,241,109,232,30,239,159,246,119,199,63,47,237,223,106,94,223,159,143,205,235,249,252,9,112,213,199,94,10";
    string[] byteStringSplit = byteString.Split(',');

    byte[] dataBytes = new byte[byteStringSplit.Length];
    for (int i = 0; i < byteStringSplit.Length; i++)
        dataBytes[i] = Convert.ToByte(byteStringSplit[i]);

    string result = new StreamReader(new GZipStream(new MemoryStream(dataBytes), CompressionMode.Decompress)).ReadToEnd();
}

Exceptions (if any)

System.IO.InvalidDataException with Stack Trace: at System.IO.Compression.Inflater.Inflate(FlushCode flushCode) in //src/System.IO.Compression/src/System/IO/Compression/DeflateZLib/Inflater.cs:line 300 at System.IO.Compression.Inflater.ReadInflateOutput(Byte* bufPtr, Int32 length, FlushCode flushCode, Int32& bytesRead) in //src/System.IO.Compression/src/System/IO/Compression/DeflateZLib/Inflater.cs:line 266 at System.IO.Compression.Inflater.ReadOutput(Byte bufPtr, Int32 length, Int32& bytesRead) in /_/src/System.IO.Compression/src/System/IO/Compression/DeflateZLib/Inflater.cs:line 122 at System.IO.Compression.Inflater.InflateVerified(Byte bufPtr, Int32 length) in //src/System.IO.Compression/src/System/IO/Compression/DeflateZLib/Inflater.cs:line 92 at System.IO.Compression.DeflateStream.ReadCore(Span`1 buffer) in //src/System.IO.Compression/src/System/IO/Compression/DeflateZLib/DeflateStream.cs:line 302 at System.IO.Compression.DeflateStream.Read(Byte[] array, Int32 offset, Int32 count) in //src/System.IO.Compression/src/System/IO/Compression/DeflateZLib/DeflateStream.cs:line 240 at System.IO.Compression.GZipStream.Read(Byte[] array, Int32 offset, Int32 count) in //src/System.IO.Compression/src/System/IO/Compression/GZipStream.cs:line 84 at System.IO.StreamReader.ReadBuffer() in //src/System.Private.CoreLib/shared/System/IO/StreamReader.cs:line 594 at System.IO.StreamReader.ReadToEnd() in //src/System.Private.CoreLib/shared/System/IO/StreamReader.cs:line 415 at ConsoleApp9.Program.Main(String[] args) in C:\Users\Navjot Singh\source\repos\ConsoleApp9\ConsoleApp9\Program.cs:line 20

Further technical details

ghost commented 3 years ago

Tagging subscribers to this area: @dotnet/area-system-io-compression See info in area-owners.md if you want to be subscribed.

Issue Details
### Describe the bug GZipStream decompression giving System.IO.InvalidDataException even when data is correct. **Note:** The below code works fine on **.NET Framework 4.7.2** ### To Reproduce ``` public static void Main(string[] args) { string byteString = "120,218,133,83,203,106,27,65,16,252,21,49,103,105,232,233,121,235,158,75,192,33,144,163,9,102,165,93,9,131,223,146,15,177,208,191,123,170,118,101,180,129,144,147,186,182,122,166,171,186,70,39,243,54,28,94,158,159,14,131,89,159,76,223,29,59,252,118,98,214,198,73,86,27,197,44,77,231,230,80,9,93,178,62,2,250,11,155,201,134,57,140,13,214,36,182,178,55,53,148,188,183,66,46,163,85,75,182,74,88,26,44,177,76,168,242,30,13,227,208,77,147,116,123,50,31,114,249,202,27,62,160,44,85,84,16,21,205,249,119,107,117,127,181,134,120,105,245,95,173,58,182,242,84,169,42,94,49,132,86,156,47,227,237,27,88,241,94,173,39,138,188,207,103,91,10,32,172,172,194,164,118,147,137,108,192,128,77,153,81,245,138,218,10,23,96,35,36,109,169,222,38,135,90,41,207,141,110,183,16,146,156,45,108,131,140,213,10,85,252,170,184,72,231,162,90,133,242,45,87,137,2,179,181,74,148,12,132,217,41,228,156,83,67,189,76,77,61,3,213,36,66,159,61,102,203,232,185,247,87,117,184,170,233,190,46,190,191,63,44,84,212,45,23,78,215,33,174,85,23,63,111,192,167,127,240,50,241,249,63,231,203,108,245,125,157,189,177,65,102,236,0,253,21,222,73,234,156,132,131,172,58,157,228,115,44,181,237,159,36,108,104,156,76,13,204,176,164,113,207,3,51,108,241,98,69,3,51,116,97,204,112,96,134,206,122,44,117,71,49,105,122,22,59,238,178,216,130,24,118,76,177,61,17,71,10,74,66,181,18,0,32,36,181,201,12,120,199,117,102,111,171,111,104,31,167,96,246,16,196,135,237,25,99,77,122,247,227,215,55,115,94,154,195,241,109,232,30,239,159,246,119,199,63,47,237,223,106,94,223,159,143,205,235,249,252,9,112,213,199,94,10"; string[] byteStringSplit = byteString.Split(','); byte[] dataBytes = new byte[byteStringSplit.Length]; for (int i = 0; i < byteStringSplit.Length; i++) dataBytes[i] = Convert.ToByte(byteStringSplit[i]); string result = new StreamReader(new GZipStream(new MemoryStream(dataBytes), CompressionMode.Decompress)).ReadToEnd(); } ``` ### Exceptions (if any) System.IO.InvalidDataException with Stack Trace: at System.IO.Compression.Inflater.Inflate(FlushCode flushCode) in /_/src/System.IO.Compression/src/System/IO/Compression/DeflateZLib/Inflater.cs:line 300 at System.IO.Compression.Inflater.ReadInflateOutput(Byte* bufPtr, Int32 length, FlushCode flushCode, Int32& bytesRead) in /_/src/System.IO.Compression/src/System/IO/Compression/DeflateZLib/Inflater.cs:line 266 at System.IO.Compression.Inflater.ReadOutput(Byte* bufPtr, Int32 length, Int32& bytesRead) in /_/src/System.IO.Compression/src/System/IO/Compression/DeflateZLib/Inflater.cs:line 122 at System.IO.Compression.Inflater.InflateVerified(Byte* bufPtr, Int32 length) in /_/src/System.IO.Compression/src/System/IO/Compression/DeflateZLib/Inflater.cs:line 92 at System.IO.Compression.DeflateStream.ReadCore(Span`1 buffer) in /_/src/System.IO.Compression/src/System/IO/Compression/DeflateZLib/DeflateStream.cs:line 302 at System.IO.Compression.DeflateStream.Read(Byte[] array, Int32 offset, Int32 count) in /_/src/System.IO.Compression/src/System/IO/Compression/DeflateZLib/DeflateStream.cs:line 240 at System.IO.Compression.GZipStream.Read(Byte[] array, Int32 offset, Int32 count) in /_/src/System.IO.Compression/src/System/IO/Compression/GZipStream.cs:line 84 at System.IO.StreamReader.ReadBuffer() in /_/src/System.Private.CoreLib/shared/System/IO/StreamReader.cs:line 594 at System.IO.StreamReader.ReadToEnd() in /_/src/System.Private.CoreLib/shared/System/IO/StreamReader.cs:line 415 at ConsoleApp9.Program.Main(String[] args) in C:\Users\Navjot Singh\source\repos\ConsoleApp9\ConsoleApp9\Program.cs:line 20 ### Further technical details - ASP.NET Core version Issue produced in .NET 5 & .NET Core 3.1 / 3.0 / 2.2 / 2.1 / 2.0 - Include the output of `dotnet --info` .NET SDK (reflecting any global.json): Version: 5.0.302 Commit: c005824e35 Runtime Environment: OS Name: Windows OS Version: 10.0.19043 OS Platform: Windows RID: win10-x64 Base Path: C:\PROGRAM FILES\DOTNET\sdk\5.0.302\ Host (useful for support): Version: 5.0.8 Commit: 35964c9215 .NET SDKs installed: 2.1.700 [C:\PROGRAM FILES\DOTNET\sdk] 2.2.300 [C:\PROGRAM FILES\DOTNET\sdk] 3.0.103 [C:\PROGRAM FILES\DOTNET\sdk] 3.1.101 [C:\PROGRAM FILES\DOTNET\sdk] 3.1.117 [C:\PROGRAM FILES\DOTNET\sdk] 5.0.103 [C:\PROGRAM FILES\DOTNET\sdk] 5.0.104 [C:\PROGRAM FILES\DOTNET\sdk] 5.0.302 [C:\PROGRAM FILES\DOTNET\sdk] .NET runtimes installed: Microsoft.AspNetCore.All 2.1.11 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.All] Microsoft.AspNetCore.All 2.1.25 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.All] Microsoft.AspNetCore.All 2.1.28 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.All] Microsoft.AspNetCore.All 2.2.5 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.All] Microsoft.AspNetCore.All 2.2.8 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.All] Microsoft.AspNetCore.App 2.1.11 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 2.1.25 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 2.1.28 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 2.2.5 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 2.2.8 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 3.0.3 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 3.1.1 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 3.1.12 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 3.1.17 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 5.0.3 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 5.0.4 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.App] Microsoft.AspNetCore.App 5.0.8 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.AspNetCore.App] Microsoft.NETCore.App 2.1.11 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 2.1.25 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 2.1.28 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 2.2.5 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 2.2.8 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 3.0.3 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 3.1.1 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 3.1.12 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 3.1.17 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 5.0.3 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 5.0.4 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.NETCore.App] Microsoft.NETCore.App 5.0.8 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.NETCore.App] Microsoft.WindowsDesktop.App 3.0.3 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.WindowsDesktop.App] Microsoft.WindowsDesktop.App 3.1.1 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.WindowsDesktop.App] Microsoft.WindowsDesktop.App 3.1.12 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.WindowsDesktop.App] Microsoft.WindowsDesktop.App 3.1.17 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.WindowsDesktop.App] Microsoft.WindowsDesktop.App 5.0.3 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.WindowsDesktop.App] Microsoft.WindowsDesktop.App 5.0.4 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.WindowsDesktop.App] Microsoft.WindowsDesktop.App 5.0.8 [C:\PROGRAM FILES\DOTNET\shared\Microsoft.WindowsDesktop.App] - The IDE (VS / VS Code/ VS4Mac) you're running on, and its version Microsoft Visual Studio Community 2019 (Version 16.8.5)
Author: NavjotSinghMinhas
Assignees: -
Labels: `area-System.IO.Compression`, `untriaged`
Milestone: -
stephentoub commented 3 years ago

How did you create that compressed data you're trying to decompress?

stephentoub commented 3 years ago

I'm not sure why .NET Framework's GZipStream was decompressing the data, but it's not gzip format. It's zlib format. If you change GZipStream to ZLibStream in .NET 6, it'll decompress correctly.

using System.IO.Compression;

string byteString = "120,218,133,83,203,106,27,65,16,252,21,49,103,105,232,233,121,235,158,75,192,33,144,163,9,102,165,93,9,131,223,146,15,177,208,191,123,170,118,101,180,129,144,147,186,182,122,166,171,186,70,39,243,54,28,94,158,159,14,131,89,159,76,223,29,59,252,118,98,214,198,73,86,27,197,44,77,231,230,80,9,93,178,62,2,250,11,155,201,134,57,140,13,214,36,182,178,55,53,148,188,183,66,46,163,85,75,182,74,88,26,44,177,76,168,242,30,13,227,208,77,147,116,123,50,31,114,249,202,27,62,160,44,85,84,16,21,205,249,119,107,117,127,181,134,120,105,245,95,173,58,182,242,84,169,42,94,49,132,86,156,47,227,237,27,88,241,94,173,39,138,188,207,103,91,10,32,172,172,194,164,118,147,137,108,192,128,77,153,81,245,138,218,10,23,96,35,36,109,169,222,38,135,90,41,207,141,110,183,16,146,156,45,108,131,140,213,10,85,252,170,184,72,231,162,90,133,242,45,87,137,2,179,181,74,148,12,132,217,41,228,156,83,67,189,76,77,61,3,213,36,66,159,61,102,203,232,185,247,87,117,184,170,233,190,46,190,191,63,44,84,212,45,23,78,215,33,174,85,23,63,111,192,167,127,240,50,241,249,63,231,203,108,245,125,157,189,177,65,102,236,0,253,21,222,73,234,156,132,131,172,58,157,228,115,44,181,237,159,36,108,104,156,76,13,204,176,164,113,207,3,51,108,241,98,69,3,51,116,97,204,112,96,134,206,122,44,117,71,49,105,122,22,59,238,178,216,130,24,118,76,177,61,17,71,10,74,66,181,18,0,32,36,181,201,12,120,199,117,102,111,171,111,104,31,167,96,246,16,196,135,237,25,99,77,122,247,227,215,55,115,94,154,195,241,109,232,30,239,159,246,119,199,63,47,237,223,106,94,223,159,143,205,235,249,252,9,112,213,199,94,10";
string[] byteStringSplit = byteString.Split(',');

byte[] dataBytes = new byte[byteStringSplit.Length];
for (int i = 0; i < byteStringSplit.Length; i++)
    dataBytes[i] = Convert.ToByte(byteStringSplit[i]);

string result = new StreamReader(new ZLibStream(new MemoryStream(dataBytes), CompressionMode.Decompress)).ReadToEnd();
Console.WriteLine(result);
NavjotSinghMinhas commented 3 years ago

Hi Stephan,

I am getting the byte stream directly from a socket during share market hours. (Unfortunately, I can't share that directly, therefore I used string to byte implementation for example).

I have been using GZipStream in .NET Framework for quite a while now and it works as expected. Also, any way to test ZLibStream in .NET 5 / .NET core 3.1?

Thanks

stephentoub commented 3 years ago

any way to test ZLibStream in .NET 5 / .NET core 3.1?

Not without copying the code into your app.

I have been using GZipStream in .NET Framework for quite a while now and it works as expected.

I debugged through it.

@dotnet/area-system-io-compression, this change: https://github.com/dotnet/corefx/commit/185fac5aa8fdc8a6b95cae22473e14fd68d04dc4#diff-f3f8f1fd33c298f06b753ef732eee67acd1b7deead6c55838a64901e7432bb95L116-L128 removed this implicit gzip/zlib header detection/decoding from GZipStream. We should decide whether at this point that was a bug fix (this class is about gzip, and the original managed implementation only supported gzip) and we're happy with it only being for gzip as it is today: https://github.com/dotnet/runtime/blob/e063533eb79eace045f43b41980cbed21c8d7365/src/libraries/System.IO.Compression/src/System/IO/Compression/GZipStream.cs#L20 or whether we should treat this as a regression and fix it to use 47 instead of 15 for inflating.

stephentoub commented 3 years ago

Also cc @ianhays who added this in https://github.com/dotnet/corefx/pull/2906 in 2015 in case he remembers why he chose this approach, and @krwq who ported it back to .NET Framework in 2017.

ianhays commented 3 years ago

Howdy! If I recall correctly, I chose 31 instead of 47 for the headerbits for inflation to match how the .NET Core managed zlib implementation handled the gzipstream with invalid/missing header case. I think I added some tests for it too, but I don't specifically remember running those tests against the "full framework".

31 is more explicitly correct and informs users if their data may be jacked up in other, less noticeable ways, but 47 is more forgivable, particularly when people mix up GzipStream and DeflateStream.

stephentoub commented 3 years ago

Thanks, Ian :)

stephentoub commented 1 year ago

@ericstj, this is the same conversation we were having the other day...

ericstj commented 1 year ago

So the problem here is that on .NETFramework GZipStream would support decompressing both ZLIB and GZip payloads. On .NETCore we force folks to choose, and if they don't have anything but the payload to make that choice getting it right can be hard.

Possible solutions are:

  1. A new type to represent this scenario.
  2. Modify an existing type to handle this scenario by default.
  3. Modify an existing type to handle this scenario through opt-in API.
  4. Variation of 3 that does this through an AppContext switch instead of explicit API.

Some combinations of those options. @stephentoub did you have an opinion about the direction to take here?

stephentoub commented 1 year ago

I'd be inclined to go with (2), possibly with an AppContext switch opt-out, either just for GZipStream (the argument being this is how it used to work by accident and we're just realigning with that), or for both GZipStream and ZLibStream to decompress either (the argument being that they're both deflate wrappers and for decompression they should behave similarly).

ericstj commented 11 months ago

@stephentoub are you still leaning towards option 2 here? Folks are still reporting this issue as a compatibility problem between core and framework. It does seem like a scenario we should try to support better (given its hard to implement on top).

stephentoub commented 11 months ago

are you still leaning towards option 2 here

Yes. Let's just make GZipStream support reading zlib, optionally with an opt-out appctx switch if we're concerned. We should also consider backporting to 8.

stephentoub commented 11 months ago

@ericstj, this actually may be more complicated. I just tried it (changing the GZipStream ctor to use 47 instead of 31 when the mode is decompression), and we get a few failures and asserts, specifically from the ManyConcatenatedGzipStreams test. That suggests zlib might end up requesting data differently in this mode and there could be fallout from such a change. Someone will need to dig deeper.

https://github.com/dotnet/runtime/compare/main...stephentoub:runtime:gzipdecompresszlib