adamhathcock / sharpcompress

SharpCompress is a fully managed C# library to deal with many compression types and formats.
MIT License

SharpCompress hangs in busy wait reading corrupt/truncated ".tar.gz" file #165

Open erikcturner opened 8 years ago

erikcturner commented 8 years ago

I used one of the samples (reproduced below) and the code fragment hangs on the final MoveToNextEntry call. This same code works fine on a non-truncated .tar.gz file.

using (Stream stream = File.OpenRead(@"C:\Code\sharpcompress.tar.gz"))
using (var reader = ReaderFactory.Open(stream))
{
    // With a truncated archive, the final MoveToNextEntry call never returns.
    while (reader.MoveToNextEntry())
    {
        if (!reader.Entry.IsDirectory)
        {
            reader.WriteEntryToDirectory(@"C:\temp", ExtractOptions.ExtractFullPath | ExtractOptions.Overwrite);
        }
    }
}

adamhathcock commented 7 years ago

Is it supposed to be a valid file? SharpCompress isn't going to check for file validity.

erikcturner commented 7 years ago

Adam,

We were generating and downloading a ".tar.gz" file from one of our servers. Due to a bug in the software, it was returning a shortened version of the complete ".tar.gz" file.

SharpCompress did not throw an exception under this condition - it just never returned and seemed to be in a "busy wait" since it was consuming an entire core's worth of CPU time.

This behavior was unacceptable for our web server software, so I ended up writing my own Deflate/Untar functionality that throws an exception when it detects the condition of "no more data available" and "not done with TAR entry". I found that Deflate gave no indication that the compressed stream was too short, but TAR can recognize a tarball that is too short. The one exception is when the truncation falls exactly on the boundary between the end of one file and the TAR header of the next, and even then the result can be recognized as a non-standard TAR file because it lacks the zero-filled 512-byte blocks that mark the end of the archive.
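
The check described above can be sketched roughly as follows. This is only an illustration, not the code Erik wrote and not how SharpCompress reads archives: the class `TarTruncationCheck` and its methods are hypothetical names, the input is assumed to be an already-decompressed tar stream whose `Read` returns 0 once the data runs out, and sizes are assumed to use the plain octal encoding (GNU base-256 sizes are not handled).

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, for illustration only.
public static class TarTruncationCheck
{
    private const int BlockSize = 512;

    // Reads up to count bytes and returns how many were actually read.
    // The loop stops when Read returns 0, so an exhausted stream
    // cannot turn into a busy wait.
    private static int ReadBlock(Stream input, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = input.Read(buffer, total, count - total);
            if (n == 0) break; // end of stream
            total += n;
        }
        return total;
    }

    // Walks the 512-byte header records of a tar stream and returns how
    // many were seen before the zero-filled end-of-archive block.
    // Throws EndOfStreamException if the data ends anywhere earlier.
    public static int CountEntries(Stream tar)
    {
        var block = new byte[BlockSize];
        int entries = 0;
        while (true)
        {
            int got = ReadBlock(tar, block, BlockSize);
            if (got == 0)
                throw new EndOfStreamException("Tar data ended without a zero-filled end-of-archive block.");
            if (got < BlockSize)
                throw new EndOfStreamException("Tar data ended inside an entry header.");
            if (Array.TrueForAll(block, b => b == 0))
                return entries; // end-of-archive marker reached

            // Entry size: octal ASCII digits at offset 124, 12 bytes long.
            string field = Encoding.ASCII.GetString(block, 124, 12).Trim('\0', ' ');
            long size = field.Length == 0 ? 0 : Convert.ToInt64(field, 8);

            // Entry contents are padded to a whole number of blocks.
            long remaining = (size + BlockSize - 1) / BlockSize * BlockSize;
            while (remaining > 0)
            {
                if (ReadBlock(tar, block, BlockSize) < BlockSize)
                    throw new EndOfStreamException("Tar data ended inside entry contents.");
                remaining -= BlockSize;
            }
            entries++;
        }
    }
}
```

For a ".tar.gz" file, the stream passed to `CountEntries` would be a decompressing wrapper such as `System.IO.Compression.GZipStream` around the file stream. The point of the sketch is that every read loop treats a zero-byte read as the end of the data and reports it, rather than retrying.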

I submitted the problem report because I thought you might want to know about the "busy wait" issue - it totally killed the performance of our 12 core server after enough attempts to download truncated ".tar.gz" files.

Erik Turner

adamhathcock commented 7 years ago

Is there any way you can contribute the code you wrote?