adamhathcock / sharpcompress

SharpCompress is a fully managed C# library to deal with many compression types and formats.
MIT License

LZMA ZIP Extraction doesn't handle zero-byte files #142

Open leezer3 opened 8 years ago

leezer3 commented 8 years ago

When extracting all files in an archive, the LZMA extractor falls over when it encounters a zero-byte file. This code will trigger it, assuming you've got a zero-byte file in the archive:

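// stream is an open Stream over the LZMA-compressed ZIP archive
var reader = ArchiveFactory.Open(stream);
foreach (var archiveEntry in reader.Entries)
{
    // Throws inside LzmaStream when archiveEntry is a zero-byte file
    archiveEntry.WriteToDirectory(extractionDirectory, ExtractOptions.ExtractFullPath | ExtractOptions.Overwrite);
}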

An exception is triggered here: https://github.com/adamhathcock/sharpcompress/blob/master/SharpCompress/Compressor/LZMA/LzmaStream.cs#L219

Adding this simple check to the extraction process works around the problem:

if (archiveEntry.Size == 0)
{
    //Skip zero-byte files
    continue;
}

This suggests to me that you're assuming a minimum size for the file and attempting to read that far into the stream, when all you should be doing is reading the header and skipping over it?
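One side effect of the `continue` workaround is that the zero-byte file never appears in the output directory. A minimal sketch of an alternative, assuming the caller is willing to write the file itself: create an empty file when the reported size is 0 and only open the decompression stream otherwise. `ZeroByteExtraction` and `ExtractEntry` are hypothetical names for illustration and are not part of SharpCompress.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration; not part of SharpCompress.
static class ZeroByteExtraction
{
    // entrySize:       uncompressed size reported for the entry
    // openEntryStream: opens the decompressed data; only invoked for non-empty entries
    // destinationPath: full path of the file to write
    public static void ExtractEntry(long entrySize, Func<Stream> openEntryStream, string destinationPath)
    {
        string directory = Path.GetDirectoryName(destinationPath);
        if (!string.IsNullOrEmpty(directory))
        {
            Directory.CreateDirectory(directory);
        }

        if (entrySize == 0)
        {
            // Nothing to decompress: create (or truncate) an empty file
            // and never touch the decoder at all.
            using (File.Create(destinationPath)) { }
            return;
        }

        using (Stream source = openEntryStream())
        using (Stream target = File.Create(destinationPath))
        {
            source.CopyTo(target);
        }
    }
}
```

Inside the loop from the report this might be called with `archiveEntry.Size` and a delegate such as `() => archiveEntry.OpenEntryStream()`, assuming the SharpCompress version in use exposes that method. Building the destination path from the entry is left to the caller.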

weltkante commented 8 years ago

@leezer3 Is this fixed? I made an archive with some files plus an empty text file and it didn't trigger any exception; the empty text file unpacked fine alongside the other files.

leezer3 commented 8 years ago

Nope, still throws a fit (test.txt is zero bytes, test.jpg is just a filler).

Test program:

using System.IO;
using SharpCompress.Archive;
using SharpCompress.Common;
using SharpCompress.Writer;

namespace ConsoleApplication4
{
    class Program
    {
        static void Main(string[] args)
        {
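            // File.Create truncates any existing test.zip; File.OpenWrite would leave stale trailing bytes behind.
            using (var writeStream = File.Create("C:\\test\\test.zip"))
            {
                using (var zipWriter = WriterFactory.Open(writeStream, ArchiveType.Zip, CompressionType.LZMA))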
                {
                    zipWriter.Write("test.txt", "C:\\test\\test.txt");
                    zipWriter.Write("test.jpg", "C:\\test\\test.jpg");
                }
            }

            using (Stream stream = File.OpenRead("C:\\test\\test.zip"))
            {
                // Re-open the archive just written and extract every entry.
                var reader = ArchiveFactory.Open(stream);
                foreach (var archiveEntry in reader.Entries)
                {
                    archiveEntry.WriteToDirectory("C:\\test\\output\\", ExtractOptions.ExtractFullPath | ExtractOptions.Overwrite);
                }
            }
        }
    }
}

Archive created: test.zip

leezer3 commented 8 years ago

Thought: is it possible that the creator is actually at fault here?

I'll admit I haven't tested files created outside SharpCompress, but I had discounted that possibility because the archives it creates extract fine with WinRAR.

weltkante commented 8 years ago

Oh, interesting. I was testing with 7z archives (created with an external application) and not zip archives, since the title only said 'lzma'. I might take another look tomorrow, but I'm more used to the internals of 7z/lzma than the zip version.

leezer3 commented 8 years ago

Probably should have been a little clearer, so I've updated the title.

Adding this at line 218 would appear to fix the issue:
https://github.com/adamhathcock/sharpcompress/blob/master/src/SharpCompress/Compressor/LZMA/LzmaStream.cs#L218

if (inputSize == 5) return total;

(A ZIP file header is 4 bytes long, so if our file is of zero length, the next file header will start at +5 bytes)

I don't know if that's just cosmetically hiding a deeper problem, though. I don't really know enough about the structure of ZIP files to be happy making a change like this off my own bat.
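For anyone wanting to check the structure by hand before trusting a size-based guard: in the published ZIP format, the local file header begins with a 4-byte signature and has a 30-byte fixed part, followed by the file name and extra field. For the LZMA method (14), the entry data then starts with its own small header (2-byte version, 2-byte properties size, then the properties), so an entry with no content can still have a non-zero compressed size. The sketch below reads the fixed part so the recorded sizes can be inspected. `ZipLocalHeader` is a hypothetical type for illustration, not SharpCompress code, and the size fields may be zero when the writer used a trailing data descriptor (flag bit 3).

```csharp
using System;
using System.IO;

// Hypothetical type for illustration; not part of SharpCompress.
sealed class ZipLocalHeader
{
    public const uint Signature = 0x04034b50; // bytes 'P' 'K' 0x03 0x04, read little-endian
    public const int FixedSize = 30;          // fixed part, before name and extra field

    public ushort Flags;            // bit 3 set: sizes live in a trailing data descriptor
    public ushort Method;           // 14 means LZMA
    public uint CompressedSize;
    public uint UncompressedSize;
    public ushort NameLength;
    public ushort ExtraLength;

    // Bytes from the start of this header to the start of the entry's data.
    public int DataOffset
    {
        get { return FixedSize + NameLength + ExtraLength; }
    }

    public static ZipLocalHeader Read(Stream stream)
    {
        // BinaryReader always reads little-endian, matching the ZIP format.
        // It is deliberately not disposed so the caller's stream stays open.
        var reader = new BinaryReader(stream);
        if (reader.ReadUInt32() != Signature)
        {
            throw new InvalidDataException("Not a ZIP local file header.");
        }

        var header = new ZipLocalHeader();
        reader.ReadUInt16();                        // version needed to extract
        header.Flags = reader.ReadUInt16();         // general purpose bit flags
        header.Method = reader.ReadUInt16();        // compression method
        reader.ReadUInt16();                        // last modified time
        reader.ReadUInt16();                        // last modified date
        reader.ReadUInt32();                        // CRC-32
        header.CompressedSize = reader.ReadUInt32();
        header.UncompressedSize = reader.ReadUInt32();
        header.NameLength = reader.ReadUInt16();
        header.ExtraLength = reader.ReadUInt16();
        return header;
    }
}
```

Running this against the first bytes of test.zip would show what compressed size the writer recorded for test.txt, which is one way to tell whether the creator or the extractor is at fault.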